Trying to get my quad 295's all folding

kruzn4evr

Well-known member
Joined
Apr 13, 2008
Messages
1,800
Location
Ajax, Ontario
So I'm convinced it's not the GPUs. I can get 3 cores folding, but as soon as I go for 4 I get the same thing, and it's not always the same core that fails. I've now had all 4 cores show the same message, and it's always after I already have 3 cores running. The only constant is the "read packet limit" line, and I'm wondering whether that has more to do with it. If so, what can I do to rectify it?
 

Attachments

kruzn4evr

Well-known member
Joined
Apr 13, 2008
Messages
1,800
Location
Ajax, Ontario
I'm trying to figure out why this last core manages to start working and then shuts down. As you can see above, it gets started, reaches 1% completed, and then reports "mdrun returned" and "NANs detected on GPU". Anyone know what this means? I'd really like to get this figured out and, as I said, I'm convinced it isn't the GPUs.
 

Prof. Dr. Silver

Well-known member
Joined
Nov 2, 2007
Messages
1,183
Location
Toronto, ON
Hey Kruzn,

Here's how I have my 9800GX2 set up:



Try leaving the -advmethods and -forcegpu nvidia_g80 flags out of your shortcuts and putting them on the parameter line in the client itself instead. See if it makes a difference?

Also, what drivers are you on? Are you still using the VGA dummies?
 

kruzn4evr

Well-known member
Joined
Apr 13, 2008
Messages
1,800
Location
Ajax, Ontario
I'm using the 190.62 drivers at the moment. I'll give that a try. It just doesn't make sense to me why 3 of 4 cores will work (regardless of the core #) while the 4th starts working, then I get the mdrun_gpu and NANs thing, and then it shuts down. Anyhow, thanks, I'll give it a shot later on this afternoon :thumb:
 

chrisk

Folding Captain
Joined
Jul 12, 2008
Messages
7,541
Location
GTA, Ontario
All mine are set up like the Doc's...
Are you and the Doc using 4 GPU clients, with no dummy plug or second monitor? You say you are not using the -forcegpu flag... I thought that flag was required if you have SLI enabled and are not using a dummy plug or second monitor (as per my post here: http://www.hardwarecanucks.com/forum/hardwarecanucks-f-h-team/22481-folding-sli-enabled-no-dummy-plug-second-monitor.html )

I don't run 4 GPUs on one machine, so I can't help a whole lot here. I have only ever had to deal with two clients.

On that note, does anyone know if it is possible for a card to have a wonky core and for SLI to still work? Is it possible that when kruzn games, he is gaming on three cores and not four?

If it is possible that a core is not working, or that your setup prevents you from using 4 cores at a time for some reason (if the problem moves between cores rather than staying on 3 and 4), can you run a temperature monitoring program to track the temps of all four cores, and then try a game with SLI enabled?

My theory is that if he enables SLI and plays a game, and all four GPUs work, the temps should rise on all GPUs... no? If he games and the temps stay the same on one core but rise on the other three, then one core is not being engaged?
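The temperature check described above can be sketched in a few lines of Python. This is only a rough illustration: the readings are made-up numbers, the `idle_gpus` helper is hypothetical, and the 5 °C threshold is an assumption that would likely need to be lower on water-cooled cards, where temps move less under load.

```python
def idle_gpus(before, after, min_rise_c=5.0):
    """Return the indices of GPUs whose temperature rose by less than min_rise_c.

    before / after are per-GPU temperatures in degrees C, read by hand from
    whatever monitoring tool is in use (hypothetical inputs, not a real API).
    """
    if len(before) != len(after):
        raise ValueError("need one before and one after reading per GPU")
    return [
        index
        for index, (cold, hot) in enumerate(zip(before, after))
        if hot - cold < min_rise_c
    ]


# Made-up example readings for four GPU cores.
idle_readings = [38.0, 39.0, 38.5, 38.0]
gaming_readings = [52.0, 53.5, 51.0, 39.0]

print(idle_gpus(idle_readings, gaming_readings))  # prints [3]
```

An empty list would suggest all four cores are being engaged; a non-empty list only points at cores worth a closer look, it does not prove a hardware fault.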
 

Prof. Dr. Silver

Well-known member
Joined
Nov 2, 2007
Messages
1,183
Location
Toronto, ON
I only use two GPU clients for now, on my 9800GX2. Internal SLI is off, and I noticed that with the new 190.62 drivers I still need the dummy plug :(

The only parameter I use is -advmethods. You want to make sure that all client IDs are different... other than that... dunno yet :(
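For anyone wanting to double-check the "all client IDs are different" point across four clients, here is a minimal Python sketch. It assumes you have already noted down the ID each client is configured with; the client names and the `duplicate_ids` helper are hypothetical and it does not read any Folding@home files itself.

```python
from collections import defaultdict


def duplicate_ids(client_ids):
    """Map each ID used by more than one client to the sorted client names.

    client_ids is a dict of {client_name: id}, filled in by hand
    (hypothetical names, not something the client produces).
    """
    names_by_id = defaultdict(list)
    for name, client_id in client_ids.items():
        names_by_id[client_id].append(name)
    return {
        client_id: sorted(names)
        for client_id, names in names_by_id.items()
        if len(names) > 1
    }


# Made-up example: the last two clients accidentally share ID 3.
clients = {"gpu0": 1, "gpu1": 2, "gpu2": 3, "gpu3": 3}

print(duplicate_ids(clients))  # prints {3: ['gpu2', 'gpu3']}
```

An empty dict means every client has its own ID.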
 

chrisk

Folding Captain
Joined
Jul 12, 2008
Messages
7,541
Location
GTA, Ontario
Have you tried the -forcegpu flag I mentioned in the post? If you use that with the latest drivers, as per my post, no dummy plug is needed. I am not using one, and I am folding on two GPUs...
 

kruzn4evr

Well-known member
Joined
Apr 13, 2008
Messages
1,800
Location
Ajax, Ontario
All 4 cores are fine, I've tested them. Temps are equal on all 4 cores at first, until the 4th core stops folding. Keep in mind they are water-cooled. I really don't think the cores are the issue at all. The problem happens on all cores: whichever core is the last of the 4 to be started is the one that has the problem. As I said, it starts to fold, gets 1-2% completed, and then I get "mdrun_gpu returned" and "NANs detected on GPU"... does anyone know what this means?? I am also wondering if the "read packet limit 540015616... set to 524286976" line is what's stopping it from continuing. In other words, is my ISP (Rogers) limiting me?

Anyhow, I'll try a few other things and see what happens. The worst case scenario is I'll run 3 cores for approx. 19-21k PPD instead of 4 and then start working on getting the CPU folding.
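To put the two "read packet limit" numbers in perspective, here is a quick Python conversion. It assumes the values are byte counts, which the thread does not confirm; under that assumption they come out just under 515 MiB and 500 MiB, which looks more like a fixed size setting than a measured connection speed.

```python
MIB = 1024 * 1024  # bytes per mebibyte


def to_mib(byte_count):
    """Convert a byte count to mebibytes."""
    return byte_count / MIB


# The two values quoted from the log line, assumed to be bytes.
for label, value in (("read", 540015616), ("set to", 524286976)):
    print(f"{label}: {value} bytes = {to_mib(value):.3f} MiB")
```

Both values fall exactly 1 KiB short of a round number of MiB (515 and 500), which is why they print as 514.999 and 499.999.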
 
