Failing units, low ppd, and returned units.


Scarlet-Tech
Posts: 37
Joined: Tue Nov 10, 2015 9:54 pm

Re: Failing units, low ppd, and returned units.

Post by Scarlet-Tech »

bcavnaugh wrote:
7im wrote:FYI, the disease preference, while configurable in the client, is currently only a placeholder for a feature to be implemented at the server level at a later time. The preference currently has no effect on work unit assignment.
If I recall you said this two years ago as well!
I really hope they haven't been sitting on "features" for 2 years.

If that is true, it explains a lot.
Scarlet-Tech
Posts: 37
Joined: Tue Nov 10, 2015 9:54 pm

Re: Failing units, low ppd, and returned units.

Post by Scarlet-Tech »

mmonnin wrote:'Stock' as in from EVGA or actual stock from NV? Anything over what NV recommends is an overclock.

No failed core 21 WUs on my 970.
Stock, as in Nvidia specifications, as noted earlier. Plus, the memory is underclocked, as mentioned earlier. The work units still fail, so I guess that also explains a little.
7im
Posts: 10189
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona
Contact:

Re: Failing units, low ppd, and returned units.

Post by 7im »

Scarlet-Tech wrote:
bcavnaugh wrote:
7im wrote:FYI, the disease preference, while configurable in the client, is currently only a placeholder for a feature to be implemented at the server level at a later time. The preference currently has no effect on work unit assignment.
If I recall you said this two years ago as well!
I really hope they haven't been sitting on "features" for 2 years.

If that is true, it explains a lot.
Sitting? OHN. Re-writing the servers' coding to improve WU handling, hardware matching, handle more Assignment and Work Servers, new FAHCores, and add new features already in the client. Yep.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
ChristianVirtual
Posts: 1596
Joined: Tue May 28, 2013 12:14 pm
Location: Tokyo

Re: Failing units, low ppd, and returned units.

Post by ChristianVirtual »

bcavnaugh wrote:So I guess I can stop testing now as it seems pointless now.
Might be "pointless" but always meaningful :eugeek:

There are times I can't watch the rigs and switch to full; Once I have more time I switch to exciting client types. You have the choice.
Please contribute your logs to http://ppd.fahmm.net
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Failing units, low ppd, and returned units.

Post by bruce »

Stop testing until there's a new version of Core_21 that closes a number of open issues.

When you DECIDE you want Core_21 assignments, add "advanced". As I understand it, all Core_21 projects are now Adv (or even further removed from general circulation). If one slips through, whether it fails or not, let us know what project it is.
Kebast
Posts: 386
Joined: Thu Aug 06, 2015 5:21 pm

Re: Failing units, low ppd, and returned units.

Post by Kebast »

I don't think my 750ti has failed once. I'll turn on advanced there and keep up with it. My 970 has failed too many to chance it there.
Ryzen 5900x 12T - RTX 4070 TI
Rel25917
Posts: 303
Joined: Wed Aug 15, 2012 2:31 am

Re: Failing units, low ppd, and returned units.

Post by Rel25917 »

Work unit count and ppd both are meaningless if half the projects are failing. The point is focusing just on completed units is kind of pointless due to the size difference in projects.
_r2w_ben
Posts: 285
Joined: Wed Apr 23, 2008 3:11 pm

Re: Failing units, low ppd, and returned units.

Post by _r2w_ben »

Has anyone with an nVidia GPU that fails units frequently run memtestCL or memtestG80 for hours? It would be interesting to see if it detects any memory issues.
Kebast
Posts: 386
Joined: Thu Aug 06, 2015 5:21 pm

Re: Failing units, low ppd, and returned units.

Post by Kebast »

_r2w_ben wrote:Has anyone with an nVidia GPU that fails units frequently run memtestCL or memtestG80 for hours? It would be interesting to see if it detects any memory issues.
A brief read-through tells me that's not available for 64-bit Windows. Is that correct?
Ryzen 5900x 12T - RTX 4070 TI
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Failing units, low ppd, and returned units.

Post by bruce »

Windows 64-bit can run 32-bit programs. The only real question is whether either or both do a good job of stressing the VRAM and reporting errors.

You'll find both an older version and a newer version, and even the newer one was written before Maxwell was developed, so report your findings (in a new topic, please).
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Failing units, low ppd, and returned units.

Post by bruce »

Kebast wrote:I don't think my 750ti has failed once. I'll turn on advanced there and keep up with it. My 970 has failed too many to chance it there.
Mine, too. The error rate for GM20x chips is high. For GM1xx and earlier, most of the problems don't exist, so run your 750 Ti on Adv.
toTOW
Site Moderator
Posts: 6296
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France
Contact:

Re: Failing units, low ppd, and returned units.

Post by toTOW »

Scarlet-Tech wrote:
bruce wrote:At this time, the GPU VRAM clock rate seems to be more important than the Core clock rate.
Bcavnaugh is showing that the VRAM has been lowered from the stock 7000 MHz to 6000 MHz and lower. That is well below stock speeds and even lower than last-generation speeds.
Be careful: most Maxwell2 GPUs run in P2 mode when running FAH, but most overclocking tools (MSI Afterburner, ...) change only the P0 mode clocks.

You need to use NV Inspector to check which mode is used while running FAH (on my 980M, it's P0, but it's P2 on my 980). Then, you have to change the memory clocks in the right mode.

Some other weird things might occur: on my 980, I have to change memory clocks in the P2 tab, but GPU clocks in the P0 tab.
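
For anyone who would rather check the active performance state from a script than from NV Inspector, here is a minimal Python sketch. It swaps in nvidia-smi, which ships with the NVIDIA driver, and that is an assumption on my part: the posts above only mention NV Inspector, and whether a given GeForce card and driver report these fields is not confirmed here. The function names and the sample row are hypothetical. Run the query while FAH is folding, since the P-state at idle will differ.

```python
import csv
import io
import subprocess

# Fields requested from nvidia-smi (assumed to be reported by the card/driver).
QUERY_FIELDS = "index,name,pstate,clocks.mem,clocks.sm"


def _to_int(text):
    """Return an int, or None when the driver reports something like [N/A]."""
    try:
        return int(text)
    except ValueError:
        return None


def parse_pstates(csv_text):
    """Parse `--format=csv,noheader,nounits` output into one dict per GPU."""
    rows = []
    for rec in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(rec) != 5:
            continue  # skip blank or unexpected lines
        index, name, pstate, mem_mhz, sm_mhz = (field.strip() for field in rec)
        rows.append({
            "index": _to_int(index),
            "name": name,
            "pstate": pstate,          # e.g. "P0" or "P2"
            "mem_mhz": _to_int(mem_mhz),
            "sm_mhz": _to_int(sm_mhz),
        })
    return rows


def query_pstates():
    """Run nvidia-smi and return the parsed rows (requires an NVIDIA driver)."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_pstates(out)


# Hypothetical sample row, only to show the expected shape of the output.
SAMPLE_ROW = "0, GeForce GTX 980, P2, 3005, 1240\n"
```

Calling `query_pstates()` while a work unit is running shows which P-state each GPU is in, which tells you which set of clocks your overclocking tool needs to change.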

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
bcavnaugh
Posts: 147
Joined: Tue Apr 30, 2013 1:39 pm

Re: Failing units, low ppd, and returned units.

Post by bcavnaugh »

Round Two, Second Error on the GTX 980 HC Rig.

22:43:43:WU00:FS03:0x18:Bad State detected... attempting to resume from last good checkpoint
22:46:21:WU00:FS03:0x18:Completed 11520000 out of 16000000 steps (72%)
22:49:54:WU00:FS03:0x18:Completed 11680000 out of 16000000 steps (73%)
22:53:27:WU00:FS03:0x18:Completed 11840000 out of 16000000 steps (74%)
22:57:00:WU00:FS03:0x18:Completed 12000000 out of 16000000 steps (75%)
22:57:02:WU00:FS03:0x18:Bad State detected... attempting to resume from last good checkpoint
22:57:02:WU00:FS03:0x18:Max number of retries reached. Aborting.
22:57:02:WU00:FS03:0x18:ERROR:exception: Max Retries Reached
22:57:02:WU00:FS03:0x18:Saving result file logfile_01.txt
22:57:02:WU00:FS03:0x18:Saving result file log.txt
22:57:02:WU00:FS03:0x18:Folding@home Core Shutdown: BAD_WORK_UNIT
22:57:03:WARNING:WU00:FS03:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
22:57:03:WU00:FS03:Sending unit results: id:00 state:SEND error:FAULTY project:9412 run:32 clone:2 gen:118 core:0x18 unit:0x000000a5ab40413a5535e1c84336fab9
22:57:03:WU00:FS03:Uploading 3.23KiB to 171.64.65.58
22:57:03:WU00:FS03:Connecting to 171.64.65.58:8080
22:57:03:WU00:FS03:Upload complete
22:57:03:WU00:FS03:Server responded WORK_ACK (400)
22:57:03:WU00:FS03:Cleaning up


Round Two, Third Error on the GTX 980 HC Rig.

03:47:56:WU00:FS02:0x21:Completed 390400 out of 640000 steps (61%)
03:49:34:WU00:FS02:0x21:Completed 396800 out of 640000 steps (62%)
03:50:48:WU00:FS02:0x21:Bad State detected... attempting to resume from last good checkpoint
03:50:48:WU00:FS02:0x21:Max number of retries reached. Aborting.
03:50:48:WU00:FS02:0x21:ERROR:Max Retries Reached
03:50:48:WU00:FS02:0x21:Saving result file logfile_01.txt
03:50:48:WU00:FS02:0x21:Saving result file log.txt
03:50:48:WU00:FS02:0x21:Folding@home Core Shutdown: BAD_WORK_UNIT
03:50:49:WARNING:WU00:FS02:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
03:50:49:WU00:FS02:Sending unit results: id:00 state:SEND error:FAULTY project:9704 run:69 clone:8 gen:69 core:0x21 unit:0x0000005cab404162553ec69157ec2330
03:50:49:WU00:FS02:Uploading 3.14KiB to 171.64.65.98
03:50:49:WU00:FS02:Connecting to 171.64.65.98:8080
03:50:49:WU00:FS02:Upload complete
03:50:49:WU00:FS02:Server responded WORK_ACK (400)
03:50:49:WU00:FS02:Cleaning up
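
When comparing several rounds like the ones above, it may help to tally the failures automatically. The following is a rough Python sketch that counts "Bad State detected" lines per slot and core, and pulls the project/run/clone/gen out of the "Sending unit results" lines. The regular expressions are written only against the log lines pasted in this post, so treat the format as an assumption; other client versions may log differently. The function and variable names are made up for illustration.

```python
import re
from collections import Counter

# Matches the "Sending unit results" lines shown in the logs above.
RESULT_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):WU(?P<wu>\d+):FS(?P<slot>\d+):"
    r"Sending unit results:.*?"
    r"error:(?P<error>\w+) project:(?P<project>\d+) run:(?P<run>\d+) "
    r"clone:(?P<clone>\d+) gen:(?P<gen>\d+) core:(?P<core>0x[0-9a-fA-F]+)"
)

# Matches the "Bad State detected" lines and captures slot and core.
BAD_STATE_RE = re.compile(
    r":FS(?P<slot>\d+):(?P<core>0x[0-9a-fA-F]+):Bad State detected"
)


def summarize(log_text):
    """Return (bad_state_counts, results) for a chunk of client log text."""
    bad_states = Counter()   # keyed by (slot, core)
    results = []             # one dict per returned work unit
    for line in log_text.splitlines():
        line = line.strip()
        bad = BAD_STATE_RE.search(line)
        if bad:
            bad_states[(bad["slot"], bad["core"])] += 1
            continue
        sent = RESULT_RE.match(line)
        if sent:
            results.append(sent.groupdict())
    return bad_states, results


# Three lines copied from the third error above.
SAMPLE = """\
03:50:48:WU00:FS02:0x21:Bad State detected... attempting to resume from last good checkpoint
03:50:48:WU00:FS02:0x21:ERROR:Max Retries Reached
03:50:49:WU00:FS02:Sending unit results: id:00 state:SEND error:FAULTY project:9704 run:69 clone:8 gen:69 core:0x21 unit:0x0000005cab404162553ec69157ec2330
"""
```

Running `summarize()` over a full log gives a per-slot count of bad states and a list of the FAULTY units, which is the project information the moderators asked to have reported.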
US Army Retired | Folding@EVGA The Number One Team in the Folding@Home Community.
bcavnaugh
Posts: 147
Joined: Tue Apr 30, 2013 1:39 pm

Re: Failing units, low ppd, and returned units.

Post by bcavnaugh »

I will start the Third Round only on the GTX 980 HC Rig as it is the only one receiving Core 21 Projects that are failing.
Should be sometime late afternoon. I will Drop the Memory down 2000 MHz and start back up.
I am still getting Core 21 UNKNOWN_ENUM P9704 Projects without the Client Type Set.
US Army Retired | Folding@EVGA The Number One Team in the Folding@Home Community.
bcavnaugh
Posts: 147
Joined: Tue Apr 30, 2013 1:39 pm

Re: Failing units, low ppd, and returned units.

Post by bcavnaugh »

ChristianVirtual wrote:
bcavnaugh wrote:So I guess I can stop testing now as it seems pointless now.
Might be "pointless" but always meaningful :eugeek:

There are times I can't watch the rigs and switch to full; Once I have more time I switch to exciting client types. You have the choice.
That is why I will be starting Round 3 late This Afternoon.
US Army Retired | Folding@EVGA The Number One Team in the Folding@Home Community.