New Build, 1 GPU folded a week and now fails

It seems that a lot of GPU problems revolve around specific versions of drivers. Though NVidia has their own support structure, you can often learn from information reported by others who fold.

s/j
Posts: 16
Joined: Sun Sep 20, 2009 2:53 pm

New Build, 1 GPU folded a week and now fails

Post by s/j »

O wise ones: I have a new build that folded great for a week, but now one of the GPUs (the Zotac GTX 780) fails folding every time.
I have tried swapping PCIe x16 connectors, as well as pausing folding, deleting the work folder, and rebooting.
Using MSI Afterburner, from day 1 the GPUs have been underclocked by 120 MHz, and GPU temps are consistently in the low 70s °C.
The Zotac feels a lot hotter than the EVGA. That's likely to be expected.
Build is AMD FX-8350, 16 GB RAM, Gigabyte 990FXA-UD3 motherboard, 1000 W Rosewill PSU. GPU 1 is EVGA GTX 970, GPU 2 is Zotac GTX 780 OC. Driver version is 9.18.13.4411 (344.16).
Here is part of the log from a GPU failure:

Log Started 2014-10-06T19:46:01Z ***********************
19:46:02:WU03:FS01:0x17:Project: 13001 (Run 236, Clone 3, Gen 11)
19:46:02:WU03:FS01:0x17:Unit: 0x0000001d538b3db7532892a3432b10e4
19:46:02:WU03:FS01:0x17:CPU: 0x00000000000000000000000000000000
19:46:02:WU03:FS01:0x17:Machine: 1
19:46:02:WU03:FS01:0x17:Reading tar file state.xml
19:46:02:WU03:FS01:0x17:Reading tar file system.xml
19:46:03:WU03:FS01:0x17:Reading tar file integrator.xml
19:46:03:WU03:FS01:0x17:Reading tar file core.xml
19:46:03:WU03:FS01:0x17:Digital signatures verified
19:46:03:WU03:FS01:0x17:Folding@home GPU core17
19:46:03:WU03:FS01:0x17:Version 0.0.52
19:49:10:WU02:FS00:0xa4:Completed 390000 out of 500000 steps  (78%)
19:49:55:WU03:FS01:0x17:ERROR:exception: Force RMSE error of 453.966 with threshold of 5
19:49:55:WU03:FS01:0x17:Saving result file logfile_01.txt
19:49:55:WU03:FS01:0x17:Saving result file log.txt
19:49:55:WU03:FS01:0x17:Folding@home Core Shutdown: BAD_WORK_UNIT
19:49:56:WARNING:WU03:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
19:49:56:WU03:FS01:Sending unit results: id:03 state:SEND error:FAULTY project:13001 run:236 clone:3 gen:11 core:0x17 unit:0x0000001d538b3db7532892a3432b10e4
19:49:56:WU03:FS01:Uploading 2.26KiB to 140.163.4.231
19:49:56:WU03:FS01:Connecting to 140.163.4.231:8080
19:49:56:WU03:FS01:Upload complete
19:49:56:WU03:FS01:Server responded WORK_ACK (400)
Any thoughts?
It will suck if I have to RMA the new Zotac GTX 780.
-Ed

Mod edit: added Code tags to log
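When one slot keeps failing like this, it can help to tally the failures per folding slot over a longer stretch of log, to confirm that only FS01 is affected and to see how far each Force RMSE value overshoots its threshold. The following is a minimal sketch in Python, assuming the log lines follow the same layout as the excerpt above. The function name `summarize_failures` and the regular expressions are illustrative, not part of the Folding@home client.

```python
import re

# Assumed line layout, taken from the excerpt above:
#   HH:MM:SS:[WARNING:|ERROR:]WUnn:FSnn:<rest of message>
LINE_RE = re.compile(r"^\d{2}:\d{2}:\d{2}:(?:WARNING:|ERROR:)?WU\d+:FS(\d+):(.*)$")
RETURNED_RE = re.compile(r"FahCore returned: (\w+)")
RMSE_RE = re.compile(r"Force RMSE error of ([\d.]+) with threshold of ([\d.]+)")


def summarize_failures(log_text):
    """Tally core return codes and Force RMSE errors per folding slot.

    Returns a pair (codes, rmse):
      codes maps slot -> {return code: count}
      rmse  maps slot -> [(error, threshold), ...]
    """
    codes = {}
    rmse = {}
    for raw in log_text.splitlines():
        line_match = LINE_RE.match(raw.strip())
        if not line_match:
            continue  # header lines, blank lines, lines without a slot
        slot = "FS" + line_match.group(1)
        message = line_match.group(2)

        returned = RETURNED_RE.search(message)
        if returned:
            per_slot = codes.setdefault(slot, {})
            code = returned.group(1)
            per_slot[code] = per_slot.get(code, 0) + 1

        error = RMSE_RE.search(message)
        if error:
            pair = (float(error.group(1)), float(error.group(2)))
            rmse.setdefault(slot, []).append(pair)
    return codes, rmse


if __name__ == "__main__":
    import sys

    # Usage: python summarize_failures.py < log.txt
    slot_codes, slot_rmse = summarize_failures(sys.stdin.read())
    for slot in sorted(set(slot_codes) | set(slot_rmse)):
        print(slot, slot_codes.get(slot, {}), slot_rmse.get(slot, []))
```

If every BAD_WORK_UNIT and every Force RMSE error lands on the same slot across different projects, that points at the card or its driver rather than at one bad work unit.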
Breach
Posts: 205
Joined: Sat Mar 09, 2013 8:07 pm
Location: Brussels, Belgium

Re: New Build, 1 GPU folded a week and now fails

Post by Breach »

Windows 11 x64 / 5800X@5Ghz / 32GB DDR4 3800 CL14 / 4090 FE / Creative Titanium HD / Sennheiser 650 / PSU Corsair AX1200i
s/j
Posts: 16
Joined: Sun Sep 20, 2009 2:53 pm

Re: New Build, 1 GPU folded a week and now fails

Post by s/j »

Thanks Breach. Will check this out.