Re: Bad work units on NV GPU slot

It seems that a lot of GPU problems revolve around specific versions of drivers. Though NVidia has their own support structure, you can often learn from information reported by others who fold.

Moderators: Site Moderators, FAHC Science Team

Nick200
Posts: 86
Joined: Fri Jun 20, 2014 6:40 am
Location: New Zealand

Re: Bad work units on NV GPU slot

Post by Nick200 »

Hi Folding folks

Just want to report that my folding machines are failing on a large number of WUs for some reason. They are all on GPU slots.

They are all on different versions of Windows 10, but all run 7.4.4 (I tried the later version but had too many problems with my multiple-GPU setups) and have the latest nVidia driver (375.57). The main problem on one machine relates to a GTX 780, but its companion 980 has had similar problems; it's just not showing that at the moment.

I observe for each machine frequent downloads of WUs, then the machine discarding them and trying another until finally the GPU slot fails.
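For anyone who wants to confirm this download/discard pattern from a log rather than by watching the client, here is a rough Python sketch that counts BAD_WORK_UNIT returns per folding slot and names the OpenCL status code in the error lines. It assumes the 7.4.x log line layout seen in this thread (time, optional WARNING:, WUnn, FSnn, message); the function name `summarize_failures` is made up for illustration. The status names come from the Khronos cl.h header, where -5 is CL_OUT_OF_RESOURCES.

```python
import re
from collections import Counter

# Client log lines look like:
#   06:34:43:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
LINE_RE = re.compile(
    r"^(?P<time>\d\d:\d\d:\d\d):(?:WARNING:|ERROR:)?"
    r"WU(?P<wu>\d+):FS(?P<slot>\d+):(?P<msg>.*)$"
)

# Picks the OpenCL call and status out of e.g. "clEnqueueReadBuffer (-5)"
CL_CALL_RE = re.compile(r"\b(cl[A-Za-z]+) \((-?\d+)\)")

# A few OpenCL status codes as defined in the Khronos cl.h header
CL_ERRORS = {
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}


def summarize_failures(log_text):
    """Return (bad_units, cl_errors) counters keyed by folding slot.

    bad_units maps "FSnn" to the number of BAD_WORK_UNIT returns.
    cl_errors maps (slot, OpenCL call, status name) to a count.
    """
    bad_units = Counter()
    cl_errors = Counter()
    for raw in log_text.splitlines():
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # header lines, config dumps, etc.
        slot = "FS" + m.group("slot")
        msg = m.group("msg")
        if "FahCore returned: BAD_WORK_UNIT" in msg:
            bad_units[slot] += 1
        call = CL_CALL_RE.search(msg)
        if call:
            code = int(call.group(2))
            name = CL_ERRORS.get(code, str(code))
            cl_errors[(slot, call.group(1), name)] += 1
    return bad_units, cl_errors
```

Passing the contents of the client's log.txt to `summarize_failures` gives a per-slot tally. If every GPU slot fails with the same OpenCL status across unrelated projects, that points more toward the driver or hardware than toward any particular work unit.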

I attach the log file from one of them:

Code:

*********************** Log Started 2016-10-23T06:33:49Z ***********************
06:34:23:WU00:FS01:Connecting to 171.67.108.45:80
06:34:26:WU00:FS01:Assigned to work server 140.163.4.244
06:34:26:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GM204 [GeForce GTX 980] from 140.163.4.244
06:34:26:WU00:FS01:Connecting to 140.163.4.244:8080
06:34:28:WU00:FS01:Downloading 2.77MiB
06:34:30:WU00:FS01:Download complete
06:34:30:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:13500 run:2 clone:419 gen:28 core:0x21 unit:0x000000268ca304f457a359cb20d62cbd
06:34:30:WU00:FS01:Starting
06:34:30:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/nickm/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21.exe -dir 00 -suffix 01 -version 704 -lifeline 8256 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
06:34:30:WU00:FS01:Started FahCore on PID 15704
06:34:31:WU00:FS01:Core PID:5532
06:34:31:WU00:FS01:FahCore 0x21 started
06:34:32:WU00:FS01:0x21:*********************** Log Started 2016-10-23T06:34:32Z ***********************
06:34:32:WU00:FS01:0x21:Project: 13500 (Run 2, Clone 419, Gen 28)
06:34:32:WU00:FS01:0x21:Unit: 0x000000268ca304f457a359cb20d62cbd
06:34:32:WU00:FS01:0x21:CPU: 0x00000000000000000000000000000000
06:34:32:WU00:FS01:0x21:Machine: 1
06:34:32:WU00:FS01:0x21:Reading tar file core.xml
06:34:32:WU00:FS01:0x21:Reading tar file system.xml
06:34:32:WU00:FS01:0x21:Reading tar file integrator.xml
06:34:32:WU00:FS01:0x21:Reading tar file state.xml
06:34:33:WU00:FS01:0x21:Digital signatures verified
06:34:33:WU00:FS01:0x21:Folding@home GPU Core21 Folding@home Core
06:34:33:WU00:FS01:0x21:Version 0.0.17
06:34:40:WU00:FS01:0x21:ERROR:exception: Error downloading array interactionCount: clEnqueueReadBuffer (-5)
06:34:40:WU00:FS01:0x21:Saving result file logfile_01.txt
06:34:40:WU00:FS01:0x21:Saving result file log.txt
06:34:40:WU00:FS01:0x21:Folding@home Core Shutdown: BAD_WORK_UNIT
06:34:43:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
06:34:43:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13500 run:2 clone:419 gen:28 core:0x21 unit:0x000000268ca304f457a359cb20d62cbd
06:34:43:WU00:FS01:Uploading 2.54KiB to 140.163.4.244
06:34:43:WU00:FS01:Connecting to 140.163.4.244:8080
06:34:44:WU00:FS01:Upload complete
06:34:44:WU00:FS01:Server responded WORK_ACK (400)
06:34:44:WU00:FS01:Cleaning up
06:35:05:WU00:FS02:Connecting to 171.67.108.45:80
06:35:06:WU00:FS02:Assigned to work server 140.163.4.244
06:35:06:WU00:FS02:Requesting new work unit for slot 02: READY gpu:1:GK110 [GeForce GTX 780] from 140.163.4.244
06:35:06:WU00:FS02:Connecting to 140.163.4.244:8080
06:35:08:WU00:FS02:Downloading 2.54MiB
06:35:11:WU00:FS02:Download complete
06:35:11:WU00:FS02:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:10490 run:232 clone:0 gen:444 core:0x18 unit:0x000001ff8ca304f45537e902f17f7939
06:35:11:WU00:FS02:Starting
06:35:11:WU00:FS02:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/nickm/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_18.fah/FahCore_18.exe -dir 00 -suffix 01 -version 704 -lifeline 8256 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
06:35:11:WU00:FS02:Started FahCore on PID 15424
06:35:11:WU00:FS02:Core PID:3376
06:35:11:WU00:FS02:FahCore 0x18 started
06:35:13:WU00:FS02:0x18:*********************** Log Started 2016-10-23T06:35:12Z ***********************
06:35:13:WU00:FS02:0x18:Project: 10490 (Run 232, Clone 0, Gen 444)
06:35:13:WU00:FS02:0x18:Unit: 0x000001ff8ca304f45537e902f17f7939
06:35:13:WU00:FS02:0x18:CPU: 0x00000000000000000000000000000000
06:35:13:WU00:FS02:0x18:Machine: 2
06:35:13:WU00:FS02:0x18:Reading tar file core.xml
06:35:13:WU00:FS02:0x18:Reading tar file system.xml
06:35:13:WU00:FS02:0x18:Reading tar file integrator.xml
06:35:13:WU00:FS02:0x18:Reading tar file state.xml
06:35:13:WU00:FS02:0x18:Digital signatures verified
06:35:13:WU00:FS02:0x18:Folding@home GPU core18
06:35:13:WU00:FS02:0x18:Version 0.0.4
06:35:28:WU00:FS02:0x18:Completed 0 out of 5000000 steps (0%)
06:35:28:WU00:FS02:0x18:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
06:37:00:WU00:FS02:0x18:ERROR:exception: Error downloading array posq: clEnqueueReadBuffer (-5)
06:37:00:WU00:FS02:0x18:Saving result file logfile_01.txt
06:37:00:WU00:FS02:0x18:Saving result file log.txt
06:37:00:WU00:FS02:0x18:Folding@home Core Shutdown: BAD_WORK_UNIT
06:37:03:WARNING:WU00:FS02:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
06:37:03:WU00:FS02:Sending unit results: id:00 state:SEND error:FAULTY project:10490 run:232 clone:0 gen:444 core:0x18 unit:0x000001ff8ca304f45537e902f17f7939
06:37:03:WU00:FS02:Uploading 2.66KiB to 140.163.4.244
06:37:03:WU00:FS02:Connecting to 140.163.4.244:8080
06:37:04:WU00:FS02:Upload complete
06:37:04:WU00:FS02:Server responded WORK_ACK (400)
06:37:04:WU00:FS02:Cleaning up
06:37:23:WU00:FS02:Connecting to 171.67.108.45:80
06:37:25:WU00:FS02:Assigned to work server 171.67.108.105
06:37:25:WU00:FS02:Requesting new work unit for slot 02: READY gpu:1:GK110 [GeForce GTX 780] from 171.67.108.105
06:37:25:WU00:FS02:Connecting to 171.67.108.105:8080
06:37:26:WU00:FS02:Downloading 20.15MiB
06:37:31:WU00:FS02:Download complete
06:37:31:WU00:FS02:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:9177 run:7 clone:8 gen:50 core:0x21 unit:0x0000003dab436c6957b24c29a356c742
06:37:31:WU00:FS02:Starting
06:37:31:WU00:FS02:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/nickm/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21.exe -dir 00 -suffix 01 -version 704 -lifeline 8256 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
06:37:31:WU00:FS02:Started FahCore on PID 5784
06:37:31:WU00:FS02:Core PID:7772
06:37:31:WU00:FS02:FahCore 0x21 started
06:37:33:WU00:FS02:0x21:*********************** Log Started 2016-10-23T06:37:32Z ***********************
06:37:33:WU00:FS02:0x21:Project: 9177 (Run 7, Clone 8, Gen 50)
06:37:33:WU00:FS02:0x21:Unit: 0x0000003dab436c6957b24c29a356c742
06:37:33:WU00:FS02:0x21:CPU: 0x00000000000000000000000000000000
06:37:33:WU00:FS02:0x21:Machine: 2
06:37:33:WU00:FS02:0x21:Reading tar file core.xml
06:37:33:WU00:FS02:0x21:Reading tar file integrator.xml
06:37:33:WU00:FS02:0x21:Reading tar file state.xml
06:37:33:WU00:FS02:0x21:Reading tar file system.xml
06:37:33:WU00:FS02:0x21:Digital signatures verified
06:37:33:WU00:FS02:0x21:Folding@home GPU Core21 Folding@home Core
06:37:33:WU00:FS02:0x21:Version 0.0.17
06:37:38:WU00:FS02:0x21:ERROR:exception: Error downloading array interactionCount: clEnqueueReadBuffer (-5)
06:37:38:WU00:FS02:0x21:Saving result file logfile_01.txt
06:37:38:WU00:FS02:0x21:Saving result file log.txt
06:37:38:WU00:FS02:0x21:Folding@home Core Shutdown: BAD_WORK_UNIT
06:37:42:WARNING:WU00:FS02:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
06:37:42:WU00:FS02:Sending unit results: id:00 state:SEND error:FAULTY project:9177 run:7 clone:8 gen:50 core:0x21 unit:0x0000003dab436c6957b24c29a356c742
06:37:42:WU00:FS02:Uploading 7.50KiB to 171.67.108.105
06:37:42:WU00:FS02:Connecting to 171.67.108.105:8080
06:37:47:WU00:FS02:Upload complete
06:37:47:WU00:FS02:Server responded WORK_ACK (400)
06:37:47:WU00:FS02:Cleaning up
06:37:50:WU00:FS01:Connecting to 171.67.108.45:80
06:37:51:WU00:FS01:Assigned to work server 171.67.108.155
06:37:51:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GM204 [GeForce GTX 980] from 171.67.108.155
06:37:51:WU00:FS01:Connecting to 171.67.108.155:8080
06:37:54:WU00:FS01:Downloading 902.77KiB
06:37:55:WU00:FS01:Download complete
06:37:55:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:9660 run:0 clone:73 gen:87 core:0x18 unit:0x00000069ab436c9b56de69ba9dccc137
06:37:55:WU00:FS01:Starting
06:37:55:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/nickm/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_18.fah/FahCore_18.exe -dir 00 -suffix 01 -version 704 -lifeline 8256 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
06:37:55:WU00:FS01:Started FahCore on PID 1176
06:37:55:WU00:FS01:Core PID:14804
06:37:55:WU00:FS01:FahCore 0x18 started
06:37:57:WU00:FS01:0x18:*********************** Log Started 2016-10-23T06:37:57Z ***********************
06:37:57:WU00:FS01:0x18:Project: 9660 (Run 0, Clone 73, Gen 87)
06:37:57:WU00:FS01:0x18:Unit: 0x00000069ab436c9b56de69ba9dccc137
06:37:57:WU00:FS01:0x18:CPU: 0x00000000000000000000000000000000
06:37:57:WU00:FS01:0x18:Machine: 1
06:37:57:WU00:FS01:0x18:Reading tar file core.xml
06:37:57:WU00:FS01:0x18:Reading tar file integrator.xml
06:37:57:WU00:FS01:0x18:Reading tar file state.xml
06:37:57:WU00:FS01:0x18:Reading tar file system.xml
06:37:57:WU00:FS01:0x18:Digital signatures verified
06:37:57:WU00:FS01:0x18:Folding@home GPU core18
06:37:57:WU00:FS01:0x18:Version 0.0.4
06:38:06:WU00:FS01:0x18:Completed 0 out of 2000000 steps (0%)
06:38:06:WU00:FS01:0x18:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
06:38:37:WU00:FS01:0x18:Completed 20000 out of 2000000 steps (1%)
06:39:12:WU00:FS01:0x18:Completed 40000 out of 2000000 steps (2%)
06:39:40:WU00:FS01:0x18:Completed 60000 out of 2000000 steps (3%)
06:40:11:WU00:FS01:0x18:Completed 80000 out of 2000000 steps (4%)
06:40:39:WU00:FS01:0x18:Completed 100000 out of 2000000 steps (5%)
06:41:11:WU00:FS01:0x18:Completed 120000 out of 2000000 steps (6%)
06:41:39:WU00:FS01:0x18:Completed 140000 out of 2000000 steps (7%)
06:42:07:WU00:FS01:0x18:Completed 160000 out of 2000000 steps (8%)
06:42:35:WU00:FS01:0x18:Completed 180000 out of 2000000 steps (9%)
06:43:03:WU00:FS01:0x18:Completed 200000 out of 2000000 steps (10%)
06:43:34:WU00:FS01:0x18:Completed 220000 out of 2000000 steps (11%)
06:44:02:WU00:FS01:0x18:Completed 240000 out of 2000000 steps (12%)
06:44:30:WU00:FS01:0x18:Completed 260000 out of 2000000 steps (13%)
06:44:58:WU00:FS01:0x18:Completed 280000 out of 2000000 steps (14%)
06:45:26:WU00:FS01:0x18:Completed 300000 out of 2000000 steps (15%)
06:45:57:WU00:FS01:0x18:Completed 320000 out of 2000000 steps (16%)
06:46:25:WU00:FS01:0x18:Completed 340000 out of 2000000 steps (17%)
06:46:53:WU00:FS01:0x18:Completed 360000 out of 2000000 steps (18%)
06:47:21:WU00:FS01:0x18:Completed 380000 out of 2000000 steps (19%)
06:47:49:WU00:FS01:0x18:Completed 400000 out of 2000000 steps (20%)
06:48:20:WU00:FS01:0x18:Completed 420000 out of 2000000 steps (21%)
BTW, I have stopped folding on my CPU slots as we are heading to summer and I want to reduce thermal overload.

Any suggestions?

Cheers

Nick
Joe_H
Site Admin
Posts: 7856
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Bad work units on GPU slot

Post by Joe_H »

See this topic: viewtopic.php?f=80&t=29276. You will need to roll back to an older version of the drivers. The 375.57 release was for gaming compatibility and seems to have broken the OpenCL processing needed for folding.
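If you run several machines, a small check like the following may save a trip to each one. It is only a sketch: the set of bad versions contains just the one release named in this thread, the helper names are invented for illustration, and it assumes `nvidia-smi` is installed and on the PATH (on Windows it often is not by default).

```python
import subprocess

# Assumption: the only release reported in this thread as breaking OpenCL folding.
KNOWN_BAD_DRIVERS = {"375.57"}


def is_known_bad(driver_version, bad=KNOWN_BAD_DRIVERS):
    """True if the given driver version string is in the known-bad set."""
    return driver_version.strip() in bad


def installed_driver_versions():
    """Ask nvidia-smi for the driver version, one entry per GPU.

    Returns an empty list if nvidia-smi is missing or fails.
    """
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]
```

Something like `any(is_known_bad(v) for v in installed_driver_versions())` then flags a machine that still has the affected release installed.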

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Nick200
Posts: 86
Joined: Fri Jun 20, 2014 6:40 am
Location: New Zealand

Re: Bad work units on GPU slot

Post by Nick200 »

Thanks, Joe_H,
So, back to 373.06? Just tried that on the affected machine - and it seems to work fine ...
Cheers
Nick
Aurum
Posts: 296
Joined: Sat Oct 03, 2015 3:15 pm
Location: The Great Basin

Bad work units on GPU slot

Post by Aurum »

Just got my first Nvidia card yesterday and all it's done is crash my PC. I bought a used EVGA 980 Ti off eBay. I swapped out my RAM, updated my Samsung 840 EVO firmware, updated to the latest MB BIOS, and disabled onboard graphics. The PC was running fine before with any of several generations of AMD cards. I did a Safe Mode AMD driver clean using Guru3D DDU. I started by trying the Nvidia 373.06 driver. Then I did a DDU clean and tried the 376.48 driver.

The article about Linux benchmarking 7im posted gave me the idea to look for a Windows video rendering benchmark program and I tried Unigine Heaven Benchmark 4.0 that moves through 25,000 frames of the tour of a floating village. My RX 480 breezed through it even with F@H 7.4.15 running at Full, a bit herky jerky.
FPS 36.5
Score 919
Min FPS 8.4
Max FPS 104.2
After quitting F@H:
FPS 97.8
Score 2463
Min FPS 15.8
Max FPS 170.1

The used 980 Ti crashed the PC at scene 21 of 26.
When the PC crashes, the blue screen says: DRIVER_IRQL_NOT_LESS_OR_EQUAL
Anyone know what this means?

I've heard of Furmark and others. Any recommendations for making sure my graphics cards are suitable for folding?
https://unigine.com/products/benchmarks/heaven/
In Science We Trust
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Bad work units on GPU slot

Post by bruce »

DRIVER_IRQL_NOT_LESS_OR_EQUAL is certainly a sign of a driver problem, but a hardware problem can mimic it. I'd recommend a clean install of whichever driver version you choose (373.06 or 376.48), followed by a reboot and then a reinstall of the FAHClient software.

It sounds like you've tried several recommended fixes. First, I'd like to look at your log to see what it says after you've taken those steps.

FAHBench is pretty versatile and it will normally give you a good indication if your GPU is set up properly to run FAH.
foldy
Posts: 2061
Joined: Sat Dec 01, 2012 3:43 pm
Hardware configuration: Folding@Home Client 7.6.13 (1 GPU slots)
Windows 7 64bit
Intel Core i5 2500k@4Ghz
Nvidia gtx 1080ti driver 441

Re: Bad work units on GPU slot

Post by foldy »

Did the GTX 980 Ti fail in Heaven Benchmark while you were folding? In general you should not do folding and heavy gaming on the same GPU concurrently.
Aurum
Posts: 296
Joined: Sat Oct 03, 2015 3:15 pm
Location: The Great Basin

Re: Bad work units on GPU slot

Post by Aurum »

This is the 4th time I did a complete driver clean and install, and F@H 7.4.15 still makes bad WUs. I left it at 80% after an Express install and it's yet to crash, but it's only a matter of time. To be clear:
Uninstalled F@H 7.4.15 and deleted the AppData/Roaming/FAHClient folder.
Uninstalled 376.48 using DDU: it first reboots in Safe Mode, then I selected the Nvidia clean (bottom button), then a driver clean (top button), and rebooted.
Installed 376.48 with the Custom option, selecting only the graphics driver and not the 3D stuff, GeForce Experience, or PhysX.
Installed F@H 7.4.15 Express and let it launch.
Hopefully my 1070 arrives today and I can test the PC with a different Nvidia card.

Code:

*********************** Log Started 2017-01-04T20:24:08Z ***********************
20:24:08:************************* Folding@home Client *************************
20:24:08:        Website: http://folding.stanford.edu/
20:24:08:      Copyright: (c) 2009-2016 Stanford University
20:24:08:         Author: Joseph Coffland <joseph@cauldrondevelopment.com>
20:24:08:           Args: --open-web-control
20:24:08:         Config: <none>
20:24:08:******************************** Build ********************************
20:24:08:        Version: 7.4.15
20:24:08:           Date: Aug 17 2016
20:24:08:           Time: 04:33:41
20:24:08:     Repository: Git
20:24:08:       Revision: 4f3e0e25571a9f691719f0c273739294bde517dd
20:24:08:         Branch: master
20:24:08:       Compiler: GNU 5.3.1 20160205
20:24:08:        Options: -std=gnu++98 -I/mingw64/include -O3 -funroll-loops -ffast-math
20:24:08:                 -mfpmath=sse -fno-unsafe-math-optimizations -msse2
20:24:08:       Platform: linux2 4.6.0-1-amd64
20:24:08:           Bits: 64
20:24:08:           Mode: Release
20:24:08:******************************* System ********************************
20:24:08:            CPU: Intel(R) Core(TM) i3-4130T CPU @ 2.90GHz
20:24:08:         CPU ID: GenuineIntel Family 6 Model 60 Stepping 3
20:24:08:           CPUs: 4
20:24:08:         Memory: 15.68GiB
20:24:08:    Free Memory: 13.99GiB
20:24:08:        Threads: WINDOWS_THREADS
20:24:08:     OS Version: 6.1
20:24:08:    Has Battery: false
20:24:08:     On Battery: false
20:24:08:     UTC Offset: -8
20:24:08:            PID: 3252
20:24:08:            CWD: C:\Users\BitCoin_Node\AppData\Roaming\FAHClient
20:24:08:             OS: Windows 7 Home Premium Service Pack 1
20:24:08:        OS Arch: AMD64
20:24:08:           GPUs: 0
20:24:08:  CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:5.2 Driver:8.0
20:24:08:OpenCL Device 0: Platform:0 Device:0 Bus:NA Slot:NA Compute:1.2 Driver:4.2
20:24:08:OpenCL Device 1: Platform:0 Device:1 Bus:NA Slot:NA Compute:1.2 Driver:10.18
20:24:08:OpenCL Device 2: Platform:1 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:376.48
20:24:08:  Win32 Service: false
20:24:08:***********************************************************************
20:24:08:<config>
20:24:08:  <!-- Folding Slots -->
20:24:08:</config>
20:24:08:Connecting to assign-GPU.stanford.edu:80
20:24:09:Updated GPUs.txt
20:24:09:Read GPUs.txt
20:24:09:Trying to access database...
20:24:09:Successfully acquired database lock
20:24:09:Enabled folding slot 00: PAUSED cpu:2 (not configured)
20:24:09:Enabled folding slot 01: PAUSED gpu:0:GM200 [GeForce GTX 980 Ti] (not configured)
20:24:17:12:127.0.0.1:New Web connection
20:25:03:Set client configured
20:25:04:WU00:FS00:Connecting to 171.67.108.45:8080
20:25:04:WU01:FS01:Connecting to 171.67.108.45:8080
20:25:06:WU00:FS00:Connecting to 171.67.108.45:8080
20:25:06:WU01:FS01:Connecting to 171.67.108.45:80
20:25:07:WU00:FS00:Assigned to work server 171.67.108.158
20:25:07:WU00:FS00:Requesting new work unit for slot 00: READY cpu:2 from 171.67.108.158
20:25:07:WU00:FS00:Connecting to 171.67.108.158:8080
20:25:07:WU01:FS01:Assigned to work server 171.67.108.105
20:25:07:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:GM200 [GeForce GTX 980 Ti] from 171.67.108.105
20:25:07:WU01:FS01:Connecting to 171.67.108.105:8080
20:25:08:WU01:FS01:Downloading 20.73MiB
20:25:08:WU00:FS00:Downloading 806.97KiB
20:25:10:Saving configuration to config.xml
20:25:10:<config>
20:25:10:  <!-- User Information -->
20:25:10:  <passkey v='********************************'/>
20:25:10:  <team v='224497'/>
20:25:10:  <user v='Aurum_ALL_1KJd683KCGdt9eV9DNUXRDxs4ZgQCEdURJ'/>
20:25:10:
20:25:10:  <!-- Folding Slots -->
20:25:10:  <slot id='0' type='CPU'/>
20:25:10:  <slot id='1' type='GPU'/>
20:25:10:</config>
20:25:14:WU01:FS01:Download 1.21%
20:25:15:WU00:FS00:Download 39.65%
20:25:20:WU01:FS01:Download 2.11%
20:25:22:WU00:FS00:Download 71.38%
20:25:26:WU01:FS01:Download 3.32%
20:25:28:WU00:FS00:Download 100.00%
20:25:28:WU00:FS00:Download complete
20:25:28:WU00:FS00:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:9035 run:75 clone:3 gen:623 core:0xa4 unit:0x000002aaab436c9e56982ef03f14a0db
20:25:28:WU00:FS00:Downloading core from http://web.stanford.edu/~pande/Win32/AMD64/Core_a4.fah
20:25:28:WU00:FS00:Connecting to web.stanford.edu:80
20:25:28:WU00:FS00:FahCore a4: Downloading 2.89MiB
20:25:32:WU01:FS01:Download 5.43%
20:25:35:WU00:FS00:FahCore a4: 4.33%
20:25:38:WU01:FS01:Download 8.14%
20:25:41:WU00:FS00:FahCore a4: 17.31%
20:25:44:WU01:FS01:Download 9.95%
20:25:50:WU00:FS00:FahCore a4: 25.97%
20:25:50:WU01:FS01:Download 11.46%
20:25:57:WU00:FS00:FahCore a4: 32.46%
20:25:57:WU01:FS01:Download 12.96%
20:25:59:Saving configuration to config.xml
20:25:59:<config>
20:25:59:  <!-- HTTP Server -->
20:25:59:  <allow v='192.168.1.253'/>
20:25:59:
20:25:59:  <!-- Network -->
20:25:59:  <proxy v=':8080'/>
20:25:59:
20:25:59:  <!-- Remote Command Server -->
20:25:59:  <password v='*******'/>
20:25:59:
20:25:59:  <!-- User Information -->
20:25:59:  <passkey v='********************************'/>
20:25:59:  <team v='224497'/>
20:25:59:  <user v='Aurum_ALL_1KJd683KCGdt9eV9DNUXRDxs4ZgQCEdURJ'/>
20:25:59:
20:25:59:  <!-- Folding Slots -->
20:25:59:  <slot id='0' type='CPU'/>
20:25:59:  <slot id='1' type='GPU'/>
20:25:59:</config>
20:25:59:ERROR:Receive error: 10053: An established connection was aborted by the software in your host machine.
20:26:03:WU00:FS00:FahCore a4: 36.79%
20:26:03:WU01:FS01:Download 14.17%
20:26:09:WU01:FS01:Download 15.38%
20:26:10:WU00:FS00:FahCore a4: 45.45%
20:26:11:Saving configuration to config.xml
20:26:11:<config>
20:26:11:  <!-- HTTP Server -->
20:26:11:  <allow v='192.168.1.253'/>
20:26:11:
20:26:11:  <!-- Network -->
20:26:11:  <proxy v=':8080'/>
20:26:11:
20:26:11:  <!-- Remote Command Server -->
20:26:11:  <password v='*******'/>
20:26:11:
20:26:11:  <!-- User Information -->
20:26:11:  <passkey v='********************************'/>
20:26:11:  <team v='224497'/>
20:26:11:  <user v='Aurum_ALL_1KJd683KCGdt9eV9DNUXRDxs4ZgQCEdURJ'/>
20:26:11:
20:26:11:  <!-- Folding Slots -->
20:26:11:  <slot id='0' type='CPU'/>
20:26:11:  <slot id='1' type='GPU'/>
20:26:11:</config>
20:26:15:WU01:FS01:Download 16.58%
20:26:17:WU00:FS00:FahCore a4: 49.78%
20:26:21:WU01:FS01:Download 17.49%
20:26:26:WU00:FS00:FahCore a4: 54.10%
20:26:28:WU01:FS01:Download 18.69%
20:26:29:Saving configuration to config.xml
20:26:29:<config>
20:26:29:  <!-- HTTP Server -->
20:26:29:  <allow v='192.168.1.253'/>
20:26:29:
20:26:29:  <!-- Network -->
20:26:29:  <proxy v=':8080'/>
20:26:29:
20:26:29:  <!-- Remote Command Server -->
20:26:29:  <password v='*******'/>
20:26:29:
20:26:29:  <!-- User Information -->
20:26:29:  <passkey v='********************************'/>
20:26:29:  <team v='224497'/>
20:26:29:  <user v='Aurum_ALL_1KJd683KCGdt9eV9DNUXRDxs4ZgQCEdURJ'/>
20:26:29:
20:26:29:  <!-- Folding Slots -->
20:26:29:  <slot id='0' type='CPU'/>
20:26:29:  <slot id='1' type='GPU'/>
20:26:29:</config>
20:26:33:WU00:FS00:FahCore a4: 60.60%
20:26:35:WU01:FS01:Download 19.29%
20:26:40:WU00:FS00:FahCore a4: 69.25%
20:26:42:WU01:FS01:Download 20.80%
20:26:47:WU00:FS00:FahCore a4: 77.91%
20:26:48:WU01:FS01:Download 22.91%
20:26:54:WU01:FS01:Download 24.72%
20:26:55:WU00:FS00:FahCore a4: 82.24%
20:27:00:WU01:FS01:Download 26.23%
20:27:03:WU00:FS00:FahCore a4: 95.22%
20:27:06:WU00:FS00:FahCore a4: Download complete
20:27:07:WU00:FS00:Valid core signature
20:27:07:WU00:FS00:Unpacked 9.59MiB to cores/web.stanford.edu/~pande/Win32/AMD64/Core_a4.fah/FahCore_a4.exe
20:27:07:WU00:FS00:Starting
20:27:07:WU00:FS00:Running FahCore: "C:\Program Files\FAHClient/FAHCoreWrapper.exe" C:\Users\BitCoin_Node\AppData\Roaming\FAHClient\cores/web.stanford.edu/~pande/Win32/AMD64/Core_a4.fah/FahCore_a4.exe -dir 00 -suffix 01 -version 704 -lifeline 3252 -checkpoint 15 -np 2
20:27:07:WU01:FS01:Download 27.74%
20:27:07:WU00:FS00:Started FahCore on PID 4764
20:27:08:WU00:FS00:Core PID:4776
20:27:08:WU00:FS00:FahCore 0xa4 started
20:27:08:WU00:FS00:0xa4:
20:27:08:WU00:FS00:0xa4:*------------------------------*
20:27:08:WU00:FS00:0xa4:Folding@Home Gromacs GB Core
20:27:08:WU00:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
20:27:08:WU00:FS00:0xa4:
20:27:08:WU00:FS00:0xa4:Preparing to commence simulation
20:27:08:WU00:FS00:0xa4:- Looking at optimizations...
20:27:08:WU00:FS00:0xa4:- Created dyn
20:27:08:WU00:FS00:0xa4:- Files status OK
20:27:08:WU00:FS00:0xa4:- Expanded 825830 -> 1402156 (decompressed 169.7 percent)
20:27:08:WU00:FS00:0xa4:Called DecompressByteArray: compressed_data_size=825830 data_size=1402156, decompressed_data_size=1402156 diff=0
20:27:08:WU00:FS00:0xa4:- Digital signature verified
20:27:08:WU00:FS00:0xa4:
20:27:08:WU00:FS00:0xa4:Project: 9035 (Run 75, Clone 3, Gen 623)
20:27:08:WU00:FS00:0xa4:
20:27:08:WU00:FS00:0xa4:Assembly optimizations on if available.
20:27:08:WU00:FS00:0xa4:Entering M.D.
20:27:13:WU01:FS01:Download 29.24%
20:27:14:WU00:FS00:0xa4:Mapping NT from 2 to 2 
20:27:14:WU00:FS00:0xa4:Completed 0 out of 250000 steps  (0%)
20:27:19:WU01:FS01:Download 31.05%
20:27:25:WU01:FS01:Download 32.56%
20:27:31:WU01:FS01:Download 33.77%
20:27:37:WU01:FS01:Download 34.97%
20:27:43:WU01:FS01:Download 36.48%
20:27:49:WU01:FS01:Download 37.99%
20:27:55:WU01:FS01:Download 40.10%
20:28:01:WU01:FS01:Download 42.51%
20:28:07:WU01:FS01:Download 44.32%
20:28:13:WU01:FS01:Download 47.03%
20:28:19:WU01:FS01:Download 49.74%
20:28:25:WU01:FS01:Download 51.85%
20:28:31:WU01:FS01:Download 54.27%
20:28:37:WU01:FS01:Download 56.07%
20:28:43:WU01:FS01:Download 58.79%
20:28:50:WU01:FS01:Download 61.20%
20:28:56:WU01:FS01:Download 63.31%
20:29:02:WU01:FS01:Download 64.82%
20:29:08:WU01:FS01:Download 65.72%
20:29:14:WU01:FS01:Download 67.23%
20:29:20:WU01:FS01:Download 68.74%
20:29:27:WU01:FS01:Download 70.24%
20:29:33:WU01:FS01:Download 71.45%
20:29:39:WU01:FS01:Download 73.56%
20:29:45:WU01:FS01:Download 75.07%
20:29:51:WU01:FS01:Download 77.48%
20:29:57:WU01:FS01:Download 80.19%
20:30:03:WU01:FS01:Download 82.00%
20:30:09:WU01:FS01:Download 84.11%
20:30:15:WU01:FS01:Download 85.92%
20:30:21:WU01:FS01:Download 87.73%
20:30:27:WU01:FS01:Download 89.24%
20:30:33:WU01:FS01:Download 91.05%
20:30:39:WU01:FS01:Download 92.55%
20:30:45:WU01:FS01:Download 94.36%
20:30:51:WU01:FS01:Download 96.47%
20:30:54:WU00:FS00:0xa4:Completed 2500 out of 250000 steps  (1%)
20:30:57:WU01:FS01:Download 97.68%
20:31:03:WU01:FS01:Download 99.19%
20:31:06:WU01:FS01:Download complete
20:31:06:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:9177 run:17 clone:15 gen:160 core:0x21 unit:0x0000010bab436c6957b24c29f65f352b
20:31:06:WU01:FS01:Downloading core from http://web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_21.fah
20:31:06:WU01:FS01:Connecting to web.stanford.edu:80
20:31:06:WU01:FS01:FahCore 21: Downloading 3.47MiB
20:31:12:WU01:FS01:FahCore 21: 7.20%
20:31:19:WU01:FS01:FahCore 21: 10.80%
20:31:26:WU01:FS01:FahCore 21: 16.20%
20:31:32:WU01:FS01:FahCore 21: 21.60%
20:31:38:WU01:FS01:FahCore 21: 28.80%
20:31:44:WU01:FS01:FahCore 21: 35.99%
20:31:50:WU01:FS01:FahCore 21: 44.99%
20:31:56:WU01:FS01:FahCore 21: 52.19%
20:32:02:WU01:FS01:FahCore 21: 59.39%
20:32:09:WU01:FS01:FahCore 21: 64.79%
20:32:15:WU01:FS01:FahCore 21: 71.99%
20:32:21:WU01:FS01:FahCore 21: 77.39%
20:32:27:WU01:FS01:FahCore 21: 86.39%
20:32:34:WU01:FS01:FahCore 21: 91.79%
20:32:40:WU01:FS01:FahCore 21: 98.99%
20:32:40:WU01:FS01:FahCore 21: Download complete
20:32:40:WU01:FS01:Valid core signature
20:32:40:WU01:FS01:Unpacked 11.81MiB to cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21.exe
20:32:40:WU01:FS01:Starting
20:32:40:WU01:FS01:Running FahCore: "C:\Program Files\FAHClient/FAHCoreWrapper.exe" C:\Users\BitCoin_Node\AppData\Roaming\FAHClient\cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21.exe -dir 01 -suffix 01 -version 704 -lifeline 3252 -checkpoint 15 -opencl-platform 1 -gpu-vendor nvidia -gpu 0
20:32:40:WU01:FS01:Started FahCore on PID 2896
20:32:41:WU01:FS01:Core PID:2248
20:32:41:WU01:FS01:FahCore 0x21 started
20:32:42:WU01:FS01:0x21:*********************** Log Started 2017-01-04T20:32:42Z ***********************
20:32:42:WU01:FS01:0x21:Project: 9177 (Run 17, Clone 15, Gen 160)
20:32:42:WU01:FS01:0x21:Unit: 0x0000010bab436c6957b24c29f65f352b
20:32:42:WU01:FS01:0x21:CPU: 0x00000000000000000000000000000000
20:32:42:WU01:FS01:0x21:Machine: 1
20:32:42:WU01:FS01:0x21:Reading tar file core.xml
20:32:42:WU01:FS01:0x21:Reading tar file integrator.xml
20:32:42:WU01:FS01:0x21:Reading tar file state.xml
20:32:42:WU01:FS01:0x21:Reading tar file system.xml
20:32:42:WU01:FS01:0x21:Digital signatures verified
20:32:42:WU01:FS01:0x21:Folding@home GPU Core21 Folding@home Core
20:32:42:WU01:FS01:0x21:Version 0.0.17
20:32:53:WU01:FS01:0x21:Completed 0 out of 2500000 steps (0%)
20:32:53:WU01:FS01:0x21:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
20:33:58:WU01:FS01:0x21:Completed 25000 out of 2500000 steps (1%)
20:35:03:WU00:FS00:0xa4:Completed 5000 out of 250000 steps  (2%)
20:35:04:WU01:FS01:0x21:Completed 50000 out of 2500000 steps (2%)
20:36:10:WU01:FS01:0x21:Completed 75000 out of 2500000 steps (3%)
20:37:16:WU01:FS01:0x21:Completed 100000 out of 2500000 steps (4%)
20:37:17:WU01:FS01:0x21:ERROR:exception: Error downloading array energyBuffer: clEnqueueReadBuffer (-5)
20:37:17:WU01:FS01:0x21:Saving result file logfile_01.txt
20:37:17:WU01:FS01:0x21:Saving result file log.txt
20:37:17:WU01:FS01:0x21:Folding@home Core Shutdown: BAD_WORK_UNIT
20:37:19:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
20:37:19:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:9177 run:17 clone:15 gen:160 core:0x21 unit:0x0000010bab436c6957b24c29f65f352b
20:37:19:WU01:FS01:Uploading 8.50KiB to 171.67.108.105
20:37:19:WU01:FS01:Connecting to 171.67.108.105:8080
20:37:19:WU02:FS01:Connecting to 171.67.108.45:80
20:37:20:WU02:FS01:Assigned to work server 140.163.4.231
20:37:20:WU02:FS01:Requesting new work unit for slot 01: READY gpu:0:GM200 [GeForce GTX 980 Ti] from 140.163.4.231
20:37:20:WU02:FS01:Connecting to 140.163.4.231:8080
20:37:20:WU01:FS01:Upload complete
20:37:20:WU01:FS01:Server responded WORK_ACK (400)
20:37:20:WU01:FS01:Cleaning up
20:37:21:WU02:FS01:Downloading 16.73MiB
20:37:28:WU02:FS01:Download 1.87%
20:37:36:WU02:FS01:Download 3.74%
20:37:43:WU02:FS01:Download 4.86%
20:37:50:WU02:FS01:Download 6.35%
20:37:58:WU02:FS01:Download 7.47%
20:38:04:WU02:FS01:Download 8.22%
20:38:10:WU02:FS01:Download 9.72%
20:38:18:WU02:FS01:Download 11.21%
20:38:25:WU02:FS01:Download 12.33%
20:38:31:WU02:FS01:Download 13.83%
20:38:38:WU02:FS01:Download 14.95%
20:38:44:WU02:FS01:Download 15.69%
20:38:51:WU02:FS01:Download 17.19%
20:38:57:WU02:FS01:Download 19.06%
20:39:04:WU02:FS01:Download 20.93%
20:39:07:WU00:FS00:0xa4:Completed 7500 out of 250000 steps  (3%)
20:39:10:WU02:FS01:Download 23.17%
20:39:16:WU02:FS01:Download 26.16%
20:39:22:WU02:FS01:Download 28.40%
20:39:28:WU02:FS01:Download 32.14%
20:39:34:WU02:FS01:Download 33.63%
20:39:40:WU02:FS01:Download 35.50%
20:39:46:WU02:FS01:Download 36.99%
20:39:53:WU02:FS01:Download 38.49%
20:39:59:WU02:FS01:Download 39.98%
Aurum
Posts: 296
Joined: Sat Oct 03, 2015 3:15 pm
Location: The Great Basin

Re: Bad work units on GPU slot

Post by Aurum »

foldy wrote:Did the GTX 980 Ti fail in Heaven Benchmark while you were folding? In general you should not do folding and heavy gaming on the same GPU concurrently.
The GTX 980 Ti failed Unigine Heaven Benchmark 4.0 after a reboot without loading F@H, as I had deleted its shortcut from my Startup folder. The RX 480 was already folding and I didn't really want to stop it and lose all the work back to some random checkpoint. I've never played one of those video games, so I got curious and stopped F@H anyway to see how smoothly my RX 480 ran. One thing that's dramatically different between the AMD cards and the only, albeit defective, Nvidia card I've ever tried is that the AMD cards do not slow F@H down much even if an intense program is running simultaneously. The GTX 980 Ti slowed to a crawl when another program ran, to about 20% of the estimated PPD it showed before the second program started.

I DLed the 3.9 GB Furmark 3DMark-v2-2-3509.zip yesterday but it took over 5 hours. I'll install it and try it today.
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Bad work units on GPU slot

Post by bruce »

Aurum wrote:The GTX 980 Ti slowed to a crawl when another program runs, about 20% of the estimated PPD as compared to before running the second program.
How did you determine PPD? Specifically, how long did you wait while running the other program for the reported PPD to stabilize?

Did you record average frame times during those tests?

Was your monitor being driven by the AMD card or by the NVidia card?
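On the frame-time question: the estimated PPD shown in FAHControl jumps around after each percent, so an average time per 1% "frame" taken from the log is a steadier number to compare. A minimal sketch, assuming the "Completed N out of M steps (x%)" lines shown in this thread and a log excerpt that covers only one work unit for the chosen slot (the function name is hypothetical):

```python
import re
from datetime import datetime

# Matches core progress lines such as:
#   06:38:37:WU00:FS01:0x18:Completed 20000 out of 2000000 steps (1%)
PROGRESS_RE = re.compile(
    r"^(\d\d:\d\d:\d\d):WU\d+:FS(\d+):0x[0-9a-f]+:"
    r"Completed (\d+) out of (\d+) steps\s+\((\d+)%\)"
)


def average_frame_time(log_text, slot):
    """Mean seconds per 1% of progress for one slot (e.g. "01").

    Uses the first and last progress lines found for that slot.
    Returns None if there is not enough progress to measure.
    """
    points = []
    for raw in log_text.splitlines():
        m = PROGRESS_RE.match(raw.strip())
        if m and m.group(2) == slot:
            stamp = datetime.strptime(m.group(1), "%H:%M:%S")
            points.append((stamp, int(m.group(5))))
    if len(points) < 2:
        return None
    (t0, p0), (t1, p1) = points[0], points[-1]
    frames = p1 - p0
    if frames <= 0:
        return None
    elapsed = (t1 - t0).total_seconds()
    if elapsed < 0:
        elapsed += 24 * 3600  # the log has no dates; assume one midnight rollover
    return elapsed / frames
```

Running it once on a log section recorded with only F@H active, and once on a section recorded with the second program running, gives two averages that can be compared directly.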
Aurum
Posts: 296
Joined: Sat Oct 03, 2015 3:15 pm
Location: The Great Basin

Re: Bad work units on GPU slot

Post by Aurum »

bruce wrote:
Aurum wrote:The GTX 980 Ti slowed to a crawl when another program runs, about 20% of the estimated PPD as compared to before running the second program.
How did you determine PPD? Specifically, how long did you wait while running the other program for the reported PPD to stabilize?
I watched the Estimated PPD & TPF on the FAHControl/Status/Selected Work Unit panel. They changed dramatically after each 1% was completed. It never made it more than a few percent into a WU before the GTX 980 Ti crashed my PC. If you're interested I can test it the way you'd like after my 1070 arrives.
bruce wrote:Did you record average frame times during those tests?
Nope, just anecdotal empirical observation that repeated several times.
bruce wrote:Was your monitor being driven by the AMD card or by the NVidia card?
Two different PCs: one with an RX 480 driving its monitor and the other with a GTX 980 Ti driving its monitor.
toTOW
Site Moderator
Posts: 6296
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France
Contact:

Re: Bad work units on NV GPU slot

Post by toTOW »

Did you use the AMD Cleanup tool to remove the previous driver (and more specifically, the AMD OpenCL platform that remains to support CPU OpenCL)? This might be conflicting with the NV OpenCL platform.

After running the AMD Cleanup tool, you'll have to reinstall the NV drivers, because the AMD tool will get rid of all installed OpenCL platforms.
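One way to see which OpenCL platforms the client currently detects is to read the System section of the client log. The log posted earlier in this thread lists three OpenCL devices on two platforms (Platform:0 and Platform:1), and the core is started with -opencl-platform 1. The sketch below groups those lines by platform; it assumes the "OpenCL Device n: Platform:p Device:d ..." layout from that log, and the function name is invented for illustration. It does not tell you which vendor owns a platform, only how many are registered.

```python
import re
from collections import defaultdict

# Matches System-section lines such as:
#   OpenCL Device 2: Platform:1 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:376.48
OPENCL_RE = re.compile(
    r"OpenCL Device (\d+): Platform:(\d+) Device:(\d+) "
    r"Bus:(\S+) Slot:(\S+) Compute:(\S+) Driver:(\S+)"
)


def opencl_platforms(log_text):
    """Group the OpenCL devices listed in a client log by platform index."""
    platforms = defaultdict(list)
    for m in OPENCL_RE.finditer(log_text):
        platforms[int(m.group(2))].append(
            {
                "device": int(m.group(3)),
                "bus": m.group(4),
                "slot": m.group(5),
                "driver": m.group(7),
            }
        )
    return dict(platforms)
```

If more than one platform shows up after a cleanup and reinstall, something other than the NV driver is still registering OpenCL devices on that machine.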

I'm afraid that your used (which is important here) 980 Ti might be broken. Failing to run 3D apps/benchmarks is not a good sign. Do you see the fans spinning? Did you look at the GPU sensors with GPU-Z while running your tests? Is everything normal before crashing?

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
Aurum
Posts: 296
Joined: Sat Oct 03, 2015 3:15 pm
Location: The Great Basin

Re: Bad work units on NV GPU slot

Post by Aurum »

No, I used Display Driver Uninstaller (DDU 17.0.4.2) to remove the AMD drivers. DDU seems to work fine for AMD drivers, but my problem may have been using it to remove Nvidia drivers. I believe my original mistake may have been to install the latest Nvidia driver (lower than the 376.48 hotfix version). Then I uninstalled it using DDU and not the Nvidia Uninstall routine in the Win7 Programs and Features. After I finally tried Nvidia's Uninstall utility and then reinstalled 376.48, it worked. I've since moved the 980 Ti to a much better MB (MSI 890FXA-GD70) and added two 1070s. They're all in x16 slots and I'm updating the BIOS etc. The two new 1070s worked plug-and-play, as expected.