Constant freezes while GPU folding


ajm
Posts: 754
Joined: Sat Mar 21, 2020 5:22 am
Location: Lucerne, Switzerland

Constant freezes while GPU folding

Post by ajm »

I ran into this problem once or twice before, but now it is close to unbearable. I can't type 20 characters in a row without a freeze.

Code:

*********************** Log Started 2020-04-08T09:22:50Z ***********************
09:22:50:************************* Folding@home Client *************************
09:22:50:        Website: https://foldingathome.org/
09:22:50:      Copyright: (c) 2009-2018 foldingathome.org
09:22:50:         Author: Joseph Coffland <joseph@cauldrondevelopment.com>
09:22:50:           Args: --open-web-control
09:22:50:         Config: <none>
09:22:50:******************************** Build ********************************
09:22:50:        Version: 7.5.1
09:22:50:           Date: May 11 2018
09:22:50:           Time: 13:06:32
09:22:50:     Repository: Git
09:22:50:       Revision: 4705bf53c635f88b8fe85af7675557e15d491ff0
09:22:50:         Branch: master
09:22:50:       Compiler: Visual C++ 2008
09:22:50:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
09:22:50:       Platform: win32 10
09:22:50:           Bits: 32
09:22:50:           Mode: Release
09:22:50:******************************* System ********************************
09:22:50:            CPU: AMD Ryzen Threadripper 3970X 32-Core Processor
09:22:50:         CPU ID: AuthenticAMD Family 23 Model 49 Stepping 0
09:22:50:           CPUs: 32
09:22:50:         Memory: 127.88GiB
09:22:50:    Free Memory: 54.97GiB
09:22:50:        Threads: WINDOWS_THREADS
09:22:50:     OS Version: 6.2
09:22:50:    Has Battery: true
09:22:50:     On Battery: false
09:22:50:     UTC Offset: 2
09:22:50:            PID: 27244
09:22:50:            CWD: C:\Users\AJM\AppData\Roaming\FAHClient
09:22:50:             OS: Windows 10 Enterprise
09:22:50:        OS Arch: AMD64
09:22:50:           GPUs: 0
09:22:50:           CUDA: Not detected: cuInit() returned 999
09:22:50:OpenCL Device 0: Platform:0 Device:0 Bus:3 Slot:0 Compute:1.2 Driver:3004.8
09:22:50:  Win32 Service: false
09:22:50:***********************************************************************
09:22:50:<config>
09:22:50:  <!-- Folding Slots -->
09:22:50:</config>
09:22:50:Connecting to assign1.foldingathome.org:8080
09:22:50:Updated GPUs.txt
The GPU is a 5700 XT on PCIe Gen 4 (like the motherboard). During folding, the GPU is stuck at 99% load. GPU-Z shows one solid red block, and the computer is totally unresponsive.
foldy
Posts: 2061
Joined: Sat Dec 01, 2012 3:43 pm
Hardware configuration: Folding@Home Client 7.6.13 (1 GPU slots)
Windows 7 64bit
Intel Core i5 2500k@4Ghz
Nvidia gtx 1080ti driver 441

Re: Constant freezes while GPU folding

Post by foldy »

Maybe try to disable internet browser gpu hardware acceleration, so it does not compete with FAH.
ajm
Posts: 754
Joined: Sat Mar 21, 2020 5:22 am
Location: Lucerne, Switzerland

Re: Constant freezes while GPU folding

Post by ajm »

Aaaah, it's over at long last. Actually, it looks like I had a WU stuck at 99.99%. According to the log, it was delivered and acknowledged, but it still showed up as work in progress anyway. I tried to pause FAH, and I got this:

Image

So I stopped FAH (quit), then uninstalled and reinstalled it. But after that my GPU is still stuck at 99%:

Image

Don't know what she's doing. And I can't really make sense of the log (the part concerning WU 11762 (0, 8956, 2)):

Code:

10:08:04:WU02:FS01:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:11762 run:0 clone:8956 gen:2 core:0x22 unit:0x0000000880fccb0a5e7113c4724be7e2
10:08:04:WU02:FS01:Starting
10:08:04:WU02:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\AJM\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/Core_22.fah/FahCore_22.exe -dir 02 -suffix 01 -version 705 -lifeline 27244 -checkpoint 15 -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
10:08:04:WU02:FS01:Started FahCore on PID 12536
10:08:04:WU02:FS01:Core PID:8564
10:08:04:WU02:FS01:FahCore 0x22 started
10:08:05:WU02:FS01:0x22:*********************** Log Started 2020-04-08T10:08:05Z ***********************
10:08:05:WU02:FS01:0x22:*************************** Core22 Folding@home Core ***************************
10:08:05:WU02:FS01:0x22:       Type: 0x22
10:08:05:WU02:FS01:0x22:       Core: Core22
10:08:05:WU02:FS01:0x22:    Website: https://foldingathome.org/
10:08:05:WU02:FS01:0x22:  Copyright: (c) 2009-2018 foldingathome.org
10:08:05:WU02:FS01:0x22:     Author: John Chodera <john.chodera@choderalab.org> and Rafal Wiewiora
10:08:05:WU02:FS01:0x22:             <rafal.wiewiora@choderalab.org>
10:08:05:WU02:FS01:0x22:       Args: -dir 02 -suffix 01 -version 705 -lifeline 12536 -checkpoint 15
10:08:05:WU02:FS01:0x22:             -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
10:08:05:WU02:FS01:0x22:     Config: <none>
10:08:05:WU02:FS01:0x22:************************************ Build *************************************
10:08:05:WU02:FS01:0x22:    Version: 0.0.2
10:08:05:WU02:FS01:0x22:       Date: Dec 6 2019
10:08:05:WU02:FS01:0x22:       Time: 21:30:31
10:08:05:WU02:FS01:0x22: Repository: Git
10:08:05:WU02:FS01:0x22:   Revision: abeb39247cc72df5af0f63723edafadb23d5dfbe
10:08:05:WU02:FS01:0x22:     Branch: HEAD
10:08:05:WU02:FS01:0x22:   Compiler: Visual C++ 2008
10:08:05:WU02:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
10:08:05:WU02:FS01:0x22:   Platform: win32 10
10:08:05:WU02:FS01:0x22:       Bits: 64
10:08:05:WU02:FS01:0x22:       Mode: Release
10:08:05:WU02:FS01:0x22:************************************ System ************************************
10:08:05:WU02:FS01:0x22:        CPU: AMD Ryzen Threadripper 3970X 32-Core Processor
10:08:05:WU02:FS01:0x22:     CPU ID: AuthenticAMD Family 23 Model 49 Stepping 0
10:08:05:WU02:FS01:0x22:       CPUs: 64
10:08:05:WU02:FS01:0x22:     Memory: 127.88GiB
10:08:05:WU02:FS01:0x22:Free Memory: 56.25GiB
10:08:05:WU02:FS01:0x22:    Threads: WINDOWS_THREADS
10:08:05:WU02:FS01:0x22: OS Version: 6.2
10:08:05:WU02:FS01:0x22:Has Battery: true
10:08:05:WU02:FS01:0x22: On Battery: false
10:08:05:WU02:FS01:0x22: UTC Offset: 2
10:08:05:WU02:FS01:0x22:        PID: 8564
10:08:05:WU02:FS01:0x22:        CWD: C:\Users\AJM\AppData\Roaming\FAHClient\work
10:08:05:WU02:FS01:0x22:         OS: Windows 10 Pro
10:08:05:WU02:FS01:0x22:    OS Arch: AMD64
10:08:05:WU02:FS01:0x22:********************************************************************************
10:08:05:WU02:FS01:0x22:Project: 11762 (Run 0, Clone 8956, Gen 2)
10:08:05:WU02:FS01:0x22:Unit: 0x0000000880fccb0a5e7113c4724be7e2
10:08:05:WU02:FS01:0x22:Reading tar file core.xml
10:08:05:WU02:FS01:0x22:Reading tar file integrator.xml
10:08:05:WU02:FS01:0x22:Reading tar file state.xml
10:08:06:WU02:FS01:0x22:Reading tar file system.xml
10:08:06:WU02:FS01:0x22:Digital signatures verified
10:08:06:WU02:FS01:0x22:Folding@home GPU Core22 Folding@home Core
10:08:06:WU02:FS01:0x22:Version 0.0.2
10:08:06:WU00:FS00:Connecting to 65.254.110.245:8080
10:08:07:WARNING:WU00:FS00:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
10:08:07:WU00:FS00:Connecting to 18.218.241.186:80
10:08:07:WARNING:WU00:FS00:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
10:08:07:ERROR:WU00:FS00:Exception: Could not get an assignment
10:08:10:WU01:FS01:Upload 11.76%
10:08:16:WU01:FS01:Upload 12.96%
10:08:22:WU01:FS01:Upload 14.32%
10:08:28:WU01:FS01:Upload 15.51%
10:08:34:WU01:FS01:Upload 16.88%
10:08:40:WU01:FS01:Upload 18.07%
10:08:46:WU01:FS01:Upload 19.43%
10:08:52:WU01:FS01:Upload 20.46%
10:08:58:WU01:FS01:Upload 21.65%
10:09:04:WU01:FS01:Upload 22.84%
10:09:10:WU01:FS01:Upload 24.38%
10:09:16:WU01:FS01:Upload 25.74%
10:09:22:WU01:FS01:Upload 26.94%
10:09:28:WU01:FS01:Upload 28.30%
10:09:34:WU01:FS01:Upload 29.49%
10:09:40:WU01:FS01:Upload 30.69%
10:09:46:WU01:FS01:Upload 32.05%
10:09:52:WU01:FS01:Upload 33.24%
10:09:58:WU01:FS01:Upload 34.61%
10:10:02:WU02:FS01:0x22:Completed 0 out of 1000000 steps (0%)
10:10:02:WU02:FS01:0x22:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
10:10:05:WU01:FS01:Upload 36.14%
10:10:12:WU01:FS01:Upload 37.33%
10:10:18:WU01:FS01:Upload 38.19%
10:10:24:WU01:FS01:Upload 39.55%
10:10:30:WU01:FS01:Upload 40.91%
10:10:36:WU01:FS01:Upload 42.45%
10:10:42:WU01:FS01:Upload 43.81%
10:10:43:WU00:FS00:Connecting to 65.254.110.245:8080
10:10:44:WARNING:WU00:FS00:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
10:10:44:WU00:FS00:Connecting to 18.218.241.186:80
10:10:44:WARNING:WU00:FS00:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
10:10:44:ERROR:WU00:FS00:Exception: Could not get an assignment
10:10:48:WU01:FS01:Upload 45.18%
10:10:54:WU01:FS01:Upload 46.54%
10:11:00:WU01:FS01:Upload 47.90%
10:11:01:WU02:FS01:0x22:Completed 10000 out of 1000000 steps (1%)
10:11:06:WU01:FS01:Upload 49.27%
10:11:12:WU01:FS01:Upload 50.63%
10:11:19:WU01:FS01:Upload 52.00%
10:11:25:WU01:FS01:Upload 53.36%
10:11:31:WU01:FS01:Upload 54.55%
10:11:37:WU01:FS01:Upload 56.09%
10:11:43:WU01:FS01:Upload 57.28%
10:11:49:WU01:FS01:Upload 58.64%
10:11:56:WU01:FS01:Upload 59.50%
10:11:56:WU02:FS01:0x22:Completed 20000 out of 1000000 steps (2%)
10:12:02:WU01:FS01:Upload 60.69%
10:12:08:WU01:FS01:Upload 62.39%
10:12:15:WU01:FS01:Upload 63.76%
10:12:21:WU01:FS01:Upload 64.78%
10:12:27:WU01:FS01:Upload 66.14%
10:12:33:WU01:FS01:Upload 67.68%
10:12:39:WU01:FS01:Upload 68.87%
10:12:45:WU01:FS01:Upload 70.07%
10:12:51:WU01:FS01:Upload 71.60%
10:12:52:WU02:FS01:0x22:Completed 30000 out of 1000000 steps (3%)
10:12:57:WU01:FS01:Upload 73.30%
10:13:03:WU01:FS01:Upload 74.84%
10:13:09:WU01:FS01:Upload 76.37%
10:13:15:WU01:FS01:Upload 77.74%
10:13:21:WU01:FS01:Upload 78.93%
10:13:27:WU01:FS01:Upload 80.12%
10:13:33:WU01:FS01:Upload 81.32%
10:13:39:WU01:FS01:Upload 82.68%
10:13:45:WU01:FS01:Upload 83.87%
10:13:47:WU02:FS01:0x22:Completed 40000 out of 1000000 steps (4%)
10:13:51:WU01:FS01:Upload 84.90%
10:13:57:WU01:FS01:Upload 86.09%
10:14:03:WU01:FS01:Upload 87.11%
10:14:09:WU01:FS01:Upload 88.31%
10:14:15:WU01:FS01:Upload 89.33%
10:14:21:WU01:FS01:Upload 90.35%
10:14:27:WU01:FS01:Upload 91.89%
10:14:33:WU01:FS01:Upload 93.42%
10:14:39:WU01:FS01:Upload 94.78%
10:14:43:WU02:FS01:0x22:Completed 50000 out of 1000000 steps (5%)
10:14:45:WU01:FS01:Upload 96.32%
10:14:51:WU01:FS01:Upload 97.85%
10:14:57:WU01:FS01:Upload 99.39%
10:14:58:WU00:FS00:Connecting to 65.254.110.245:8080
10:14:58:WARNING:WU00:FS00:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
10:14:58:WU00:FS00:Connecting to 18.218.241.186:80
10:14:59:WARNING:WU00:FS00:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
10:14:59:ERROR:WU00:FS00:Exception: Could not get an assignment
10:15:04:WU01:FS01:Upload complete
10:15:04:WU01:FS01:Server responded WORK_ACK (400)
10:15:04:WU01:FS01:Cleaning up
10:15:43:WU02:FS01:0x22:Completed 60000 out of 1000000 steps (6%)
10:16:39:WU02:FS01:0x22:Completed 70000 out of 1000000 steps (7%)
10:17:34:WU02:FS01:0x22:Completed 80000 out of 1000000 steps (8%)
10:21:49:WU00:FS00:Connecting to 65.254.110.245:8080
10:21:49:WU00:FS00:Assigned to work server 37.187.12.48
10:21:49:WU00:FS00:Requesting new work unit for slot 00: READY cpu:32 from 37.187.12.48
10:21:49:WU00:FS00:Connecting to 37.187.12.48:8080
10:21:49:ERROR:WU00:FS00:Exception: Server did not assign work unit
10:28:20:WARNING:WU02:FS01:FahCore returned an unknown error code which probably indicates that it crashed
10:28:20:WARNING:WU02:FS01:FahCore returned: WU_STALLED (127 = 0x7f)
10:28:20:WU02:FS01:Starting
10:28:20:WU02:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\AJM\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/Core_22.fah/FahCore_22.exe -dir 02 -suffix 01 -version 705 -lifeline 27244 -checkpoint 15 -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
10:28:20:WU02:FS01:Started FahCore on PID 35104
10:28:20:WU02:FS01:Core PID:10208
10:28:20:WU02:FS01:FahCore 0x22 started
10:28:21:WU02:FS01:0x22:*********************** Log Started 2020-04-08T10:28:21Z ***********************
10:28:21:WU02:FS01:0x22:*************************** Core22 Folding@home Core ***************************
10:28:21:WU02:FS01:0x22:       Type: 0x22
10:28:21:WU02:FS01:0x22:       Core: Core22
10:28:21:WU02:FS01:0x22:    Website: https://foldingathome.org/
10:28:21:WU02:FS01:0x22:  Copyright: (c) 2009-2018 foldingathome.org
10:28:21:WU02:FS01:0x22:     Author: John Chodera <john.chodera@choderalab.org> and Rafal Wiewiora
10:28:21:WU02:FS01:0x22:             <rafal.wiewiora@choderalab.org>
10:28:21:WU02:FS01:0x22:       Args: -dir 02 -suffix 01 -version 705 -lifeline 35104 -checkpoint 15
10:28:21:WU02:FS01:0x22:             -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
10:28:21:WU02:FS01:0x22:     Config: <none>
10:28:21:WU02:FS01:0x22:************************************ Build *************************************
10:28:21:WU02:FS01:0x22:    Version: 0.0.2
10:28:21:WU02:FS01:0x22:       Date: Dec 6 2019
10:28:21:WU02:FS01:0x22:       Time: 21:30:31
10:28:21:WU02:FS01:0x22: Repository: Git
10:28:21:WU02:FS01:0x22:   Revision: abeb39247cc72df5af0f63723edafadb23d5dfbe
10:28:21:WU02:FS01:0x22:     Branch: HEAD
10:28:21:WU02:FS01:0x22:   Compiler: Visual C++ 2008
10:28:21:WU02:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
10:28:21:WU02:FS01:0x22:   Platform: win32 10
10:28:21:WU02:FS01:0x22:       Bits: 64
10:28:21:WU02:FS01:0x22:       Mode: Release
10:28:21:WU02:FS01:0x22:************************************ System ************************************
10:28:21:WU02:FS01:0x22:        CPU: AMD Ryzen Threadripper 3970X 32-Core Processor
10:28:21:WU02:FS01:0x22:     CPU ID: AuthenticAMD Family 23 Model 49 Stepping 0
10:28:21:WU02:FS01:0x22:       CPUs: 64
10:28:21:WU02:FS01:0x22:     Memory: 127.88GiB
10:28:21:WU02:FS01:0x22:Free Memory: 55.59GiB
10:28:21:WU02:FS01:0x22:    Threads: WINDOWS_THREADS
10:28:21:WU02:FS01:0x22: OS Version: 6.2
10:28:21:WU02:FS01:0x22:Has Battery: true
10:28:21:WU02:FS01:0x22: On Battery: false
10:28:21:WU02:FS01:0x22: UTC Offset: 2
10:28:21:WU02:FS01:0x22:        PID: 10208
10:28:21:WU02:FS01:0x22:        CWD: C:\Users\AJM\AppData\Roaming\FAHClient\work
10:28:21:WU02:FS01:0x22:         OS: Windows 10 Pro
10:28:21:WU02:FS01:0x22:    OS Arch: AMD64
10:28:21:WU02:FS01:0x22:********************************************************************************
10:28:21:WU02:FS01:0x22:Project: 11762 (Run 0, Clone 8956, Gen 2)


Since the new installation, the GPU appears "Ready" in FAHControl, but is fully active in GPU-Z.
Any idea, apart from rebooting?
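
For anyone who wants to check their own log for the same pattern, here is a minimal sketch that measures how long the core stayed silent before the client reported WU_STALLED. It only assumes the two line shapes visible in the excerpt above (the "Completed N out of M steps" lines and the "FahCore returned:" warning); the function name `find_stalls` and the tuple layout are made up for illustration, and other client versions may format these lines differently.

```python
import re
from datetime import datetime, timedelta

# Line shapes copied from the log excerpt above; other client versions may differ.
PROGRESS = re.compile(
    r"^(\d\d:\d\d:\d\d):(WU\d+):(FS\d+):0x[0-9a-f]+:Completed (\d+) out of \d+ steps"
)
RETURNED = re.compile(
    r"^(\d\d:\d\d:\d\d):WARNING:(WU\d+):(FS\d+):FahCore returned: (\w+)"
)


def find_stalls(lines, code="WU_STALLED"):
    """Return (wu, slot, last_steps, silent_seconds) for each matching core exit."""
    last_progress = {}  # (wu, slot) -> (time of last progress line, steps done)
    stalls = []
    for line in lines:
        m = PROGRESS.match(line)
        if m:
            when, wu, slot, done = m.groups()
            last_progress[(wu, slot)] = (datetime.strptime(when, "%H:%M:%S"), int(done))
            continue
        m = RETURNED.match(line)
        if m and m.group(4) == code and (m.group(2), m.group(3)) in last_progress:
            seen_at, steps = last_progress[(m.group(2), m.group(3))]
            gap = datetime.strptime(m.group(1), "%H:%M:%S") - seen_at
            if gap < timedelta(0):
                gap += timedelta(days=1)  # log timestamps carry no date
            stalls.append((m.group(2), m.group(3), steps, int(gap.total_seconds())))
    return stalls


# Three lines taken verbatim from the log above.
sample = [
    "10:17:34:WU02:FS01:0x22:Completed 80000 out of 1000000 steps (8%)",
    "10:28:20:WARNING:WU02:FS01:FahCore returned an unknown error code which probably indicates that it crashed",
    "10:28:20:WARNING:WU02:FS01:FahCore returned: WU_STALLED (127 = 0x7f)",
]
for wu, slot, steps, seconds in find_stalls(sample):
    print(f"{wu}:{slot} stalled {seconds} s after reaching {steps} steps")
# prints "WU02:FS01 stalled 646 s after reaching 80000 steps"
```

In the log above, each 1% took a bit under a minute until 8%, then nothing for almost eleven minutes, which matches the GPU sitting at 99% while no work gets done. To run it on a whole file, pass an open file object instead of `sample`; the location of the log file on a given machine is an assumption to check locally.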
ajm
Posts: 754
Joined: Sat Mar 21, 2020 5:22 am
Location: Lucerne, Switzerland

Re: Constant freezes while GPU folding

Post by ajm »

foldy wrote: Maybe try to disable internet browser gpu hardware acceleration, so it does not compete with FAH.
Thanks! But the GPU stays at 99% even without a browser open, and the freezing disappeared. I'm at a loss...

EDIT: And I DID disable the hardware acceleration in my three browsers. No change.
bronozoj
Posts: 6
Joined: Sun Mar 15, 2020 2:23 pm

Re: Constant freezes while GPU folding

Post by bronozoj »

You could try reinstalling graphics drivers. Some GPU problems appear when the driver does not update properly.

https://www.wagnardsoft.com/content/dis ... 3-released

This is a popular tool for removing all GPU drivers from Windows, as the usual uninstallers can leave residual files behind that could corrupt a reinstall.
ajm
Posts: 754
Joined: Sat Mar 21, 2020 5:22 am
Location: Lucerne, Switzerland

Re: Constant freezes while GPU folding

Post by ajm »

Thank you bronozoj!
I downloaded DDU. In the meantime, I had restarted, uninstalled FAH, then AMD's software (with Revo Uninstaller), restarted again, and reinstalled the AMD drivers and FAH. I now have a job on the CPU and another one on the GPU, albeit at a very slow pace on the latter this time. The log shows more "ERROR:Receive error: 997: Overlapped I/O operation is in progress" than anything else. We'll see.

EDIT: Aaaand the freezes are back and the gpu is again stuck at a solid 99%.
Joe_H
Site Admin
Posts: 7868
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Constant freezes while GPU folding

Post by Joe_H »

Which driver version are you using? There are reported issues with some recent ones; the known issues listed in their readme include problems using F@h and using the GPU with other software.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
ajm
Posts: 754
Joined: Sat Mar 21, 2020 5:22 am
Location: Lucerne, Switzerland

Re: Constant freezes while GPU folding

Post by ajm »

The very latest available. I always download from AMD or NVIDIA, even if I have the version from the day before on disk.
But in the meantime I gave up, or rather I changed my mind. That card is obviously too problematic in this configuration. She was perfect on the Intel system, together with the 1080 ti, never a glitch. So I'll put them all together again in my FAH rig and for now I'll use older cards in my main system, waiting for something I know will work. A 3080 maybe :wink:

EDIT: That said, now, after getting rid of the AMD display drivers and with two small NVIDIA cards, I still get these:

Code:

18:57:53:ERROR:Receive error: 997: Overlapped I/O operation is in progress.
18:58:09:ERROR:Receive error: 997: Overlapped I/O operation is in progress.
18:58:13:ERROR:Receive error: 997: Overlapped I/O operation is in progress.
18:58:15:WU00:FS00:0xa7:Completed 75000 out of 250000 steps (30%)
18:58:21:ERROR:Receive error: 997: Overlapped I/O operation is in progress.
18:58:24:WU01:FS01:0x22:Completed 10000 out of 1000000 steps (1%)
But even as everything is folding, the computer (TRX40) stays smooth and responsive. As other folders already stated, the error seems to appear only when Advanced Control is open.
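
One way to test the "only when Advanced Control is open" observation is to tally the 997 lines per minute and compare the busy minutes with the times the window was actually open. This is only a rough sketch that assumes the error line looks exactly like the ones above; `errors_per_minute` is a made-up helper name, not part of any FAH tool.

```python
import re
from collections import Counter

# Matches the error line shape shown in the excerpt above and captures HH:MM.
ERROR_997 = re.compile(r"^(\d\d:\d\d):\d\d:ERROR:Receive error: 997:")


def errors_per_minute(lines):
    """Count 'Receive error: 997' log lines, grouped by HH:MM."""
    counts = Counter()
    for line in lines:
        m = ERROR_997.match(line)
        if m:
            counts[m.group(1)] += 1
    return counts


# Lines taken verbatim from the log above.
sample = [
    "18:57:53:ERROR:Receive error: 997: Overlapped I/O operation is in progress.",
    "18:58:09:ERROR:Receive error: 997: Overlapped I/O operation is in progress.",
    "18:58:13:ERROR:Receive error: 997: Overlapped I/O operation is in progress.",
    "18:58:15:WU00:FS00:0xa7:Completed 75000 out of 250000 steps (30%)",
    "18:58:21:ERROR:Receive error: 997: Overlapped I/O operation is in progress.",
]
for minute, count in sorted(errors_per_minute(sample).items()):
    print(minute, count)
```

If the minutes with errors line up with the periods when Advanced Control was open and the count drops to zero when it is closed, that would support the observation. It would not by itself show what causes the error.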
ajm
Posts: 754
Joined: Sat Mar 21, 2020 5:22 am
Location: Lucerne, Switzerland

Re: Constant freezes while GPU folding

Post by ajm »

For the record, I rebuilt my old system (X299, 5700 XT, 1080 ti) in another case and everything has been folding without a glitch or any other error than "Could not get an assignment" for several hours now.
All that is left from the problems is the code 997 on the TRX40 system and it definitely seems to appear only when Advanced Control is open.
foldy
Posts: 2061
Joined: Sat Dec 01, 2012 3:43 pm
Hardware configuration: Folding@Home Client 7.6.13 (1 GPU slots)
Windows 7 64bit
Intel Core i5 2500k@4Ghz
Nvidia gtx 1080ti driver 441

Re: Constant freezes while GPU folding

Post by foldy »

I would say you can ignore code 997; it does not hurt folding.