Client crashes constantly running GPU load on Radeon VII

It seems that a lot of GPU problems revolve around specific versions of drivers. Though AMD has their own support structure, you can often learn from information reported by others who fold.

Moderators: Site Moderators, FAHC Science Team

Caboose
Posts: 7
Joined: Thu Mar 19, 2020 12:14 am

Client crashes constantly running GPU load on Radeon VII

Post by Caboose »

It's pretty straightforward: if I turn on GPU folding, the machine crashes somewhere between 5 and 15 minutes in, every time. No bluescreen, just straight to black. I admittedly had a slightly underpowered power supply, but even replacing it hasn't helped. Driver changes, turning off programs, changing BIOS settings, moving RAM around: nothing helps. It can CPU fold for literal days, nothing else crashes, the RAM tests good, and nothing is overheating. Setup is:

Gigabyte X470 Gaming 7 Rev. 1.1 MB
G.Skill Platinum Silver 3600 RAM @ 32GB
Ryzen 7 1700X
Radeon VII with v106 firmware
Latest Win10

Latest log below. It doesn't seem like it's generating logs when it crashes though, because I don't see where it says I turned the GPU on, but I definitely did.

*********************** Log Started 2020-03-26T01:27:08Z ***********************
01:27:08:************************* Folding@home Client *************************
01:27:08:        Website: https://foldingathome.org/
01:27:08:      Copyright: (c) 2009-2018 foldingathome.org
01:27:08:         Author: Joseph Coffland <joseph@cauldrondevelopment.com>
01:27:08:           Args: --open-web-control
01:27:08:         Config: C:\Users\Caboose\AppData\Roaming\FAHClient\config.xml
01:27:08:******************************** Build ********************************
01:27:08:        Version: 7.5.1
01:27:08:           Date: May 11 2018
01:27:08:           Time: 13:06:32
01:27:08:     Repository: Git
01:27:08:       Revision: 4705bf53c635f88b8fe85af7675557e15d491ff0
01:27:08:         Branch: master
01:27:08:       Compiler: Visual C++ 2008
01:27:08:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
01:27:08:       Platform: win32 10
01:27:08:           Bits: 32
01:27:08:           Mode: Release
01:27:08:******************************* System ********************************
01:27:08:            CPU: AMD Ryzen 7 1700X Eight-Core Processor
01:27:08:         CPU ID: AuthenticAMD Family 23 Model 1 Stepping 1
01:27:08:           CPUs: 16
01:27:08:         Memory: 31.95GiB
01:27:08:    Free Memory: 26.77GiB
01:27:08:        Threads: WINDOWS_THREADS
01:27:08:     OS Version: 6.2
01:27:08:    Has Battery: false
01:27:08:     On Battery: false
01:27:08:     UTC Offset: -4
01:27:08:            PID: 16168
01:27:08:            CWD: C:\Users\Caboose\AppData\Roaming\FAHClient
01:27:08:             OS: Windows 10 Enterprise
01:27:08:        OS Arch: AMD64
01:27:08:           GPUs: 1
01:27:08:          GPU 0: Bus:12 Slot:0 Func:0 AMD:5 Vega 20 [Radeon VII]
01:27:08:           CUDA: Not detected: Failed to open dynamic library 'nvcuda.dll': The
01:27:08:                 specified module could not be found.
01:27:08:
01:27:08:OpenCL Device 0: Platform:0 Device:0 Bus:12 Slot:0 Compute:1.2 Driver:3004.8
01:27:08:  Win32 Service: false
01:27:08:***********************************************************************
01:27:08:<config>
01:27:08:  <!-- User Information -->
01:27:08:  <user v='Caboose'/>
01:27:08:
01:27:08:  <!-- Folding Slots -->
01:27:08:  <slot id='0' type='CPU'/>
01:27:08:  <slot id='1' type='GPU'>
01:27:08:    <paused v='true'/>
01:27:08:  </slot>
01:27:08:</config>
01:27:08:Trying to access database...
01:27:08:Successfully acquired database lock
01:27:08:Enabled folding slot 00: READY cpu:14
01:27:08:Enabled folding slot 01: PAUSED gpu:0:Vega 20 [Radeon VII] (by user)
01:27:08:WU00:FS00:Starting
01:27:08:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\Caboose\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/avx/Core_a7.fah/FahCore_a7.exe -dir 00 -suffix 01 -version 705 -lifeline 16168 -checkpoint 15 -np 14
01:27:08:WU00:FS00:Started FahCore on PID 1876
01:27:08:WU00:FS00:Core PID:7276
01:27:08:WU00:FS00:FahCore 0xa7 started
01:27:08:WU00:FS00:0xa7:*********************** Log Started 2020-03-26T01:27:08Z ***********************
01:27:08:WU00:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
01:27:08:WU00:FS00:0xa7:       Type: 0xa7
01:27:08:WU00:FS00:0xa7:       Core: Gromacs
01:27:08:WU00:FS00:0xa7:       Args: -dir 00 -suffix 01 -version 705 -lifeline 1876 -checkpoint 15 -np
01:27:08:WU00:FS00:0xa7:             14
01:27:08:WU00:FS00:0xa7:************************************ CBang *************************************
01:27:08:WU00:FS00:0xa7:       Date: Oct 26 2019
01:27:08:WU00:FS00:0xa7:       Time: 01:38:25
01:27:08:WU00:FS00:0xa7:   Revision: c46a1a011a24143739ac7218c5a435f66777f62f
01:27:08:WU00:FS00:0xa7:     Branch: master
01:27:08:WU00:FS00:0xa7:   Compiler: Visual C++ 2008
01:27:09:WU00:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
01:27:09:WU00:FS00:0xa7:   Platform: win32 10
01:27:09:WU00:FS00:0xa7:       Bits: 64
01:27:09:WU00:FS00:0xa7:       Mode: Release
01:27:09:WU00:FS00:0xa7:************************************ System ************************************
01:27:09:WU00:FS00:0xa7:        CPU: AMD Ryzen 7 1700X Eight-Core Processor
01:27:09:WU00:FS00:0xa7:     CPU ID: AuthenticAMD Family 23 Model 1 Stepping 1
01:27:09:WU00:FS00:0xa7:       CPUs: 16
01:27:09:WU00:FS00:0xa7:     Memory: 31.95GiB
01:27:09:WU00:FS00:0xa7:Free Memory: 26.74GiB
01:27:09:WU00:FS00:0xa7:    Threads: WINDOWS_THREADS
01:27:09:WU00:FS00:0xa7: OS Version: 6.2
01:27:09:WU00:FS00:0xa7:Has Battery: false
01:27:09:WU00:FS00:0xa7: On Battery: false
01:27:09:WU00:FS00:0xa7: UTC Offset: -4
01:27:09:WU00:FS00:0xa7:        PID: 7276
01:27:09:WU00:FS00:0xa7:        CWD: C:\Users\Caboose\AppData\Roaming\FAHClient\work
01:27:09:WU00:FS00:0xa7:******************************** Build - libFAH ********************************
01:27:09:WU00:FS00:0xa7:    Version: 0.0.18
01:27:09:WU00:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
01:27:09:WU00:FS00:0xa7:  Copyright: 2019 foldingathome.org
01:27:09:WU00:FS00:0xa7:   Homepage: https://foldingathome.org/
01:27:09:WU00:FS00:0xa7:       Date: Oct 26 2019
01:27:09:WU00:FS00:0xa7:       Time: 01:52:30
01:27:09:WU00:FS00:0xa7:   Revision: c1e3513b1bc0c16013668f2173ee969e5995b38e
01:27:09:WU00:FS00:0xa7:     Branch: master
01:27:09:WU00:FS00:0xa7:   Compiler: Visual C++ 2008
01:27:09:WU00:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
01:27:09:WU00:FS00:0xa7:   Platform: win32 10
01:27:09:WU00:FS00:0xa7:       Bits: 64
01:27:09:WU00:FS00:0xa7:       Mode: Release
01:27:09:WU00:FS00:0xa7:************************************ Build *************************************
01:27:09:WU00:FS00:0xa7:       SIMD: avx_256
01:27:09:WU00:FS00:0xa7:********************************************************************************
01:27:09:WU00:FS00:0xa7:Project: 14311 (Run 17, Clone 11, Gen 26)
01:27:09:WU00:FS00:0xa7:Unit: 0x0000001f0002894b5df2b5d25711a2dd
01:27:09:WU00:FS00:0xa7:Digital signatures verified
01:27:09:WU00:FS00:0xa7:Reducing thread count from 14 to 13 to avoid domain decomposition with large prime factor 7
01:27:09:WU00:FS00:0xa7:Reducing thread count from 13 to 12 to avoid domain decomposition by a prime number > 3
01:27:09:WU00:FS00:0xa7:Calling: mdrun -s frame26.tpr -o frame26.trr -cpi state.cpt -cpt 15 -nt 12
01:27:09:WU00:FS00:0xa7:Steps: first=13000000 total=500000
01:27:10:WU00:FS00:0xa7:Completed 292276 out of 500000 steps (58%)
01:27:16:8:127.0.0.1:New Web connection
01:27:27:FS00:Finishing
01:28:42:WU00:FS00:0xa7:Completed 295000 out of 500000 steps (59%)
01:29:30:FS00:Shutting core down
01:29:30:WU00:FS00:0xa7:WARNING:Console control signal 1 on PID 7276
01:29:30:WU00:FS00:0xa7:Exiting, please wait. . .
01:29:31:Clean exit
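
One way to check whether the GPU slot ever logged anything is to group the log lines by slot tag. The following is only a rough sketch, assuming slot activity always appears with an FSnn tag (optionally preceded by WUnn) as it does in the log above. The function name slot_events and the shortened sample are made up for illustration and are not part of the FAH client.

```python
import re

# Matches client log lines such as "01:27:08:WU00:FS00:Starting"
# or "01:27:27:FS00:Finishing" (format assumed from the log above).
SLOT_LINE = re.compile(r"^\d{2}:\d{2}:\d{2}:(?:WU\d+:)?FS(\d+):(.*)$")

def slot_events(log_text):
    """Group log messages by folding-slot number (FS00 -> 0, FS01 -> 1, ...)."""
    events = {}
    for line in log_text.splitlines():
        match = SLOT_LINE.match(line.strip())
        if match:
            events.setdefault(int(match.group(1)), []).append(match.group(2))
    return events

# Shortened sample taken from the log above.
sample = """\
01:27:08:Enabled folding slot 01: PAUSED gpu:0:Vega 20 [Radeon VII] (by user)
01:27:08:WU00:FS00:Starting
01:27:27:FS00:Finishing
01:29:30:FS00:Shutting core down
"""

events = slot_events(sample)
print(sorted(events))  # slot numbers that logged any activity
print(1 in events)     # did GPU slot 01 log any activity?
```

On this sample only slot 0 (the CPU slot) shows up, which matches the impression that the log never recorded the GPU slot being started.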
Caboose
Posts: 7
Joined: Thu Mar 19, 2020 12:14 am

Re: Client crashes constantly running GPU load on Radeon VII

Post by Caboose »

Quick update: If I tell it to wait for idle, GPU folding doesn't appear to crash it. The second the monitors come on, though, it's back to the same behavior if I tell it to fold anyway.
EXT64
Posts: 323
Joined: Mon Apr 09, 2012 11:54 pm

Re: Client crashes constantly running GPU load on Radeon VII

Post by EXT64 »

Sorry if I missed it, but what AMD drivers do you have and what GPU voltage/clocks/RAM are you at (if not stock)?
Caboose
Posts: 7
Joined: Thu Mar 19, 2020 12:14 am

Re: Client crashes constantly running GPU load on Radeon VII

Post by Caboose »

I have (currently) the latest at 20.3.1, and it does it stock, overclocked, underclocked, doesn't matter. It did it with older drivers as well as the brand new ones. I even put a new power supply in this thing.
Fodir
Posts: 2
Joined: Sun Mar 15, 2020 10:44 pm

Re: Client crashes constantly running GPU load on Radeon VII

Post by Fodir »

My Radeon VII did the same until I set the fans to max speed in the Radeon control panel.
I had to switch to manual control and raise the fan curve so the fans go to 100% speed. By default it's on automatic and never goes over 75% fan speed, which made my card overheat and the PC reset itself when the card reached 90 degrees.
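
To illustrate the idea of a manual fan curve, here is a small sketch that interpolates fan speed between curve points. It is not the Radeon software's API. The curve points are hypothetical, temperatures are assumed to be in degrees Celsius, and the only goal is to show a curve that reaches 100% before the 90-degree point mentioned above.

```python
def fan_percent(temp_c, curve):
    """Linearly interpolate fan speed (%) from (temp_c, percent) points sorted by temperature."""
    if temp_c <= curve[0][0]:
        return curve[0][1]
    for (t0, p0), (t1, p1) in zip(curve, curve[1:]):
        if temp_c <= t1:
            return p0 + (p1 - p0) * (temp_c - t0) / (t1 - t0)
    # Above the last point: hold the final fan speed.
    return curve[-1][1]

# Hypothetical manual curve: fans hit 100% at 85 C, below the 90-degree reset point.
MANUAL_CURVE = [(40, 30), (60, 50), (75, 80), (85, 100)]

for temp in (50, 80, 90):
    print(temp, fan_percent(temp, MANUAL_CURVE))
```

The actual points you can set depend on the control panel, so treat the numbers as placeholders.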
Rainmaker
Posts: 15
Joined: Sat Mar 28, 2020 5:59 pm
Hardware configuration: AMD Threadripper 3960X underclocked/undervolted to 3.7GHz 1.0125v
Asus RoG Strix TRX40-E Gaming motherboard
32GB DDR4 3600MHz 16,16,16,36 (Samsung B-Die)
Sapphire Vega 56 overclocked/undervolted to 1662MHz 1070mV
Samsung Evo Plus 1TB NVMe SSD
Linux FTW

Re: Client crashes constantly running GPU load on Radeon VII

Post by Rainmaker »

Caboose wrote:I have (currently) the latest at 20.3.1, and it does it stock, overclocked, underclocked, doesn't matter. It did it with older drivers as well as the brand new ones. I even put a new power supply in this thing.
Do you mean 20.4.1? The release notes list a black screen/freeze bug when running FaH and hardware-accelerated video at the same time. I posted a thread about it just now, after I almost updated my Vega 56 driver to the same version and held off for that very reason.
Caboose
Posts: 7
Joined: Thu Mar 19, 2020 12:14 am

Re: Client crashes constantly running GPU load on Radeon VII

Post by Caboose »

No, 20.3.1 - I'm going to assume the same bug exists here however.