My GPU keeps failing workunits

Postby MG20 » Sat Apr 18, 2020 1:32 am

Hey, I'm new to folding. I've done a few CPU workloads so far and they've worked fine, but my GPU just keeps failing after only a couple percent.
GPU: Radeon HD 5870. I haven't manually overclocked it, but it might be slightly overclocked from the factory?
I think I'm running the latest drivers; they're the ones Windows 10 downloaded.
It always takes a few tries to get a work unit, but it eventually gets one and starts, runs for a bit, then just fails.

Here's the log. It only says it made it to 0%, but FAHControl showed a little over 1% before it failed.

*********************** Log Started 2020-04-17T15:31:44Z ***********************
******************************* Date: 2020-04-17 *******************************
23:41:39:FS01:Unpaused
23:41:39:WU01:FS01:Connecting to 65.254.110.245:8080
23:41:40:WARNING:WU01:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
23:41:40:WU01:FS01:Connecting to 18.218.241.186:80
23:41:40:WARNING:WU01:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
23:41:40:ERROR:WU01:FS01:Exception: Could not get an assignment
23:41:40:WU01:FS01:Connecting to 65.254.110.245:8080
23:41:41:WARNING:WU01:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
23:41:41:WU01:FS01:Connecting to 18.218.241.186:80
23:41:41:WARNING:WU01:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
23:41:41:ERROR:WU01:FS01:Exception: Could not get an assignment
23:42:40:WU01:FS01:Connecting to 65.254.110.245:8080
23:42:41:WARNING:WU01:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
23:42:41:WU01:FS01:Connecting to 18.218.241.186:80
23:42:41:WARNING:WU01:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
23:42:41:ERROR:WU01:FS01:Exception: Could not get an assignment
23:44:18:WU01:FS01:Connecting to 65.254.110.245:8080
23:44:18:WARNING:WU01:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
23:44:18:WU01:FS01:Connecting to 18.218.241.186:80
23:44:18:WARNING:WU01:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
23:44:18:ERROR:WU01:FS01:Exception: Could not get an assignment
23:46:55:WU01:FS01:Connecting to 65.254.110.245:8080
23:46:55:WARNING:WU01:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
23:46:55:WU01:FS01:Connecting to 18.218.241.186:80
23:46:55:WARNING:WU01:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
23:46:55:ERROR:WU01:FS01:Exception: Could not get an assignment
23:47:24:FS01:Paused
23:47:30:FS01:Unpaused
23:47:55:WU01:FS01:Connecting to 65.254.110.245:8080
23:47:55:WARNING:WU01:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
23:47:55:WU01:FS01:Connecting to 18.218.241.186:80
23:47:55:WARNING:WU01:FS01:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
23:47:55:ERROR:WU01:FS01:Exception: Could not get an assignment
23:49:32:WU01:FS01:Connecting to 65.254.110.245:8080
23:49:32:WARNING:WU01:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
23:49:32:WU01:FS01:Connecting to 18.218.241.186:80
23:49:33:WU01:FS01:Assigned to work server 128.252.203.10
23:49:33:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:Cypress [Radeon HD 5800/6800] from 128.252.203.10
23:49:33:WU01:FS01:Connecting to 128.252.203.10:8080
23:50:32:WU01:FS01:Downloading 51.20MiB
23:50:38:WU01:FS01:Download 2.69%
23:50:44:WU01:FS01:Download 5.01%
23:50:50:WU01:FS01:Download 8.18%
23:50:57:WU01:FS01:Download 11.60%
23:51:03:WU01:FS01:Download 14.89%
23:51:09:WU01:FS01:Download 19.17%
23:51:15:WU01:FS01:Download 21.85%
23:51:21:WU01:FS01:Download 24.42%
23:51:27:WU01:FS01:Download 28.20%
23:51:33:WU01:FS01:Download 31.37%
23:51:39:WU01:FS01:Download 34.30%
23:51:45:WU01:FS01:Download 36.75%
23:51:51:WU01:FS01:Download 38.94%
23:51:57:WU01:FS01:Download 41.39%
23:52:03:WU01:FS01:Download 44.07%
23:52:09:WU01:FS01:Download 47.25%
23:52:15:WU01:FS01:Download 50.05%
23:52:21:WU01:FS01:Download 53.35%
23:52:27:WU01:FS01:Download 56.28%
23:52:33:WU01:FS01:Download 59.70%
23:52:39:WU01:FS01:Download 63.36%
23:52:45:WU01:FS01:Download 65.80%
23:52:51:WU01:FS01:Download 68.49%
23:52:57:WU01:FS01:Download 72.39%
23:53:03:WU01:FS01:Download 75.81%
23:53:09:WU01:FS01:Download 78.38%
23:53:15:WU01:FS01:Download 81.43%
23:53:21:WU01:FS01:Download 84.36%
23:53:27:WU01:FS01:Download 86.80%
23:53:33:WU01:FS01:Download 89.61%
23:53:39:WU01:FS01:Download 92.66%
23:53:45:WU01:FS01:Download 97.18%
23:53:51:WU01:FS01:Download 99.86%
23:53:51:WU01:FS01:Download complete
23:53:51:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:11763 run:0 clone:5506 gen:44 core:0x22 unit:0x0000005380fccb0a5e71137b20156547
23:53:51:WU01:FS01:Starting
23:53:51:WU01:FS01:Running FahCore: G:\FoldingAtHome\FAHClient/FAHCoreWrapper.exe G:\FoldingAtHome\Data\cores/cores.foldingathome.org/v7/win/64bit/Core_22.fah/FahCore_22.exe -dir 01 -suffix 01 -version 705 -lifeline 5448 -checkpoint 15 -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
23:53:51:WU01:FS01:Started FahCore on PID 228
23:53:51:WU01:FS01:Core PID:2732
23:53:51:WU01:FS01:FahCore 0x22 started
23:53:52:WU01:FS01:0x22:*********************** Log Started 2020-04-17T23:53:52Z ***********************
23:53:52:WU01:FS01:0x22:*************************** Core22 Folding@home Core ***************************
23:53:52:WU01:FS01:0x22:       Type: 0x22
23:53:52:WU01:FS01:0x22:       Core: Core22
23:53:52:WU01:FS01:0x22:    Website: https://foldingathome.org/
23:53:52:WU01:FS01:0x22:  Copyright: (c) 2009-2018 foldingathome.org
23:53:52:WU01:FS01:0x22:     Author: John Chodera <john.chodera@choderalab.org> and Rafal Wiewiora
23:53:52:WU01:FS01:0x22:             <rafal.wiewiora@choderalab.org>
23:53:52:WU01:FS01:0x22:       Args: -dir 01 -suffix 01 -version 705 -lifeline 228 -checkpoint 15
23:53:52:WU01:FS01:0x22:             -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
23:53:52:WU01:FS01:0x22:     Config: <none>
23:53:52:WU01:FS01:0x22:************************************ Build *************************************
23:53:52:WU01:FS01:0x22:    Version: 0.0.2
23:53:52:WU01:FS01:0x22:       Date: Dec 6 2019
23:53:52:WU01:FS01:0x22:       Time: 21:30:31
23:53:52:WU01:FS01:0x22: Repository: Git
23:53:52:WU01:FS01:0x22:   Revision: abeb39247cc72df5af0f63723edafadb23d5dfbe
23:53:52:WU01:FS01:0x22:     Branch: HEAD
23:53:52:WU01:FS01:0x22:   Compiler: Visual C++ 2008
23:53:52:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
23:53:52:WU01:FS01:0x22:   Platform: win32 10
23:53:52:WU01:FS01:0x22:       Bits: 64
23:53:52:WU01:FS01:0x22:       Mode: Release
23:53:52:WU01:FS01:0x22:************************************ System ************************************
23:53:52:WU01:FS01:0x22:        CPU: AMD FX(tm)-6350 Six-Core Processor
23:53:52:WU01:FS01:0x22:     CPU ID: AuthenticAMD Family 21 Model 2 Stepping 0
23:53:52:WU01:FS01:0x22:       CPUs: 6
23:53:52:WU01:FS01:0x22:     Memory: 7.90GiB
23:53:52:WU01:FS01:0x22:Free Memory: 3.61GiB
23:53:52:WU01:FS01:0x22:    Threads: WINDOWS_THREADS
23:53:52:WU01:FS01:0x22: OS Version: 6.2
23:53:52:WU01:FS01:0x22:Has Battery: false
23:53:52:WU01:FS01:0x22: On Battery: false
23:53:52:WU01:FS01:0x22: UTC Offset: -4
23:53:52:WU01:FS01:0x22:        PID: 2732
23:53:52:WU01:FS01:0x22:        CWD: G:\FoldingAtHome\Data\work
23:53:52:WU01:FS01:0x22:         OS: Windows 10 Pro
23:53:52:WU01:FS01:0x22:    OS Arch: AMD64
23:53:52:WU01:FS01:0x22:********************************************************************************
23:53:52:WU01:FS01:0x22:Project: 11763 (Run 0, Clone 5506, Gen 44)
23:53:52:WU01:FS01:0x22:Unit: 0x0000005380fccb0a5e71137b20156547
23:53:52:WU01:FS01:0x22:Reading tar file core.xml
23:53:52:WU01:FS01:0x22:Reading tar file integrator.xml
23:53:52:WU01:FS01:0x22:Reading tar file state.xml
23:53:53:WU01:FS01:0x22:Reading tar file system.xml
23:53:54:WU01:FS01:0x22:Digital signatures verified
23:53:54:WU01:FS01:0x22:Folding@home GPU Core22 Folding@home Core
23:53:54:WU01:FS01:0x22:Version 0.0.2
23:54:21:WU01:FS01:0x22:Completed 0 out of 1000000 steps (0%)
23:54:21:WU01:FS01:0x22:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
00:00:08:FS01:Finishing
00:04:56:WU01:FS01:0x22:Bad State detected... attempting to resume from last good checkpoint. Is your system overclocked?
00:04:56:WU01:FS01:0x22:Following exception occured: Particle coordinate is nan
00:08:33:WU01:FS01:0x22:Bad State detected... attempting to resume from last good checkpoint. Is your system overclocked?
00:08:33:WU01:FS01:0x22:Following exception occured: Particle coordinate is nan
00:11:27:WU01:FS01:0x22:Bad State detected... attempting to resume from last good checkpoint. Is your system overclocked?
00:11:27:WU01:FS01:0x22:Following exception occured: Particle coordinate is nan
00:11:27:WU01:FS01:0x22:ERROR:114: Max Retries Reached
00:11:27:WU01:FS01:0x22:Saving result file ..\logfile_01.txt
00:11:27:WU01:FS01:0x22:Saving result file badstate-0.xml
00:11:27:WU01:FS01:0x22:Saving result file badstate-1.xml
00:11:27:WU01:FS01:0x22:Saving result file badstate-2.xml
00:11:27:WU01:FS01:0x22:Saving result file checkpt.crc
00:11:27:WU01:FS01:0x22:Saving result file science.log
00:11:27:WU01:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
00:11:28:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
00:11:28:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:11763 run:0 clone:5506 gen:44 core:0x22 unit:0x0000005380fccb0a5e71137b20156547
00:11:29:WU01:FS01:Uploading 35.86MiB to 128.252.203.10
00:11:29:WU01:FS01:Connecting to 128.252.203.10:8080
00:12:10:WU01:FS01:Upload 0.87%
00:12:16:WU01:FS01:Upload 2.96%
00:12:22:WU01:FS01:Upload 4.71%
00:12:28:WU01:FS01:Upload 6.80%
00:12:34:WU01:FS01:Upload 8.71%
00:12:40:WU01:FS01:Upload 10.46%
00:12:46:WU01:FS01:Upload 12.20%
00:12:52:WU01:FS01:Upload 14.12%
00:12:58:WU01:FS01:Upload 15.51%
00:13:04:WU01:FS01:Upload 17.25%
00:13:10:WU01:FS01:Upload 19.17%
00:13:16:WU01:FS01:Upload 20.91%
00:13:22:WU01:FS01:Upload 22.66%
00:13:28:WU01:FS01:Upload 24.57%
00:13:34:WU01:FS01:Upload 25.62%
00:13:40:WU01:FS01:Upload 27.54%
00:13:46:WU01:FS01:Upload 29.28%
00:13:52:WU01:FS01:Upload 31.02%
00:13:58:WU01:FS01:Upload 32.77%
00:14:04:WU01:FS01:Upload 34.68%
00:14:10:WU01:FS01:Upload 36.43%
00:14:16:WU01:FS01:Upload 38.34%
00:14:22:WU01:FS01:Upload 40.09%
00:14:28:WU01:FS01:Upload 41.83%
00:14:34:WU01:FS01:Upload 43.75%
00:14:40:WU01:FS01:Upload 45.49%
00:14:46:WU01:FS01:Upload 47.23%
00:14:52:WU01:FS01:Upload 49.15%
00:14:58:WU01:FS01:Upload 50.89%
00:15:04:WU01:FS01:Upload 52.81%
00:15:10:WU01:FS01:Upload 54.55%
00:15:16:WU01:FS01:Upload 56.47%
00:15:22:WU01:FS01:Upload 58.21%
00:15:28:WU01:FS01:Upload 59.96%
00:15:34:WU01:FS01:Upload 61.87%
00:15:40:WU01:FS01:Upload 63.62%
00:15:46:WU01:FS01:Upload 65.36%
00:15:52:WU01:FS01:Upload 67.28%
00:15:58:WU01:FS01:Upload 68.32%
00:16:04:WU01:FS01:Upload 69.89%
00:16:10:WU01:FS01:Upload 71.81%
00:16:16:WU01:FS01:Upload 73.55%
00:16:24:WU01:FS01:Upload 75.29%
00:16:30:WU01:FS01:Upload 77.38%
00:16:36:WU01:FS01:Upload 79.30%
00:16:42:WU01:FS01:Upload 81.04%
00:16:48:WU01:FS01:Upload 82.79%
00:16:54:WU01:FS01:Upload 84.53%
00:17:00:WU01:FS01:Upload 86.45%
00:17:06:WU01:FS01:Upload 88.02%
00:17:12:WU01:FS01:Upload 89.93%
00:17:18:WU01:FS01:Upload 91.68%
00:17:24:WU01:FS01:Upload 93.42%
00:17:30:WU01:FS01:Upload 95.34%
00:17:36:WU01:FS01:Upload 96.90%
00:17:42:WU01:FS01:Upload 98.65%
00:17:48:FS01:Paused
00:17:49:WU01:FS01:Upload complete
00:17:49:WU01:FS01:Server responded WORK_ACK (400)
00:17:49:WU01:FS01:Cleaning up
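
A long client log like this is easier to read once it is reduced to a few counts. The following is a minimal Python sketch, not part of the FAH client: the function name `summarize_log` and the three patterns are assumptions based only on the wording of the lines quoted above, and other client or core versions may phrase them differently.

```python
import re

# Patterns copied from the wording of the log lines above (assumed, not an official format).
NO_ASSIGNMENT = re.compile(r"Could not get an assignment")
BAD_STATE = re.compile(r"Bad State detected")
CORE_RETURN = re.compile(r"FahCore returned: (\w+) \((\d+) = 0x[0-9a-fA-F]+\)")


def summarize_log(lines):
    """Count assignment failures and bad states, and collect core return codes."""
    summary = {"no_assignment": 0, "bad_state": 0, "core_returns": []}
    for line in lines:
        if NO_ASSIGNMENT.search(line):
            summary["no_assignment"] += 1
        if BAD_STATE.search(line):
            summary["bad_state"] += 1
        match = CORE_RETURN.search(line)
        if match:
            # Store the symbolic name and the decimal code, e.g. ("BAD_WORK_UNIT", 114).
            summary["core_returns"].append((match.group(1), int(match.group(2))))
    return summary


# A few lines taken from the log above, used as sample input.
sample = [
    "23:41:40:ERROR:WU01:FS01:Exception: Could not get an assignment",
    "00:04:56:WU01:FS01:0x22:Bad State detected... attempting to resume from last good checkpoint. Is your system overclocked?",
    "00:08:33:WU01:FS01:0x22:Bad State detected... attempting to resume from last good checkpoint. Is your system overclocked?",
    "00:11:27:WU01:FS01:0x22:Bad State detected... attempting to resume from last good checkpoint. Is your system overclocked?",
    "00:11:28:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)",
]

print(summarize_log(sample))
```

In this log the pattern that matters is the last one: three bad-state retries followed by a single BAD_WORK_UNIT (114) return, which points at the GPU computation itself and not at the earlier assignment failures.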
MG20
 
Posts: 2
Joined: Sat Apr 18, 2020 1:22 am

Re: My GPU keeps failing workunits

Postby PantherX » Sat Apr 18, 2020 1:39 am

Welcome to the F@H Forum MG20,

I would suggest that you reduce the GPU frequencies to the stock AMD values using MSI Afterburner or EVGA Precision X1. Once that's done, see if the GPU is stable or not.
PantherX
Site Moderator
 
Posts: 6605
Joined: Wed Dec 23, 2009 10:33 am
Location: Land Of The Long White Cloud

Re: My GPU keeps failing workunits

Postby MG20 » Sat Apr 18, 2020 5:01 am

PantherX wrote:Welcome to the F@H Forum MG20,

I would suggest that you reduce the GPU frequencies to the stock AMD values using MSI Afterburner or EVGA Precision X1. Once that's done, see if the GPU is stable or not.

I turned it down from the default 870 to the stock AMD 850, and it still failed with the same error. Also, changing the clock speeds seems to cause artifacts on my screen.
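
For scale, the factory overclock described here is small. A quick arithmetic sketch, assuming both figures are core clocks in MHz (the post gives no unit) and using illustrative variable names:

```python
# Values quoted in the post; the MHz unit is an assumption.
factory_clock = 870  # default clock the card shipped with
stock_clock = 850    # stock AMD clock it was turned down to

# Relative overclock of the factory setting over the stock setting, in percent.
overclock_pct = (factory_clock - stock_clock) / stock_clock * 100
print(f"Factory overclock: {overclock_pct:.1f}% above stock")
```

That works out to roughly 2.4%, so a failure that persists at 850 suggests the clock offset alone is unlikely to be the whole story.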
MG20
 
Posts: 2
Joined: Sat Apr 18, 2020 1:22 am

Re: My GPU keeps failing workunits

Postby PantherX » Sat Apr 18, 2020 5:12 am

MG20 wrote:...Also, changing the clock speeds seems to cause artifacts on my screen.

That's not a good sign at all :( It could mean a voltage issue with your GPU or something else with it.

Until you can either figure it out or replace it, I would suggest that you temporarily stop folding on the GPU to prevent WU duplication, since you currently have an unstable GPU.
PantherX
Site Moderator
 
Posts: 6605
Joined: Wed Dec 23, 2009 10:33 am
Location: Land Of The Long White Cloud

Re: My GPU keeps failing workunits

Postby Rel25917 » Sat Apr 18, 2020 6:34 am

I would try drivers from the manufacturer. If it still fails it could be a bad card.
Rel25917
 
Posts: 299
Joined: Wed Aug 15, 2012 3:31 am

Re: My GPU keeps failing workunits

Postby Joe_H » Sat Apr 18, 2020 6:57 am

My experience with 5870s when they were current was that they tended to run hot and needed effective cooling. The heat transfer pads or compound also tended to dry out over time, so a 9-10 year old card might need reapplication to stay cool.

Downclocking the video RAM may also help, and will not affect folding speed by much at all.

If you are seeing video artifacts, that is pointing towards a 5870 that is overheating.
Joe_H
Site Admin
 
Posts: 6547
Joined: Tue Apr 21, 2009 5:41 pm
Location: W. MA

Re: My GPU keeps failing workunits

Postby Jan » Sat Apr 18, 2020 11:52 am

Try a tool such as GPU-Z to track your temps. It's quite accurate.
Jan
 
Posts: 80
Joined: Tue Mar 31, 2020 7:46 pm

Re: My GPU keeps failing workunits

Postby Houd.ini » Tue Apr 21, 2020 1:18 pm

I just wanted to chip in that I have experienced the same thing on two similar cards.
I have two HD 5850s, and they both kept failing intermittently. Neither card got particularly hot: 75 and 80 C max. The fans on the cards weren't running particularly fast either (around 1300-1400 rpm), so there was more cooling available.
I tried pulling one card out and swapping them around, but the remaining one still kept failing. It's been a few weeks since I stopped folding on them, but I think (cannot say for sure though) they were more prone to failing on the latest Win 10 driver available on AMD's website than on the one Win 10 auto-installed. When they folded, their estimated PPD was around 25k. I think they uploaded finished WUs.
They weren't pulling much power either. I have a power meter at the wall, and a rig with the two 5850s and a stock Core i5 3570K, one SSD, and a Z77 motherboard only pulled 250 W at the wall with both cards and the CPU folding. The cards have 150 W TDPs, and the CPU only pulls around 50 W folding on two of the four available threads. I have a spare rig and parts available, and I am thinking of water cooling the cards with leftover parts from the old days of mining Bitcoins; that should settle the temperature suspicion at least.
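
The power figures above can be checked with simple arithmetic. This sketch uses only the numbers quoted in the post; the variable names are illustrative.

```python
# Figures quoted in the post, in watts.
CARD_TDP_W = 150       # rated TDP per HD 5850
NUM_CARDS = 2
CPU_DRAW_W = 50        # approximate CPU draw while folding on two threads
MEASURED_WALL_W = 250  # reading from the power meter at the wall

# Nominal budget if both cards ran at their full TDP alongside the CPU.
nominal_w = NUM_CARDS * CARD_TDP_W + CPU_DRAW_W
headroom_w = nominal_w - MEASURED_WALL_W
share = MEASURED_WALL_W / nominal_w

print(f"Nominal: {nominal_w} W, measured: {MEASURED_WALL_W} W")
print(f"Measured draw is {share:.0%} of nominal ({headroom_w} W below it)")
```

The wall reading also includes power supply losses and the rest of the system, so the cards themselves draw less than the 250 W figure alone suggests.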
Houd.ini
 
Posts: 10
Joined: Sat Apr 11, 2020 1:26 am

Re: My GPU keeps failing workunits

Postby Joe_H » Wed Apr 22, 2020 1:16 am

One issue you can run into with the "latest" driver: AMD dropped OpenCL support for some of the older cards that the driver otherwise still supports. You may need to go back to an older version that still supports OpenCL on the TeraScale-based 5800 series. I also came across a report that the same thing happens with the TeraScale-based 6900 series cards.
Joe_H
Site Admin
 
Posts: 6547
Joined: Tue Apr 21, 2009 5:41 pm
Location: W. MA

Re: My GPU keeps failing workunits

Postby PantherX » Wed Apr 22, 2020 7:50 am

Welcome to the F@H Forum Houd.ini,

When it comes to folding problems, it would be nice to have the log file to see what's going on. If you need guidance, here's a topic explaining how to post a log file: viewtopic.php?f=24&t=26036
PantherX
Site Moderator
 
Posts: 6605
Joined: Wed Dec 23, 2009 10:33 am
Location: Land Of The Long White Cloud

