Work unit stuck at 76% with GPU

It seems that a lot of GPU problems revolve around specific versions of drivers. Though NVidia has their own support structure, you can often learn from information reported by others who fold.

Moderators: Site Moderators, FAHC Science Team

andreassen51
Posts: 13
Joined: Wed Mar 11, 2020 7:06 pm
Hardware configuration: Intel 8700, 64Gb RAM, Asus RTX 2070, Noctua cooler
Location: Norway

Work unit stuck at 76% with GPU

Post by andreassen51 »

Hello, my GPU unit stopped working at 79% while I turned away for a minute.

The card is barely more than a year old (RTX 2070). I've done three reboots and tried X-Plane for half an hour. The hard drive isn't full; btw I have four of them, an SSD and an NVMe, so I'm not too bored.

What's boring is getting the GPU to start from 79%. Perhaps I'd better erase all the files and get it working from scratch again?

I've tried to attach my Log.txt but I can't find a button (like the paper clip).

Regards.
Neil-B
Posts: 2027
Joined: Sun Mar 22, 2020 5:52 pm
Hardware configuration: 1: 2x Xeon E5-2697v3@2.60GHz, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent 20H2, Quadro K420 1GB, FAH 7.6.21
2: Xeon E3-1505Mv5@2.80GHz, 32GB DDR4, NVME, Win10 Pro 20H2, Quadro M1000M 2GB, FAH 7.6.21 (actually have two of these)
3: i7-960@3.20GHz, 12GB DDR3, SSD, Win10 Pro 20H2, GTX 750Ti 2GB, GTX 1080Ti 11GB, FAH 7.6.21
Location: UK

Re: Work unit stuck at 76% with GPU

Post by Neil-B »

viewtopic.php?f=24&t=26036 may help with posting logs
andreassen51

Re: Work unit stuck at 76% with GPU

Post by andreassen51 »

*********************** Log Started 2020-04-11T15:09:26Z ***********************
15:09:26:************************* Folding@home Client *************************
15:09:26:        Website: https://foldingathome.org/
15:09:26:      Copyright: (c) 2009-2018 foldingathome.org
15:09:26:         Author: Joseph Coffland <joseph@cauldrondevelopment.com>
15:09:26:           Args: 
15:09:26:         Config: /home/rogera2/config.xml
15:09:26:******************************** Build ********************************
15:09:26:        Version: 7.5.1
15:09:26:           Date: May 12 2018
15:09:26:           Time: 22:51:07
15:09:26:     Repository: Git
15:09:26:       Revision: 4705bf53c635f88b8fe85af7675557e15d491ff0
15:09:26:         Branch: master
15:09:26:       Compiler: GNU 4.4.7 20120313 (Red Hat 4.4.7-18)
15:09:26:        Options: -std=gnu++98 -O3 -funroll-loops
15:09:26:       Platform: linux2 4.14.0-3-amd64
15:09:26:           Bits: 64
15:09:26:           Mode: Release
15:09:26:******************************* System ********************************
15:09:26:            CPU: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
15:09:26:         CPU ID: GenuineIntel Family 6 Model 158 Stepping 10
15:09:26:           CPUs: 12
15:09:26:         Memory: 62.84GiB
15:09:26:    Free Memory: 7.08GiB
15:09:26:        Threads: POSIX_THREADS
15:09:26:     OS Version: 4.19
15:09:26:    Has Battery: false
15:09:26:     On Battery: false
15:09:26:     UTC Offset: 2
15:09:26:            PID: 13458
15:09:26:            CWD: /home/rogera2
15:09:26:             OS: Linux 4.19.97-gentoo x86_64
15:09:26:        OS Arch: AMD64
15:09:26:           GPUs: 1
15:09:26:          GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:7 TU106 [GeForce RTX 2070 Rev. A] M
15:09:26:                 7465
15:09:26:  CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:7.5 Driver:10.2
15:09:26:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:440.64
15:09:26:***********************************************************************
15:09:26:<config>
15:09:26:  <!-- Folding Core -->
15:09:26:  <gpu-usage v='80'/>
15:09:26:
15:09:26:  <!-- Slot Control -->
15:09:26:  <power v='LIGHT'/>
15:09:26:
15:09:26:  <!-- User Information -->
15:09:26:  <passkey v='********************************'/>
15:09:26:  <team v='55186'/>
15:09:26:  <user v='andreassen3256'/>
15:09:26:
15:09:26:  <!-- Folding Slots -->
15:09:26:  <slot id='0' type='CPU'>
15:09:26:    <cpus v='8'/>
15:09:26:    <paused v='true'/>
15:09:26:  </slot>
15:09:26:  <slot id='1' type='CPU'>
15:09:26:    <cpus v='4'/>
15:09:26:    <paused v='true'/>
15:09:26:  </slot>
15:09:26:  <slot id='2' type='GPU'>
15:09:26:    <paused v='true'/>
15:09:26:  </slot>
15:09:26:</config>
15:09:26:Trying to access database...
15:09:26:Successfully acquired database lock
15:09:26:Enabled folding slot 00: PAUSED cpu:8 (by user)
15:09:26:Enabled folding slot 01: PAUSED cpu:4 (by user)
15:09:26:Enabled folding slot 02: PAUSED gpu:0:TU106 [GeForce RTX 2070 Rev. A] M 7465 (by user)
15:09:38:7:127.0.0.1:New Web connection
15:09:47:FS00:Unpaused
15:09:47:FS01:Unpaused
15:09:47:FS02:Unpaused
15:09:47:WU02:FS00:Starting
15:09:47:WU02:FS00:Running FahCore: /opt/foldingathome/FAHCoreWrapper /home/rogera2/cores/cores.foldingathome.org/v7/lin/64bit/avx/Core_a7.fah/FahCore_a7 -dir 02 -suffix 01 -version 705 -lifeline 13458 -checkpoint 15 -np 8
15:09:47:WU02:FS00:Started FahCore on PID 13489
15:09:47:WU02:FS00:Core PID:13493
15:09:47:WU02:FS00:FahCore 0xa7 started
15:09:48:WU00:FS01:Starting
15:09:48:WU00:FS01:Running FahCore: /opt/foldingathome/FAHCoreWrapper /home/rogera2/cores/cores.foldingathome.org/v7/lin/64bit/avx/Core_a7.fah/FahCore_a7 -dir 00 -suffix 01 -version 705 -lifeline 13458 -checkpoint 15 -np 4
15:09:48:WU00:FS01:Started FahCore on PID 13504
15:09:48:WU00:FS01:Core PID:13508
15:09:48:WU00:FS01:FahCore 0xa7 started
15:09:48:WU02:FS00:0xa7:*********************** Log Started 2020-04-11T15:09:47Z ***********************
15:09:48:WU02:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
15:09:48:WU02:FS00:0xa7:       Type: 0xa7
15:09:48:WU02:FS00:0xa7:       Core: Gromacs
15:09:48:WU02:FS00:0xa7:       Args: -dir 02 -suffix 01 -version 705 -lifeline 13489 -checkpoint 15 -np
15:09:48:WU02:FS00:0xa7:             8
15:09:48:WU02:FS00:0xa7:************************************ CBang *************************************
15:09:48:WU02:FS00:0xa7:       Date: Nov 5 2019
15:09:48:WU02:FS00:0xa7:       Time: 06:06:57
15:09:48:WU02:FS00:0xa7:   Revision: 46c96f1aa8419571d83f3e63f9c99a0d602f6da9
15:09:48:WU02:FS00:0xa7:     Branch: master
15:09:48:WU02:FS00:0xa7:   Compiler: GNU 8.3.0
15:09:48:WU02:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie -fPIC
15:09:48:WU02:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
15:09:48:WU02:FS00:0xa7:       Bits: 64
15:09:48:WU02:FS00:0xa7:       Mode: Release
15:09:48:WU02:FS00:0xa7:************************************ System ************************************
15:09:48:WU02:FS00:0xa7:        CPU: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
15:09:48:WU02:FS00:0xa7:     CPU ID: GenuineIntel Family 6 Model 158 Stepping 10
15:09:48:WU02:FS00:0xa7:       CPUs: 12
15:09:48:WU02:FS00:0xa7:     Memory: 62.84GiB
15:09:48:WU02:FS00:0xa7:Free Memory: 6.74GiB
15:09:48:WU02:FS00:0xa7:    Threads: POSIX_THREADS
15:09:48:WU02:FS00:0xa7: OS Version: 4.19
15:09:48:WU02:FS00:0xa7:Has Battery: false
15:09:48:WU02:FS00:0xa7: On Battery: false
15:09:48:WU02:FS00:0xa7: UTC Offset: 2
15:09:48:WU02:FS00:0xa7:        PID: 13493
15:09:48:WU02:FS00:0xa7:        CWD: /mnt/c2/fgscenery/work
15:09:48:WU02:FS00:0xa7:******************************** Build - libFAH ********************************
15:09:48:WU02:FS00:0xa7:    Version: 0.0.18
15:09:48:WU02:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
15:09:48:WU02:FS00:0xa7:  Copyright: 2019 foldingathome.org
15:09:48:WU02:FS00:0xa7:   Homepage: https://foldingathome.org/
15:09:48:WU02:FS00:0xa7:       Date: Nov 5 2019
15:09:48:WU02:FS00:0xa7:       Time: 06:13:26
15:09:48:WU02:FS00:0xa7:   Revision: 490c9aa2957b725af319379424d5c5cb36efb656
15:09:48:WU02:FS00:0xa7:     Branch: master
15:09:48:WU02:FS00:0xa7:   Compiler: GNU 8.3.0
15:09:48:WU02:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie
15:09:48:WU02:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
15:09:48:WU02:FS00:0xa7:       Bits: 64
15:09:48:WU02:FS00:0xa7:       Mode: Release
15:09:48:WU02:FS00:0xa7:************************************ Build *************************************
15:09:48:WU02:FS00:0xa7:       SIMD: avx_256
15:09:48:WU02:FS00:0xa7:********************************************************************************
15:09:48:WU02:FS00:0xa7:Project: 14592 (Run 248, Clone 2, Gen 17)
15:09:48:WU02:FS00:0xa7:Unit: 0x000000120d5262775e7cee4c2b2018e4
15:09:48:WU02:FS00:0xa7:Digital signatures verified
15:09:48:WU02:FS00:0xa7:Calling: mdrun -s frame17.tpr -o frame17.trr -x frame17.xtc -cpi state.cpt -cpt 15 -nt 8
15:09:48:WU02:FS00:0xa7:Steps: first=8500000 total=500000
15:09:48:WU00:FS01:0xa7:*********************** Log Started 2020-04-11T15:09:48Z ***********************
15:09:48:WU00:FS01:0xa7:************************** Gromacs Folding@home Core ***************************
15:09:48:WU00:FS01:0xa7:       Type: 0xa7
15:09:48:WU00:FS01:0xa7:       Core: Gromacs
15:09:48:WU00:FS01:0xa7:       Args: -dir 00 -suffix 01 -version 705 -lifeline 13504 -checkpoint 15 -np
15:09:48:WU00:FS01:0xa7:             4
15:09:48:WU00:FS01:0xa7:************************************ CBang *************************************
15:09:48:WU00:FS01:0xa7:       Date: Nov 5 2019
15:09:48:WU00:FS01:0xa7:       Time: 06:06:57
15:09:48:WU00:FS01:0xa7:   Revision: 46c96f1aa8419571d83f3e63f9c99a0d602f6da9
15:09:48:WU00:FS01:0xa7:     Branch: master
15:09:48:WU00:FS01:0xa7:   Compiler: GNU 8.3.0
15:09:48:WU00:FS01:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie -fPIC
15:09:48:WU00:FS01:0xa7:   Platform: linux2 4.19.0-5-amd64
15:09:48:WU00:FS01:0xa7:       Bits: 64
15:09:48:WU00:FS01:0xa7:       Mode: Release
15:09:48:WU00:FS01:0xa7:************************************ System ************************************
15:09:48:WU00:FS01:0xa7:        CPU: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
15:09:48:WU00:FS01:0xa7:     CPU ID: GenuineIntel Family 6 Model 158 Stepping 10
15:09:48:WU00:FS01:0xa7:       CPUs: 12
15:09:48:WU00:FS01:0xa7:     Memory: 62.84GiB
15:09:48:WU00:FS01:0xa7:Free Memory: 6.73GiB
15:09:48:WU00:FS01:0xa7:    Threads: POSIX_THREADS
15:09:48:WU00:FS01:0xa7: OS Version: 4.19
15:09:48:WU00:FS01:0xa7:Has Battery: false
15:09:48:WU00:FS01:0xa7: On Battery: false
15:09:48:WU00:FS01:0xa7: UTC Offset: 2
15:09:48:WU00:FS01:0xa7:        PID: 13508
15:09:48:WU00:FS01:0xa7:        CWD: /mnt/c2/fgscenery/work
15:09:48:WU00:FS01:0xa7:******************************** Build - libFAH ********************************
15:09:48:WU00:FS01:0xa7:    Version: 0.0.18
15:09:48:WU00:FS01:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
15:09:48:WU00:FS01:0xa7:  Copyright: 2019 foldingathome.org
15:09:48:WU00:FS01:0xa7:   Homepage: https://foldingathome.org/
15:09:48:WU00:FS01:0xa7:       Date: Nov 5 2019
15:09:48:WU00:FS01:0xa7:       Time: 06:13:26
15:09:48:WU00:FS01:0xa7:   Revision: 490c9aa2957b725af319379424d5c5cb36efb656
15:09:48:WU00:FS01:0xa7:     Branch: master
15:09:48:WU00:FS01:0xa7:   Compiler: GNU 8.3.0
15:09:48:WU00:FS01:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie
15:09:48:WU00:FS01:0xa7:   Platform: linux2 4.19.0-5-amd64
15:09:48:WU00:FS01:0xa7:       Bits: 64
15:09:48:WU00:FS01:0xa7:       Mode: Release
15:09:48:WU00:FS01:0xa7:************************************ Build *************************************
15:09:48:WU00:FS01:0xa7:       SIMD: avx_256
15:09:48:WU00:FS01:0xa7:********************************************************************************
15:09:48:WU00:FS01:0xa7:Project: 16411 (Run 2958, Clone 3, Gen 19)
15:09:48:WU00:FS01:0xa7:Unit: 0x00000016a8f5c67d5e877201d4d6c5d3
15:09:48:WU00:FS01:0xa7:Digital signatures verified
15:09:48:WU00:FS01:0xa7:Calling: mdrun -s frame19.tpr -o frame19.trr -x frame19.xtc -cpi state.cpt -cpt 15 -nt 4
15:09:48:WU00:FS01:0xa7:Steps: first=4750000 total=250000
15:09:48:WU02:FS00:0xa7:Completed 376742 out of 500000 steps (75%)
15:09:49:WU00:FS01:0xa7:Completed 9612 out of 250000 steps (3%)
15:10:03:WU00:FS01:0xa7:Completed 10000 out of 250000 steps (4%)
15:10:27:Removing old file 'configs/config-20200409-174034.xml'
15:10:27:Saving configuration to config.xml
15:10:27:<config>
15:10:27:  <!-- Folding Core -->
15:10:27:  <gpu-usage v='80'/>
15:10:27:
15:10:27:  <!-- Slot Control -->
15:10:27:  <power v='LIGHT'/>
15:10:27:
15:10:27:  <!-- User Information -->
15:10:27:  <passkey v='********************************'/>
15:10:27:  <team v='55186'/>
15:10:27:  <user v='andreassen3256'/>
15:10:27:
15:10:27:  <!-- Folding Slots -->
15:10:27:  <slot id='0' type='CPU'>
15:10:27:    <cpus v='8'/>
15:10:27:  </slot>
15:10:27:  <slot id='1' type='CPU'>
15:10:27:    <cpus v='4'/>
15:10:27:  </slot>
15:10:27:  <slot id='2' type='GPU'/>
15:10:27:</config>
15:11:10:WU02:FS00:0xa7:Completed 380000 out of 500000 steps (76%)
15:11:49:WU00:FS01:0xa7:Completed 12500 out of 250000 steps (5%)
15:13:19:WU02:FS00:0xa7:Completed 385000 out of 500000 steps (77%)
15:13:39:WU00:FS01:0xa7:Completed 15000 out of 250000 steps (6%)
15:15:25:WU02:FS00:0xa7:Completed 390000 out of 500000 steps (78%)
15:15:27:WU00:FS01:0xa7:Completed 17500 out of 250000 steps (7%)
15:17:18:WU00:FS01:0xa7:Completed 20000 out of 250000 steps (8%)
15:17:33:WU02:FS00:0xa7:Completed 395000 out of 500000 steps (79%)
The rest of the log is gone, but the GPU is supposed to be WU01; the two CPU slots are running WU00 and WU02 here.
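
A minimal sketch (Python, standard library only) of one way to check which enabled slots never started a work unit in a log excerpt like the one above. The function name and the regular expressions are assumptions based only on the line formats visible in this log; other client versions may log differently.

```python
import re

# Matches lines such as "15:09:26:Enabled folding slot 02: PAUSED gpu:0:TU106 ..."
SLOT_RE = re.compile(r"Enabled folding slot (\d+): (?:\w+ )?(cpu|gpu)")
# Matches lines such as "15:09:47:WU02:FS00:Starting"
START_RE = re.compile(r":WU(\d+):FS(\d+):Starting")


def slots_without_work(log_text):
    """Return {slot_id: kind} for enabled slots that never logged 'Starting'."""
    enabled = {}
    started = set()
    for line in log_text.splitlines():
        m = SLOT_RE.search(line)
        if m:
            enabled[m.group(1)] = m.group(2)
            continue
        m = START_RE.search(line)
        if m:
            started.add(m.group(2))
    return {slot: kind for slot, kind in enabled.items() if slot not in started}


# Lines copied from the log above.
SAMPLE = """\
15:09:26:Enabled folding slot 00: PAUSED cpu:8 (by user)
15:09:26:Enabled folding slot 01: PAUSED cpu:4 (by user)
15:09:26:Enabled folding slot 02: PAUSED gpu:0:TU106 [GeForce RTX 2070 Rev. A] M 7465 (by user)
15:09:47:WU02:FS00:Starting
15:09:48:WU00:FS01:Starting
"""

print(slots_without_work(SAMPLE))
```

On this excerpt the only slot left over is the GPU slot (02), which matches the observation that no GPU work unit shows up in the part of the log that survived.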
andreassen51

Re: Work unit stuck at 76% with GPU

Post by andreassen51 »

I've figured it out: I have the web interface open in Firefox (yeah, Linux) and had left the power slider at Medium.

The power slider didn't change anything for hours. The GPU would randomly stop until I set it to Full power.

For those who are running 6 GPUs, would setting Power to Medium run half the GPUs? I think it will run my one GPU half the time to ease up on the electricity. I was about to turn up the heat anyway; we heat with electricity here in Norway.

Regards.
PantherX
Site Moderator
Posts: 7020
Joined: Wed Dec 23, 2009 9:33 am
Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
§
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB

Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
Location: Land Of The Long White Cloud
Contact:

Re: Work unit stuck at 76% with GPU

Post by PantherX »

Based on the configuration you posted, I suggest the following:
1) Please use only 1 CPU slot, i.e. keep the CPU slot with 8 CPUs and remove the 4-CPU slot by first setting it to Finish and then, once it has completed, deleting that slot. The reason is that your system is competing with itself for folding, causing an overall slowdown in performance.
2) You can remove this setting as it doesn't do anything <gpu-usage v='80'/>

The Medium setting will cause your GPUs to fold but will use only half of your CPUs. If you want your GPU to fold only when the system is idle, you can right-click the GPU slot in FAHControl and select the option "On Idle".
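
For illustration only, here is a rough Python sketch of what the configuration would look like after both suggestions are applied, starting from the config printed in the log earlier in this thread. The function name `simplify_config` is hypothetical, and this only models the end result: the recommended route is still to set the slot to Finish and delete it through the client. Whether the client renumbers the remaining slots afterwards is not something this sketch assumes.

```python
import xml.etree.ElementTree as ET

# Config as printed in the posted log (timestamps stripped, passkey masked).
CONFIG = """\
<config>
  <gpu-usage v='80'/>
  <power v='LIGHT'/>
  <passkey v='********************************'/>
  <team v='55186'/>
  <user v='andreassen3256'/>
  <slot id='0' type='CPU'>
    <cpus v='8'/>
  </slot>
  <slot id='1' type='CPU'>
    <cpus v='4'/>
  </slot>
  <slot id='2' type='GPU'/>
</config>
"""


def simplify_config(xml_text, drop_slot_id="1"):
    """Drop the <gpu-usage> element and the slot with the given id.

    Note: ElementTree's default parser discards XML comments, so any
    comments in the input would not survive this round trip.
    """
    root = ET.fromstring(xml_text)
    for child in list(root):  # copy the list so removal is safe while iterating
        if child.tag == "gpu-usage":
            root.remove(child)
        elif child.tag == "slot" and child.get("id") == drop_slot_id:
            root.remove(child)
    return ET.tostring(root, encoding="unicode")


print(simplify_config(CONFIG))
```

The output keeps the 8-CPU slot and the GPU slot and leaves every other setting untouched.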
Last edited by PantherX on Sun Apr 12, 2020 1:47 am, edited 1 time in total.
Reason: Changed completing to competing
andreassen51

Re: Work unit stuck at 76% with GPU

Post by andreassen51 »

Yes, I've considered 2) after I've had it up and running. I was waiting for weeks on end for work units and now it seems we're well fed (I can only speak for myself, I'm very well fed).

For 1), I had a mere four cores back in the day and had problems setting up (did you mean competing?). Yes, I can delete the old settings; I've been too lazy, and it sounds like a good idea.

I have the web control page in Firefox and right clicking the GPU icon brings up the context menu to "Save as", bookmark and all that so I'll leave all to full, and make do with 8 threads.

[Edit:] There's a When button in plain sight that sets folding to when-idle.
PantherX

Re: Work unit stuck at 76% with GPU

Post by PantherX »

Yep, I meant competing. Fixed my post.

In Windows, you will have to open Advanced Control (AKA FAHControl), which is an application by itself. By default, you should see the F@H icon in the bottom right of your taskbar (you may need to click the little arrow to open the overflow tray); right-click it, select Advanced Control, and follow my instructions.
andreassen51

Re: Work unit stuck at 76% with GPU

Post by andreassen51 »

I've been using Linux for some time and it was a fiddle to set up the CPU cores. I've played with the idea of 3x 4-threads and thusly ended up with 8+4 and the cruft.

I'm quite apt with grepping the log files in place of progress bars. I'll run the folding for now and hope it isn't totally wasteful.
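
As a rough sketch of the grepping approach, assuming the "Completed N out of M steps" line format seen in the log posted earlier in this thread, something like the following Python would pull out the latest progress per work unit and slot. The function name `latest_progress` and the regular expression are my own, not part of the client.

```python
import re

# Matches lines such as
# "15:17:33:WU02:FS00:0xa7:Completed 395000 out of 500000 steps (79%)"
PROGRESS_RE = re.compile(
    r"^(\d\d:\d\d:\d\d):(WU\d+):(FS\d+):0x\w+:Completed (\d+) out of (\d+) steps"
)


def latest_progress(log_text):
    """Map (work unit, slot) to (timestamp, steps done, total steps) of its last report."""
    latest = {}
    for line in log_text.splitlines():
        m = PROGRESS_RE.match(line)
        if m:
            stamp, wu, slot, done, total = m.groups()
            latest[(wu, slot)] = (stamp, int(done), int(total))
    return latest


# Lines copied from the end of the posted log.
SAMPLE = """\
15:15:25:WU02:FS00:0xa7:Completed 390000 out of 500000 steps (78%)
15:15:27:WU00:FS01:0xa7:Completed 17500 out of 250000 steps (7%)
15:17:18:WU00:FS01:0xa7:Completed 20000 out of 250000 steps (8%)
15:17:33:WU02:FS00:0xa7:Completed 395000 out of 500000 steps (79%)
"""

for (wu, slot), (stamp, done, total) in sorted(latest_progress(SAMPLE).items()):
    print(f"{wu} {slot}: {done}/{total} steps ({done * 100 // total}%) at {stamp}")
```

A slot that is enabled but never shows up in this output has not reported any progress in the part of the log that was scanned.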