Work unit stuck at 76% with GPU

It seems that a lot of GPU problems revolve around specific versions of drivers. Though NVidia has their own support structure, you can often learn from information reported by others who fold.

Postby andreassen51 » Sat Apr 11, 2020 4:33 pm

Hello, my GPU unit stopped working at 79% while I turned away for a minute.

The card is barely more than a year old (RTX 2070). I've rebooted three times and tried X-Plane for half an hour. The hard drive isn't full; by the way, I have four of them, plus an SSD and an NVMe drive, so I'm not too bored.

What's boring is getting the GPU to start from 79%. Perhaps I'd better erase all the files and get it working from scratch again?

I've tried to attach my Log.txt but I can't find a button (like the paper clip).

Regards.
andreassen51
 
Posts: 13
Joined: Wed Mar 11, 2020 8:06 pm
Location: Norway

Re: Work unit stuck at 76% with GPU

Postby Neil-B » Sat Apr 11, 2020 4:35 pm

1: 2x Xeon E5-2697v3@2.60GHz, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent, Quadro K420 1GB, FAH 7.6.13
2: Xeon E3-1505Mv5@2.80GHz, 32GB DDR4, NVME, Win10 Pro, Quadro M1000M 2GB, FAH 7.6.13
3: i7-960@3.20GHz, 12GB DDR3, SSD, Win10 Pro, GTX 750Ti 2GB, FAH 7.6.13
Neil-B
 
Posts: 1204
Joined: Sun Mar 22, 2020 6:52 pm
Location: UK

Re: Work unit stuck at 76% with GPU

Postby andreassen51 » Sat Apr 11, 2020 4:49 pm

*********************** Log Started 2020-04-11T15:09:26Z ***********************
15:09:26:************************* Folding@home Client *************************
15:09:26:        Website: https://foldingathome.org/
15:09:26:      Copyright: (c) 2009-2018 foldingathome.org
15:09:26:         Author: Joseph Coffland <joseph@cauldrondevelopment.com>
15:09:26:           Args:
15:09:26:         Config: /home/rogera2/config.xml
15:09:26:******************************** Build ********************************
15:09:26:        Version: 7.5.1
15:09:26:           Date: May 12 2018
15:09:26:           Time: 22:51:07
15:09:26:     Repository: Git
15:09:26:       Revision: 4705bf53c635f88b8fe85af7675557e15d491ff0
15:09:26:         Branch: master
15:09:26:       Compiler: GNU 4.4.7 20120313 (Red Hat 4.4.7-18)
15:09:26:        Options: -std=gnu++98 -O3 -funroll-loops
15:09:26:       Platform: linux2 4.14.0-3-amd64
15:09:26:           Bits: 64
15:09:26:           Mode: Release
15:09:26:******************************* System ********************************
15:09:26:            CPU: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
15:09:26:         CPU ID: GenuineIntel Family 6 Model 158 Stepping 10
15:09:26:           CPUs: 12
15:09:26:         Memory: 62.84GiB
15:09:26:    Free Memory: 7.08GiB
15:09:26:        Threads: POSIX_THREADS
15:09:26:     OS Version: 4.19
15:09:26:    Has Battery: false
15:09:26:     On Battery: false
15:09:26:     UTC Offset: 2
15:09:26:            PID: 13458
15:09:26:            CWD: /home/rogera2
15:09:26:             OS: Linux 4.19.97-gentoo x86_64
15:09:26:        OS Arch: AMD64
15:09:26:           GPUs: 1
15:09:26:          GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:7 TU106 [GeForce RTX 2070 Rev. A] M
15:09:26:                 7465
15:09:26:  CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:7.5 Driver:10.2
15:09:26:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:440.64
15:09:26:***********************************************************************
15:09:26:<config>
15:09:26:  <!-- Folding Core -->
15:09:26:  <gpu-usage v='80'/>
15:09:26:
15:09:26:  <!-- Slot Control -->
15:09:26:  <power v='LIGHT'/>
15:09:26:
15:09:26:  <!-- User Information -->
15:09:26:  <passkey v='********************************'/>
15:09:26:  <team v='55186'/>
15:09:26:  <user v='andreassen3256'/>
15:09:26:
15:09:26:  <!-- Folding Slots -->
15:09:26:  <slot id='0' type='CPU'>
15:09:26:    <cpus v='8'/>
15:09:26:    <paused v='true'/>
15:09:26:  </slot>
15:09:26:  <slot id='1' type='CPU'>
15:09:26:    <cpus v='4'/>
15:09:26:    <paused v='true'/>
15:09:26:  </slot>
15:09:26:  <slot id='2' type='GPU'>
15:09:26:    <paused v='true'/>
15:09:26:  </slot>
15:09:26:</config>
15:09:26:Trying to access database...
15:09:26:Successfully acquired database lock
15:09:26:Enabled folding slot 00: PAUSED cpu:8 (by user)
15:09:26:Enabled folding slot 01: PAUSED cpu:4 (by user)
15:09:26:Enabled folding slot 02: PAUSED gpu:0:TU106 [GeForce RTX 2070 Rev. A] M 7465 (by user)
15:09:38:7:127.0.0.1:New Web connection
15:09:47:FS00:Unpaused
15:09:47:FS01:Unpaused
15:09:47:FS02:Unpaused
15:09:47:WU02:FS00:Starting
15:09:47:WU02:FS00:Running FahCore: /opt/foldingathome/FAHCoreWrapper /home/rogera2/cores/cores.foldingathome.org/v7/lin/64bit/avx/Core_a7.fah/FahCore_a7 -dir 02 -suffix 01 -version 705 -lifeline 13458 -checkpoint 15 -np 8
15:09:47:WU02:FS00:Started FahCore on PID 13489
15:09:47:WU02:FS00:Core PID:13493
15:09:47:WU02:FS00:FahCore 0xa7 started
15:09:48:WU00:FS01:Starting
15:09:48:WU00:FS01:Running FahCore: /opt/foldingathome/FAHCoreWrapper /home/rogera2/cores/cores.foldingathome.org/v7/lin/64bit/avx/Core_a7.fah/FahCore_a7 -dir 00 -suffix 01 -version 705 -lifeline 13458 -checkpoint 15 -np 4
15:09:48:WU00:FS01:Started FahCore on PID 13504
15:09:48:WU00:FS01:Core PID:13508
15:09:48:WU00:FS01:FahCore 0xa7 started
15:09:48:WU02:FS00:0xa7:*********************** Log Started 2020-04-11T15:09:47Z ***********************
15:09:48:WU02:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
15:09:48:WU02:FS00:0xa7:       Type: 0xa7
15:09:48:WU02:FS00:0xa7:       Core: Gromacs
15:09:48:WU02:FS00:0xa7:       Args: -dir 02 -suffix 01 -version 705 -lifeline 13489 -checkpoint 15 -np
15:09:48:WU02:FS00:0xa7:             8
15:09:48:WU02:FS00:0xa7:************************************ CBang *************************************
15:09:48:WU02:FS00:0xa7:       Date: Nov 5 2019
15:09:48:WU02:FS00:0xa7:       Time: 06:06:57
15:09:48:WU02:FS00:0xa7:   Revision: 46c96f1aa8419571d83f3e63f9c99a0d602f6da9
15:09:48:WU02:FS00:0xa7:     Branch: master
15:09:48:WU02:FS00:0xa7:   Compiler: GNU 8.3.0
15:09:48:WU02:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie -fPIC
15:09:48:WU02:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
15:09:48:WU02:FS00:0xa7:       Bits: 64
15:09:48:WU02:FS00:0xa7:       Mode: Release
15:09:48:WU02:FS00:0xa7:************************************ System ************************************
15:09:48:WU02:FS00:0xa7:        CPU: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
15:09:48:WU02:FS00:0xa7:     CPU ID: GenuineIntel Family 6 Model 158 Stepping 10
15:09:48:WU02:FS00:0xa7:       CPUs: 12
15:09:48:WU02:FS00:0xa7:     Memory: 62.84GiB
15:09:48:WU02:FS00:0xa7:Free Memory: 6.74GiB
15:09:48:WU02:FS00:0xa7:    Threads: POSIX_THREADS
15:09:48:WU02:FS00:0xa7: OS Version: 4.19
15:09:48:WU02:FS00:0xa7:Has Battery: false
15:09:48:WU02:FS00:0xa7: On Battery: false
15:09:48:WU02:FS00:0xa7: UTC Offset: 2
15:09:48:WU02:FS00:0xa7:        PID: 13493
15:09:48:WU02:FS00:0xa7:        CWD: /mnt/c2/fgscenery/work
15:09:48:WU02:FS00:0xa7:******************************** Build - libFAH ********************************
15:09:48:WU02:FS00:0xa7:    Version: 0.0.18
15:09:48:WU02:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
15:09:48:WU02:FS00:0xa7:  Copyright: 2019 foldingathome.org
15:09:48:WU02:FS00:0xa7:   Homepage: https://foldingathome.org/
15:09:48:WU02:FS00:0xa7:       Date: Nov 5 2019
15:09:48:WU02:FS00:0xa7:       Time: 06:13:26
15:09:48:WU02:FS00:0xa7:   Revision: 490c9aa2957b725af319379424d5c5cb36efb656
15:09:48:WU02:FS00:0xa7:     Branch: master
15:09:48:WU02:FS00:0xa7:   Compiler: GNU 8.3.0
15:09:48:WU02:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie
15:09:48:WU02:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
15:09:48:WU02:FS00:0xa7:       Bits: 64
15:09:48:WU02:FS00:0xa7:       Mode: Release
15:09:48:WU02:FS00:0xa7:************************************ Build *************************************
15:09:48:WU02:FS00:0xa7:       SIMD: avx_256
15:09:48:WU02:FS00:0xa7:********************************************************************************
15:09:48:WU02:FS00:0xa7:Project: 14592 (Run 248, Clone 2, Gen 17)
15:09:48:WU02:FS00:0xa7:Unit: 0x000000120d5262775e7cee4c2b2018e4
15:09:48:WU02:FS00:0xa7:Digital signatures verified
15:09:48:WU02:FS00:0xa7:Calling: mdrun -s frame17.tpr -o frame17.trr -x frame17.xtc -cpi state.cpt -cpt 15 -nt 8
15:09:48:WU02:FS00:0xa7:Steps: first=8500000 total=500000
15:09:48:WU00:FS01:0xa7:*********************** Log Started 2020-04-11T15:09:48Z ***********************
15:09:48:WU00:FS01:0xa7:************************** Gromacs Folding@home Core ***************************
15:09:48:WU00:FS01:0xa7:       Type: 0xa7
15:09:48:WU00:FS01:0xa7:       Core: Gromacs
15:09:48:WU00:FS01:0xa7:       Args: -dir 00 -suffix 01 -version 705 -lifeline 13504 -checkpoint 15 -np
15:09:48:WU00:FS01:0xa7:             4
15:09:48:WU00:FS01:0xa7:************************************ CBang *************************************
15:09:48:WU00:FS01:0xa7:       Date: Nov 5 2019
15:09:48:WU00:FS01:0xa7:       Time: 06:06:57
15:09:48:WU00:FS01:0xa7:   Revision: 46c96f1aa8419571d83f3e63f9c99a0d602f6da9
15:09:48:WU00:FS01:0xa7:     Branch: master
15:09:48:WU00:FS01:0xa7:   Compiler: GNU 8.3.0
15:09:48:WU00:FS01:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie -fPIC
15:09:48:WU00:FS01:0xa7:   Platform: linux2 4.19.0-5-amd64
15:09:48:WU00:FS01:0xa7:       Bits: 64
15:09:48:WU00:FS01:0xa7:       Mode: Release
15:09:48:WU00:FS01:0xa7:************************************ System ************************************
15:09:48:WU00:FS01:0xa7:        CPU: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
15:09:48:WU00:FS01:0xa7:     CPU ID: GenuineIntel Family 6 Model 158 Stepping 10
15:09:48:WU00:FS01:0xa7:       CPUs: 12
15:09:48:WU00:FS01:0xa7:     Memory: 62.84GiB
15:09:48:WU00:FS01:0xa7:Free Memory: 6.73GiB
15:09:48:WU00:FS01:0xa7:    Threads: POSIX_THREADS
15:09:48:WU00:FS01:0xa7: OS Version: 4.19
15:09:48:WU00:FS01:0xa7:Has Battery: false
15:09:48:WU00:FS01:0xa7: On Battery: false
15:09:48:WU00:FS01:0xa7: UTC Offset: 2
15:09:48:WU00:FS01:0xa7:        PID: 13508
15:09:48:WU00:FS01:0xa7:        CWD: /mnt/c2/fgscenery/work
15:09:48:WU00:FS01:0xa7:******************************** Build - libFAH ********************************
15:09:48:WU00:FS01:0xa7:    Version: 0.0.18
15:09:48:WU00:FS01:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
15:09:48:WU00:FS01:0xa7:  Copyright: 2019 foldingathome.org
15:09:48:WU00:FS01:0xa7:   Homepage: https://foldingathome.org/
15:09:48:WU00:FS01:0xa7:       Date: Nov 5 2019
15:09:48:WU00:FS01:0xa7:       Time: 06:13:26
15:09:48:WU00:FS01:0xa7:   Revision: 490c9aa2957b725af319379424d5c5cb36efb656
15:09:48:WU00:FS01:0xa7:     Branch: master
15:09:48:WU00:FS01:0xa7:   Compiler: GNU 8.3.0
15:09:48:WU00:FS01:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie
15:09:48:WU00:FS01:0xa7:   Platform: linux2 4.19.0-5-amd64
15:09:48:WU00:FS01:0xa7:       Bits: 64
15:09:48:WU00:FS01:0xa7:       Mode: Release
15:09:48:WU00:FS01:0xa7:************************************ Build *************************************
15:09:48:WU00:FS01:0xa7:       SIMD: avx_256
15:09:48:WU00:FS01:0xa7:********************************************************************************
15:09:48:WU00:FS01:0xa7:Project: 16411 (Run 2958, Clone 3, Gen 19)
15:09:48:WU00:FS01:0xa7:Unit: 0x00000016a8f5c67d5e877201d4d6c5d3
15:09:48:WU00:FS01:0xa7:Digital signatures verified
15:09:48:WU00:FS01:0xa7:Calling: mdrun -s frame19.tpr -o frame19.trr -x frame19.xtc -cpi state.cpt -cpt 15 -nt 4
15:09:48:WU00:FS01:0xa7:Steps: first=4750000 total=250000
15:09:48:WU02:FS00:0xa7:Completed 376742 out of 500000 steps (75%)
15:09:49:WU00:FS01:0xa7:Completed 9612 out of 250000 steps (3%)
15:10:03:WU00:FS01:0xa7:Completed 10000 out of 250000 steps (4%)
15:10:27:Removing old file 'configs/config-20200409-174034.xml'
15:10:27:Saving configuration to config.xml
15:10:27:<config>
15:10:27:  <!-- Folding Core -->
15:10:27:  <gpu-usage v='80'/>
15:10:27:
15:10:27:  <!-- Slot Control -->
15:10:27:  <power v='LIGHT'/>
15:10:27:
15:10:27:  <!-- User Information -->
15:10:27:  <passkey v='********************************'/>
15:10:27:  <team v='55186'/>
15:10:27:  <user v='andreassen3256'/>
15:10:27:
15:10:27:  <!-- Folding Slots -->
15:10:27:  <slot id='0' type='CPU'>
15:10:27:    <cpus v='8'/>
15:10:27:  </slot>
15:10:27:  <slot id='1' type='CPU'>
15:10:27:    <cpus v='4'/>
15:10:27:  </slot>
15:10:27:  <slot id='2' type='GPU'/>
15:10:27:</config>
15:11:10:WU02:FS00:0xa7:Completed 380000 out of 500000 steps (76%)
15:11:49:WU00:FS01:0xa7:Completed 12500 out of 250000 steps (5%)
15:13:19:WU02:FS00:0xa7:Completed 385000 out of 500000 steps (77%)
15:13:39:WU00:FS01:0xa7:Completed 15000 out of 250000 steps (6%)
15:15:25:WU02:FS00:0xa7:Completed 390000 out of 500000 steps (78%)
15:15:27:WU00:FS01:0xa7:Completed 17500 out of 250000 steps (7%)
15:17:18:WU00:FS01:0xa7:Completed 20000 out of 250000 steps (8%)
15:17:33:WU02:FS00:0xa7:Completed 395000 out of 500000 steps (79%)


The rest of the log was gone, but the GPU work unit is supposed to be WU01; the two CPU slots (FS00 and FS01) are running WU02 and WU00 here.
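To check which work units actually show up in an excerpt like the one above, a short script can collect the WU/slot pairs. This is only a minimal sketch, assuming the `WUnn:FSnn:` and `FSnn:Unpaused` line formats seen in this log; the function names `wu_slot_pairs` and `idle_slots` are made up for illustration and are not part of the Folding@home client.

```python
import re

# Matches the "WU02:FS00:" prefix used in the log lines above.
WU_SLOT = re.compile(r"\bWU(\d+):FS(\d+):")
# Matches slot-level lines such as "15:09:47:FS02:Unpaused".
UNPAUSED = re.compile(r":FS(\d+):Unpaused")


def wu_slot_pairs(lines):
    """Map each slot id to the set of work unit ids logged for it."""
    pairs = {}
    for line in lines:
        m = WU_SLOT.search(line)
        if m:
            wu, slot = m.group(1), m.group(2)
            pairs.setdefault(slot, set()).add(wu)
    return pairs


def idle_slots(lines):
    """Slots that were unpaused but never logged any work unit line."""
    unpaused = set()
    for line in lines:
        m = UNPAUSED.search(line)
        if m:
            unpaused.add(m.group(1))
    return sorted(unpaused - set(wu_slot_pairs(lines)))


# A few lines copied from the log above.
sample = [
    "15:09:47:FS00:Unpaused",
    "15:09:47:FS01:Unpaused",
    "15:09:47:FS02:Unpaused",
    "15:09:47:WU02:FS00:Starting",
    "15:09:48:WU00:FS01:Starting",
]
print(wu_slot_pairs(sample))
print(idle_slots(sample))
```

On these sample lines the GPU slot (FS02) is reported as unpaused with no work unit activity, which matches the observation that WU01 is missing from the posted log.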
andreassen51
 
Posts: 13
Joined: Wed Mar 11, 2020 8:06 pm
Location: Norway

Re: Work unit stuck at 76% with GPU

Postby andreassen51 » Sat Apr 11, 2020 5:39 pm

I've figured it out: I have the web interface open in Firefox (yeah, Linux) and had left the power slider on Medium.

The power slider didn't change anything for hours. The GPU would randomly stop until I set it to Full power.

For those running 6 GPUs, would setting Power to Medium run half the GPUs? I think it runs my one GPU half the time to ease up on the electricity. I was about to turn up the heat anyway; we heat with electricity here in Norway.

Regards.
andreassen51
 
Posts: 13
Joined: Wed Mar 11, 2020 8:06 pm
Location: Norway

Re: Work unit stuck at 76% with GPU

Postby PantherX » Sun Apr 12, 2020 12:37 am

Based on the configuration you posted, I suggest the following:
1) Please use only one CPU slot: keep the slot with 8 CPUs and remove the 4-CPU slot by first setting it to Finish and then, once its work unit has completed, deleting that slot. The reason is that your system is competing with itself for folding, causing an overall slowdown in performance.
2) You can remove the setting <gpu-usage v='80'/>, as it doesn't do anything.

The Medium setting will still let your GPUs fold but will use only half of your CPUs. If you want your GPU to fold only when idle, you can right-click the GPU slot in FAHControl and select the option "On Idle".
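As an illustration of what the configuration would look like after both suggestions, here is a minimal Python sketch that drops the gpu-usage element and the 4-CPU slot from a copy of the config printed in the log. The `trim_config` helper is hypothetical, and the recommended route is still to finish and delete the slot through the client itself. Since the log shows the client saving config.xml on its own, hand edits made while it is running may well be overwritten; that is an assumption, not something confirmed in this thread.

```python
import xml.etree.ElementTree as ET

# Config as printed in the log above (user information omitted).
CONFIG = """<config>
  <gpu-usage v='80'/>
  <power v='LIGHT'/>
  <slot id='0' type='CPU'>
    <cpus v='8'/>
  </slot>
  <slot id='1' type='CPU'>
    <cpus v='4'/>
  </slot>
  <slot id='2' type='GPU'/>
</config>"""


def trim_config(xml_text):
    """Return the config tree without gpu-usage and without slot id 1."""
    root = ET.fromstring(xml_text)
    for child in list(root):  # copy the list so removal is safe while looping
        if child.tag == "gpu-usage":
            root.remove(child)
        elif child.tag == "slot" and child.get("id") == "1":
            root.remove(child)
    return root


root = trim_config(CONFIG)
print(ET.tostring(root, encoding="unicode"))
```

The sketch works on a string rather than on the real file, so it can be tried without touching a live installation.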
Last edited by PantherX on Sun Apr 12, 2020 2:47 am, edited 1 time in total.
Reason: Changed completing to competing
PantherX
Site Moderator
 
Posts: 6334
Joined: Wed Dec 23, 2009 10:33 am
Location: Land Of The Long White Cloud

Re: Work unit stuck at 76% with GPU

Postby andreassen51 » Sun Apr 12, 2020 2:19 am

Yes, I considered 2) after I had it up and running. I was waiting for weeks on end for work units and now it seems we're well fed (I can only speak for myself, I'm very well fed).

As for 1), I had a mere four cores back in the day and had problems setting up (did you mean "competing"?). Yes, I can delete the old settings; I've been too lazy, and it sounds like a good idea.

I have the web control page in Firefox, and right-clicking the GPU icon only brings up the browser's context menu ("Save as", bookmark and all that), so I'll leave everything on Full and make do with 8 threads.

[Edit:] There's a When button in plain sight that sets folding to when-idle.
andreassen51
 
Posts: 13
Joined: Wed Mar 11, 2020 8:06 pm
Location: Norway

Re: Work unit stuck at 76% with GPU

Postby PantherX » Sun Apr 12, 2020 2:49 am

Yep, I meant competing. Fixed my post.

On Windows, you will have to open Advanced Control (AKA FAHControl), which is an application by itself. By default, you should see the F@H icon in the bottom right of your taskbar (you may need to click on the little arrow to open the overflow tray). Right-click it, select Advanced Control, and follow my instructions.
PantherX
Site Moderator
 
Posts: 6334
Joined: Wed Dec 23, 2009 10:33 am
Location: Land Of The Long White Cloud

Re: Work unit stuck at 76% with GPU

Postby andreassen51 » Sun Apr 12, 2020 3:23 am

I've been using Linux for some time and it was a fiddle to set up the CPU cores. I played with the idea of 3x 4 threads and thus ended up with 8+4 and the cruft.

I'm quite adept at grepping the log files in place of progress bars. I'll run the folding for now and hope it isn't totally wasteful.
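For anyone who prefers a script to grep, the same idea can be sketched in Python. This is a rough example, assuming the "Completed N out of M steps (P%)" line format shown in the log earlier in this thread; `latest_progress` is a made-up helper name, not a client feature.

```python
import re

# Example line:
# 15:17:33:WU02:FS00:0xa7:Completed 395000 out of 500000 steps (79%)
PROGRESS = re.compile(
    r"^(\d\d:\d\d:\d\d):(WU\d+):(FS\d+):\S+:"
    r"Completed (\d+) out of (\d+) steps \((\d+)%\)"
)


def latest_progress(lines):
    """Return the last reported progress for each (work unit, slot) pair."""
    latest = {}
    for line in lines:
        m = PROGRESS.match(line)
        if m:
            time, wu, slot, done, total, pct = m.groups()
            latest[(wu, slot)] = (time, int(done), int(total), int(pct))
    return latest


# The last four progress lines from the posted log.
sample = [
    "15:15:25:WU02:FS00:0xa7:Completed 390000 out of 500000 steps (78%)",
    "15:15:27:WU00:FS01:0xa7:Completed 17500 out of 250000 steps (7%)",
    "15:17:18:WU00:FS01:0xa7:Completed 20000 out of 250000 steps (8%)",
    "15:17:33:WU02:FS00:0xa7:Completed 395000 out of 500000 steps (79%)",
]
for (wu, slot), (time, done, total, pct) in sorted(latest_progress(sample).items()):
    print(f"{wu} {slot}: {pct}% ({done}/{total}) at {time}")
```

A slot that never appears in the result has not reported any progress in the lines that were read, which is a quick way to spot a stalled or idle slot.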
andreassen51
 
Posts: 13
Joined: Wed Mar 11, 2020 8:06 pm
Location: Norway

