Odd GPU behavior - terminal window + FAHClient


bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Odd GPU behavior - terminal window + FAHClient

Post by bruce »

FAHClient is responsible for uploading results and obtaining a new assignment, plus a couple of other little tasks. I wouldn't say that it's doing nothing. Other daemons likewise reside quietly in the background until you need their services.
SJC_Steve
Posts: 42
Joined: Wed Feb 03, 2021 7:26 pm

Re: Odd GPU behavior - terminal window + FAHClient

Post by SJC_Steve »

I was getting OpenCL problem messages, so I installed the OpenCL dev files as suggested by gunnarre with this command:
sudo apt install ocl-icd-opencl-dev

And then these commands:
sudo adduser fahclient video
sudo adduser fahclient render

I'm not getting any OpenCL error messages anymore but still only folding on the CPU, not the GPU.

There's one log message that stands out:
18:40:07: <!-- Folding Slot Configuration -->
18:40:07: <gpu v='false'/>

Is this the issue or is there something else I need to do to begin folding on the GPU?
Here are the first lines of the log.

Thanks,
Steve

Code:

*********************** Log Started 2021-02-11T18:40:07Z ***********************
18:40:07:******************************* libFAH ********************************
18:40:07:           Date: Oct 20 2020
18:40:07:           Time: 20:36:39
18:40:07:       Revision: 5ca109d295a6245e2a2f590b3d0085ad5e567aeb
18:40:07:         Branch: master
18:40:07:       Compiler: GNU 8.3.0
18:40:07:        Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
18:40:07:                 -fdata-sections -O3 -funroll-loops -fno-pie
18:40:07:       Platform: linux2 5.8.0-1-amd64
18:40:07:           Bits: 64
18:40:07:           Mode: Release
18:40:07:****************************** FAHClient ******************************
18:40:07:        Version: 7.6.21
18:40:07:         Author: Joseph Coffland <joseph@cauldrondevelopment.com>
18:40:07:      Copyright: 2020 foldingathome.org
18:40:07:       Homepage: https://foldingathome.org/
18:40:07:           Date: Oct 20 2020
18:40:07:           Time: 20:39:00
18:40:07:       Revision: 6efbf0e138e22d3963e6a291f78dcb9c6422a278
18:40:07:         Branch: master
18:40:07:       Compiler: GNU 8.3.0
18:40:07:        Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
18:40:07:                 -fdata-sections -O3 -funroll-loops -fno-pie
18:40:07:       Platform: linux2 5.8.0-1-amd64
18:40:07:           Bits: 64
18:40:07:           Mode: Release
18:40:07:           Args: --child /etc/fahclient/config.xml --run-as fahclient
18:40:07:                 --pid-file=/var/run/fahclient.pid --daemon
18:40:07:         Config: /etc/fahclient/config.xml
18:40:07:******************************** CBang ********************************
18:40:07:           Date: Oct 20 2020
18:40:07:           Time: 18:37:59
18:40:07:       Revision: 7e4ce85225d7eaeb775e87c31740181ca603de60
18:40:07:         Branch: master
18:40:07:       Compiler: GNU 8.3.0
18:40:07:        Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
18:40:07:                 -fdata-sections -O3 -funroll-loops -fno-pie -fPIC
18:40:07:       Platform: linux2 5.8.0-1-amd64
18:40:07:           Bits: 64
18:40:07:           Mode: Release
18:40:07:******************************* System ********************************
18:40:07:            CPU: AMD Ryzen 7 3700X 8-Core Processor
18:40:07:         CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
18:40:07:           CPUs: 8
18:40:07:         Memory: 15.61GiB
18:40:07:    Free Memory: 14.03GiB
18:40:07:        Threads: POSIX_THREADS
18:40:07:     OS Version: 5.8
18:40:07:    Has Battery: false
18:40:07:     On Battery: false
18:40:07:     UTC Offset: -7
18:40:07:            PID: 1529
18:40:07:            CWD: /var/lib/fahclient
18:40:07:             OS: Linux 5.8.0-43-generic x86_64
18:40:07:        OS Arch: AMD64
18:40:07:           GPUs: 1
18:40:07:          GPU 0: Bus:8 Slot:0 Func:0 NVIDIA:7 GP106 [GeForce GTX 1060 3GB] 3935
18:40:07:  CUDA Device 0: Platform:0 Device:0 Bus:8 Slot:0 Compute:6.1 Driver:11.2
18:40:07:OpenCL Device 0: Platform:0 Device:0 Bus:8 Slot:0 Compute:1.2 Driver:460.32
18:40:07:***********************************************************************
18:40:07:<config>
18:40:07:  <!-- Client Control -->
18:40:07:  <fold-anon v='true'/>
18:40:07:
18:40:07:  <!-- Folding Slot Configuration -->
18:40:07:  <gpu v='false'/>
18:40:07:
18:40:07:  <!-- Slot Control -->
18:40:07:  <power v='full'/>
18:40:07:
18:40:07:  <!-- User Information -->
18:40:07:  <user v='SJC_Steve'/>
18:40:07:
18:40:07:  <!-- Folding Slots -->
18:40:07:  <slot id='0' type='CPU'/>
18:40:07:</config>
18:40:07:Trying to access database...
18:40:07:Successfully acquired database lock
18:40:07:FS00:Initialized folding slot 00: cpu:8
18:40:07:WU00:FS00:Starting
18:40:07:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/lin/64bit-avx-256/a7-0.0.19/Core_a7.fah/FahCore_a7 -dir 00 -suffix 01 -version 706 -lifeline 1529 -checkpoint 15 -np 8
18:40:07:WU00:FS00:Started FahCore on PID 1540
18:40:07:WU00:FS00:Core PID:1544
18:40:07:WU00:FS00:FahCore 0xa7 started
18:40:08:WU00:FS00:0xa7:*********************** Log Started 2021-02-11T18:40:07Z ***********************
18:40:08:WU00:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
18:40:08:WU00:FS00:0xa7:       Type: 0xa7
18:40:08:WU00:FS00:0xa7:       Core: Gromacs
18:40:08:WU00:FS00:0xa7:       Args: -dir 00 -suffix 01 -version 706 -lifeline 1540 -checkpoint 15 -np 8
18:40:08:WU00:FS00:0xa7:************************************ CBang *************************************
18:40:08:WU00:FS00:0xa7:       Date: Nov 27 2019
18:40:08:WU00:FS00:0xa7:       Time: 11:26:54
18:40:08:WU00:FS00:0xa7:   Revision: d25803215b59272441049dfa05a0a9bf7a6e3c48
18:40:08:WU00:FS00:0xa7:     Branch: master
18:40:08:WU00:FS00:0xa7:   Compiler: GNU 8.3.0
18:40:08:WU00:FS00:0xa7:    Options: -std=c++11 -ffunction-sections -fdata-sections -O3 -funroll-loops
18:40:08:WU00:FS00:0xa7:             -fno-pie -fPIC
18:40:08:WU00:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
18:40:08:WU00:FS00:0xa7:       Bits: 64
18:40:08:WU00:FS00:0xa7:       Mode: Release
18:40:08:WU00:FS00:0xa7:************************************ System ************************************
18:40:08:WU00:FS00:0xa7:        CPU: AMD Ryzen 7 3700X 8-Core Processor
18:40:08:WU00:FS00:0xa7:     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
18:40:08:WU00:FS00:0xa7:       CPUs: 8
18:40:08:WU00:FS00:0xa7:     Memory: 15.61GiB
18:40:08:WU00:FS00:0xa7:Free Memory: 14.02GiB
18:40:08:WU00:FS00:0xa7:    Threads: POSIX_THREADS
18:40:08:WU00:FS00:0xa7: OS Version: 5.8
18:40:08:WU00:FS00:0xa7:Has Battery: false
18:40:08:WU00:FS00:0xa7: On Battery: false
18:40:08:WU00:FS00:0xa7: UTC Offset: -7
18:40:08:WU00:FS00:0xa7:        PID: 1544
18:40:08:WU00:FS00:0xa7:        CWD: /var/lib/fahclient/work
18:40:08:WU00:FS00:0xa7:******************************** Build - libFAH ********************************
18:40:08:WU00:FS00:0xa7:    Version: 0.0.19
18:40:08:WU00:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
18:40:08:WU00:FS00:0xa7:  Copyright: 2019 foldingathome.org
18:40:08:WU00:FS00:0xa7:   Homepage: https://foldingathome.org/
18:40:08:WU00:FS00:0xa7:       Date: Nov 26 2019
18:40:08:WU00:FS00:0xa7:       Time: 00:41:42
18:40:08:WU00:FS00:0xa7:   Revision: d5b5c747532224f986b7cd02c968ed9a20c16d6e
18:40:08:WU00:FS00:0xa7:     Branch: master
18:40:08:WU00:FS00:0xa7:   Compiler: GNU 8.3.0
18:40:08:WU00:FS00:0xa7:    Options: -std=c++11 -ffunction-sections -fdata-sections -O3 -funroll-loops
18:40:08:WU00:FS00:0xa7:             -fno-pie
18:40:08:WU00:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
18:40:08:WU00:FS00:0xa7:       Bits: 64
18:40:08:WU00:FS00:0xa7:       Mode: Release
18:40:08:WU00:FS00:0xa7:************************************ Build *************************************
18:40:08:WU00:FS00:0xa7:       SIMD: avx_256
18:40:08:WU00:FS00:0xa7:********************************************************************************
18:40:08:WU00:FS00:0xa7:Project: 16927 (Run 22, Clone 223, Gen 97)
18:40:08:WU00:FS00:0xa7:Unit: 0x00000000000000000000000000000000
18:40:08:WU00:FS00:0xa7:Digital signatures verified
18:40:08:WU00:FS00:0xa7:Calling: mdrun -s frame97.tpr -o frame97.trr -cpi state.cpt -cpt 15 -nt 8
18:40:08:WU00:FS00:0xa7:Steps: first=48500000 total=500000
18:40:09:WU00:FS00:0xa7:Completed 439282 out of 500000 steps (87%)
18:40:23:WU00:FS00:0xa7:Completed 440000 out of 500000 steps (88%)
18:41:48:WU00:FS00:0xa7:Completed 445000 out of 500000 steps (89%)
Last edited by Joe_H on Fri Feb 12, 2021 6:42 am, edited 1 time in total.
Reason: added Code tags to log
Joe_H
Site Admin
Posts: 7868
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Odd GPU behavior - terminal window + FAHClient

Post by Joe_H »

Yes, that is the issue. It appears the client set that flag when initially installed and no usable GPU was detected. Remove that or set the value to true and you should be able to get the GPU slot set up.
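To confirm the flag without reading the whole log, the config can be parsed and checked for the option. This is only a minimal Python sketch, assuming the file has the shape shown in the log above (a `<config>` root with a `<gpu v='...'/>` child). The helper name `gpu_disabled` is made up for illustration and is not part of FAHClient.

```python
import xml.etree.ElementTree as ET


def gpu_disabled(config_xml: str) -> bool:
    """Return True if the config text explicitly sets gpu to false (hypothetical helper)."""
    root = ET.fromstring(config_xml)
    gpu = root.find("gpu")  # direct child of <config>, as in the log above
    # A missing <gpu> element means the option is not overridden in this file.
    return gpu is not None and gpu.get("v", "").strip().lower() == "false"


# Sample text modeled on the config section of the posted log.
sample = """<config>
  <fold-anon v='true'/>
  <gpu v='false'/>
  <slot id='0' type='CPU'/>
</config>"""

print(gpu_disabled(sample))
```

To check a real installation, pass in the contents of the file named on the log's Config line (/etc/fahclient/config.xml in the log above).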
SJC_Steve
Posts: 42
Joined: Wed Feb 03, 2021 7:26 pm

Re: Odd GPU behavior - terminal window + FAHClient

Post by SJC_Steve »

Joe_H wrote:Yes, that is the issue. It appears the client set that flag when initially installed and no usable GPU was detected. Remove that or set the value to true and you should be able to get the GPU slot set up.
Joe_H;

How do I do that? I've broken lots of stuff in the past by fumbling around with configuration files.

Thanks,
Steve
Joe_H
Site Admin
Posts: 7868
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Odd GPU behavior - terminal window + FAHClient

Post by Joe_H »

In FAHControl use the Configure button, select the Expert tab. Under Extra client options add the option gpu and set the value to true. Then click OK and Save on your way out. You may have to restart the FAHClient process; the easiest way can be to reboot.
SJC_Steve
Posts: 42
Joined: Wed Feb 03, 2021 7:26 pm

Re: Odd GPU behavior - terminal window + FAHClient

Post by SJC_Steve »

Joe_H wrote:In FAHControl use the Configure button, select the Expert tab. Under Extra client options add the option gpu and set the value to true. Then click OK and Save on your way out. You may have to restart the FAHClient process; the easiest way can be to reboot.
I tried that but the gpu was already there with a value of "false". I went to the field with a value of false and changed it to true, saved it and then rebooted. When I looked again, it was back to false. So I changed the value the other way and modified the config.xml file manually, rebooted and now it's folding on both the cpu and gpu.

Success, now I just need to write all this stuff into a procedure.

Thanks,
Steve
demorgan
Posts: 18
Joined: Wed Dec 16, 2020 10:29 pm

Re: Odd GPU behavior - terminal window + FAHClient

Post by demorgan »

Joe_H wrote:In FAHControl use the Configure button, select the Expert tab. Under Extra client options add the option gpu and set the value to true. Then click OK and Save on your way out. You may have to restart the FAHClient process; the easiest way can be to reboot.
To restart the FAHClient easily, you can do:

Code:

sudo service FAHClient restart
If that doesn't work, try:

Code:

sudo pkill -i fah
sudo service FAHClient restart
The first command kills every process whose name contains "fah" in any mix of upper and lower case, which covers anything related to F@H on your machine. Note that it matches any process with "fah" in its name, so although there is only a slim chance it affects something other than F@H, keep that in mind. The second command restarts the FAHClient service so everything starts fresh.
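To see what a case-insensitive "fah" pattern would catch before killing anything, the matching can be approximated in a few lines of Python. This is only a sketch: `matches_pkill_i` is a made-up helper, the name list is illustrative (three names taken from the log earlier in the thread, plus `bash`), and Python's `re` module stands in for pkill's own regular-expression matching, which behaves the same for a plain literal like this one.

```python
import re


def matches_pkill_i(pattern: str, process_name: str) -> bool:
    """Approximate `pkill -i PATTERN`: case-insensitive search within the process name."""
    return re.search(pattern, process_name, re.IGNORECASE) is not None


# Illustrative names; the first three appear in the posted log.
names = ["FAHClient", "FahCore_a7", "FAHCoreWrapper", "bash"]

print([n for n in names if matches_pkill_i("fah", n)])
```

On the machine itself, `pgrep -il fah` should list the matching processes without killing them, which is a safer first step.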
Neil-B
Posts: 2027
Joined: Sun Mar 22, 2020 5:52 pm
Hardware configuration: 1: 2x Xeon E5-2697v3@2.60GHz, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent 20H2, Quadro K420 1GB, FAH 7.6.21
2: Xeon E3-1505Mv5@2.80GHz, 32GB DDR4, NVME, Win10 Pro 20H2, Quadro M1000M 2GB, FAH 7.6.21 (actually have two of these)
3: i7-960@3.20GHz, 12GB DDR3, SSD, Win10 Pro 20H2, GTX 750Ti 2GB, GTX 1080Ti 11GB, FAH 7.6.21
Location: UK

Re: Odd GPU behavior - terminal window + FAHClient

Post by Neil-B »

SJC_Steve wrote:
Joe_H wrote:In FAHControl use the Configure button, select the Expert tab. Under Extra client options add the option gpu and set the value to true. Then click OK and Save on your way out. You may have to restart the FAHClient process; the easiest way can be to reboot.
I tried that but the gpu was already there with a value of "false". I went to the field with a value of false and changed it to true, saved it and then rebooted. When I looked again, it was back to false. So I changed the value the other way and modified the config.xml file manually, rebooted and now it's folding on both the cpu and gpu.

Success, now I just need to write all this stuff into a procedure.

Thanks,
Steve
I believe you may have to remove the gpu false entry and then add gpu true rather than try to edit the value .. but you got it sorted another way .. great to see you work through the issues .. looking forward to having the guide to refer people to :)
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Odd GPU behavior - terminal window + FAHClient

Post by bruce »

Neil-B wrote:I believe you may have to remove the gpu false entry and then add gpu true rather than try to edit the value .. but you got it sorted another way .. great to see you work through the issues .. looking forward to having the guide to refer people to :)
That works.

So does replacing one value with another value except that you MUST also use the "enter" key. Almost everywhere else you type something, it's accepted when you just click the "OK" or whatever else seems sufficient to move on.

I don't know how many times this oddity has bitten me. :evil:
Whompithian
Posts: 39
Joined: Thu Jun 25, 2020 12:40 am

Re: Odd GPU behavior - terminal window + FAHClient

Post by Whompithian »

SJC_Steve wrote:At least on Ubuntu, issuing the FAHClient command starts another client that burns power but doesn't actually get any work done.

Thanks,
Steve
The second running instance is getting work done. It just uses all of the default configuration values, including folding as Anonymous, and drops the work in an unexpected location, possibly the current working directory. You can specify the desired config file to use and a unique target work directory with command line arguments if you really want two separate clients running with different parameters. As bruce said, however, that mode of operation is neither supported nor necessary, since multiple slots can be configured for a single running client. To learn more about available options, just issue:

Code:

FAHClient --help | less
SJC_Steve
Posts: 42
Joined: Wed Feb 03, 2021 7:26 pm

Re: Odd GPU behavior - terminal window + FAHClient

Post by SJC_Steve »

bruce wrote:
Neil-B wrote:I believe you may have to remove the gpu false entry and then add gpu true rather than try to edit the value .. but you got it sorted another way .. great to see you work through the issues .. looking forward to having the guide to refer people to :)
That works.

So does replacing one value with another value except that you MUST also use the "enter" key. Almost everywhere else you type something, it's accepted when you just click the "OK" or whatever else seems sufficient to move on.

I don't know how many times this oddity has bitten me. :evil:
Today's FAHControl behavior, using Python3-FAHControl:

In 'Configure / Expert', I changed the gpu attribute from false to true, hit Enter and Save. The system reported a crash of FAHControl. I rebooted, and this time the Client is folding on both the GPU and CPU. I tried to go in and set gpu = true, but the gpu attribute was missing. I added gpu = true, saved and rebooted. No change; it will not accept any value for gpu. When a new client is loaded, it again comes up with gpu = false and no gpu folding.

I also tried to modify config.xml to change gpu from false to true, saved and rebooted, but again the value was deleted from the config file.

It seems the gpu attribute in a new Client install comes up false and must be removed before the Client will use the gpu. Since this happens without FAHControl running, it seems the Client is at fault here.

Any thoughts?

Thanks,
Steve
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Odd GPU behavior - terminal window + FAHClient

Post by bruce »

We probably confused you by giving an incomplete description of that option.

In FAHControl, you need to add the option gpu=true. Unless there is already an entry for gpu, you do that by clicking Add and entering gpu in the Name field of the popup and true or false in the Value field.

Editing an existing entry requires more care because of the need for the enter key.

In either case, it ends up looking like this in config.xml:
<gpu v="false"/>
We DO NOT recommend manually editing the config. If you happen to inadvertently create a syntax error, FAH will simply stop working and won't tell you why, making it very difficult to fix.
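For anyone who edits the file by hand anyway, a well-formedness check before restarting the client can catch this kind of syntax error. This is a minimal Python sketch using the standard library XML parser. `is_well_formed` is a made-up helper name, and the check only proves the text is valid XML, not that FAHClient will accept every option in it.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML (hypothetical pre-restart check)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


good = '<config>\n  <gpu v="false"/>\n</config>'
bad = '<config>\n  <gpu v="false">\n</config>'  # the gpu tag is never closed

print(is_well_formed(good), is_well_formed(bad))
```

Keeping a backup copy of the original file before editing makes it easy to revert if the check fails.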
SJC_Steve
Posts: 42
Joined: Wed Feb 03, 2021 7:26 pm

Re: Odd GPU behavior - terminal window + FAHClient

Post by SJC_Steve »

bruce wrote:We probably confused you by giving an incomplete description of that option.

In FAHControl, you need to add the option gpu=true. Unless there is already an entry for gpu, you do that by clicking Add and entering gpu in the Name field of the popup and true or false in the Value field.

Editing an existing entry requires more care because of the need for the enter key.

In either case, it ends up looking like this in config.xml:
<gpu v="false"/>
We DO NOT recommend manually editing the config. If you happen to inadvertently create a syntax error, FAH will simply stop working and won't tell you why, making it very difficult to fix.
Yep, that's exactly what I did; Add button, put in Name = gpu and Value = true, then Save.

I then rebooted the PC and looked at Configure/Expert again: no gpu name or value. I've done it several times with the same result.

I also edited the config.xml file and rebooted; it did the same thing and deleted the <gpu v="true"/> line completely.

Here are some lines from the log file, including the config portion. As you can see, it matches what FAHControl shows for gpu: no name or value.

Code:

23:40:05:******************************* System ********************************
23:40:05:            CPU: AMD Ryzen 7 3700X 8-Core Processor
23:40:05:         CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
23:40:05:           CPUs: 8
23:40:05:         Memory: 15.61GiB
23:40:05:    Free Memory: 14.05GiB
23:40:05:        Threads: POSIX_THREADS
23:40:05:     OS Version: 5.8
23:40:05:    Has Battery: false
23:40:05:     On Battery: false
23:40:05:     UTC Offset: -7
23:40:05:            PID: 1520
23:40:05:            CWD: /var/lib/fahclient
23:40:05:             OS: Linux 5.8.0-43-generic x86_64
23:40:05:        OS Arch: AMD64
23:40:05:           GPUs: 1
23:40:05:          GPU 0: Bus:8 Slot:0 Func:0 NVIDIA:7 GP106 [GeForce GTX 1060 3GB] 3935
23:40:05:  CUDA Device 0: Platform:0 Device:0 Bus:8 Slot:0 Compute:6.1 Driver:11.2
23:40:05:OpenCL Device 0: Platform:0 Device:0 Bus:8 Slot:0 Compute:1.2 Driver:460.32
23:40:05:***********************************************************************
23:40:05:<config>
23:40:05:  <!-- Client Control -->
23:40:05:  <fold-anon v='true'/>
23:40:05:
23:40:05:  <!-- Network -->
23:40:05:  <proxy v=':8080'/>
23:40:05:
23:40:05:  <!-- Slot Control -->
23:40:05:  <power v='full'/>
23:40:05:
23:40:05:  <!-- User Information -->
23:40:05:  <user v='SJC_Steve'/>
23:40:05:
23:40:05:  <!-- Folding Slots -->
23:40:05:  <slot id='0' type='CPU'/>
23:40:05:  <slot id='1' type='GPU'>
23:40:05:    <pci-bus v='8'/>
23:40:05:    <pci-slot v='0'/>
23:40:05:  </slot>
23:40:05:</config>
Thanks,
Steve

Mod Edit: Added Code Tags - PantherX
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Odd GPU behavior - terminal window + FAHClient

Post by bruce »

:?: :?:

Maybe the order matters. Mine is at the very top.

<?xml version="1.0"?>
<config>
<!-- Folding Slot Configuration -->
<gpu v="false"/>

<!-- HTTP Server -->
<allow v="127.0.0.1,192.168.0.0/24"/>

<!-- Network -->
...
-----------------------------------------------------------------------
Maybe it's just that FAH rewrites config and discards redundant statements. The default value for 'gpu' is 'true' so changing it to 'true' is logically the same as removing the 'false' modification from the default.
Whompithian
Posts: 39
Joined: Thu Jun 25, 2020 12:40 am

Re: Odd GPU behavior - terminal window + FAHClient

Post by Whompithian »

bruce wrote:Maybe it's just that FAH rewrites config and discards redundant statements. The default value for 'gpu' is 'true' so changing it to 'true' is logically the same as removing the 'false' modification from the default.
On Linux systems, this is the case. FAHClient regularly normalizes the config file. On first start and anytime a command is sent to the running client that changes the configurable state, such as the "--send-pause" argument, the client updates the config.xml with a normalized version which it also dumps to the log file. Part of the normalization is to remove any parameter that is explicitly set to its default value. The "--help" output from FAHClient lists the default value for most parameters.
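The normalization step described here can be pictured as dropping every option whose value equals its default. The Python below is only an illustrative sketch of that idea, not FAHClient's actual code. `normalize` and `ASSUMED_DEFAULTS` are made-up names, and the only default listed is the one bruce mentioned (gpu defaults to true).

```python
def normalize(options: dict, defaults: dict) -> dict:
    """Drop every option that is explicitly set to its default value."""
    return {
        name: value
        for name, value in options.items()
        if defaults.get(name) != value
    }


# Assumption: only the gpu default stated earlier in the thread is included here.
ASSUMED_DEFAULTS = {"gpu": "true"}

# gpu=true matches the default, so it disappears from the normalized config.
print(normalize({"gpu": "true", "power": "full"}, ASSUMED_DEFAULTS))
# gpu=false differs from the default, so it is kept.
print(normalize({"gpu": "false", "power": "full"}, ASSUMED_DEFAULTS))
```

Under this model, a missing gpu line after a restart means the same thing as gpu=true, which is consistent with the GPU slot showing up in Steve's later log.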