Problem setting up two Nvidia cards

It seems that a lot of GPU problems revolve around specific driver versions. Though NVIDIA has its own support structure, you can often learn from information reported by others who fold.

steeven9
Posts: 3
Joined: Sat Apr 04, 2020 11:23 am

Problem setting up two Nvidia cards

Post by steeven9 »

Hi all, newbie here.
I'm trying to get my two GPUs, a GTS450 and a GTX970, to work on my Ubuntu Server 18.04 (headless).
I installed the nvidia-driver-390 drivers; both get detected by nvidia-smi but only the GTS450 shows up in FAHControl. Any attempt to add another GPU slot gives the error "No available GPUs".
Moreover, the GTS450 fails every time it starts to fold.

Any pointers?
toTOW
Site Moderator
Posts: 6309
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: Problem setting up two Nvidia cards

Post by toTOW »

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
HaloJones
Posts: 920
Joined: Thu Jul 24, 2008 10:16 am

Re: Problem setting up two Nvidia cards

Post by HaloJones »

In a terminal, run nvidia-smi.

Look at the card numbers and change the GPU slot to the number of the 970. I think the 450 is too old to fold, and even if it is not, it will produce so few points as to be worthless. Failing that, simply remove the GTS450 from the machine.
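
A minimal sketch of reading the card numbers programmatically, assuming `nvidia-smi -L` prints one line per card in the form `GPU <index>: <name> (UUID: ...)`. The function names and the sample text are illustrative and not taken from this thread, and the index nvidia-smi reports is not guaranteed to match the one FAHClient expects.

```python
import re
import subprocess

# Matches lines such as "GPU 1: GeForce GTX 970 (UUID: GPU-...)".
GPU_LINE = re.compile(r"^GPU (\d+): (.+?) \(UUID: .*\)$")


def parse_gpu_list(text):
    """Return {index: name} for each GPU line in `nvidia-smi -L` style text."""
    gpus = {}
    for line in text.splitlines():
        match = GPU_LINE.match(line.strip())
        if match:
            gpus[int(match.group(1))] = match.group(2)
    return gpus


def find_gpu_index(text, name_fragment):
    """Return the index of the first GPU whose name contains name_fragment, else None."""
    for index, name in sorted(parse_gpu_list(text).items()):
        if name_fragment.lower() in name.lower():
            return index
    return None


def list_installed_gpus():
    """Run `nvidia-smi -L` on this machine and parse its output."""
    result = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=True
    )
    return parse_gpu_list(result.stdout)


# Illustrative sample only (hypothetical); the UUIDs are placeholders.
SAMPLE = """GPU 0: GeForce GTS 450 (UUID: GPU-00000000-0000-0000-0000-000000000000)
GPU 1: GeForce GTX 970 (UUID: GPU-11111111-1111-1111-1111-111111111111)"""
```

On the folding machine itself, `list_installed_gpus()` would return the real mapping; `find_gpu_index(SAMPLE, "970")` only demonstrates the lookup on the sample text.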
single 1070

steeven9
Posts: 3
Joined: Sat Apr 04, 2020 11:23 am

Re: Problem setting up two Nvidia cards

Post by steeven9 »

Thank you for the quick reply!
So apparently only the GTS450 is picked up by F@H. Setting gpu-index higher than 0 gives the error "GPU not found".

Here's the log (sorry for not posting it right away):

*********************** Log Started 2020-04-04T13:05:14Z ***********************
13:05:14:************************* Folding@home Client *************************
13:05:14:    Website: http://folding.stanford.edu/
13:05:14:  Copyright: (c) 2009-2014 Stanford University
13:05:14:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
13:05:14:       Args: --child --lifeline 20252 /etc/fahclient/config.xml --run-as
13:05:14:             fahclient --pid-file=/var/run/fahclient.pid --daemon
13:05:14:     Config: /etc/fahclient/config.xml
13:05:14:******************************** Build ********************************
13:05:14:    Version: 7.4.4
13:05:14:       Date: Mar 4 2014
13:05:14:       Time: 12:02:38
13:05:14:    SVN Rev: 4130
13:05:14:     Branch: fah/trunk/client
13:05:14:   Compiler: GNU 4.4.7
13:05:14:    Options: -std=gnu++98 -O3 -funroll-loops -mfpmath=sse -ffast-math
13:05:14:             -fno-unsafe-math-optimizations -msse2
13:05:14:   Platform: linux2 3.2.0-1-amd64
13:05:14:       Bits: 64
13:05:14:       Mode: Release
13:05:14:******************************* System ********************************
13:05:14:        CPU: Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz
13:05:14:     CPU ID: GenuineIntel Family 6 Model 42 Stepping 7
13:05:14:       CPUs: 4
13:05:14:     Memory: 15.61GiB
13:05:14:Free Memory: 289.30MiB
13:05:14:    Threads: POSIX_THREADS
13:05:14: OS Version: 4.15
13:05:14:Has Battery: false
13:05:14: On Battery: false
13:05:14: UTC Offset: 2
13:05:14:        PID: 20254
13:05:14:        CWD: /var/lib/fahclient
13:05:14:         OS: Linux 4.15.0-91-generic x86_64
13:05:14:    OS Arch: AMD64
13:05:14:       GPUs: 1
13:05:14:      GPU 0: NVIDIA:2 GF106 [GeForce GTS 450]
13:05:14:       CUDA: 5.2
13:05:14:CUDA Driver: 9010
13:05:14:***********************************************************************
13:05:14:<config>
13:05:14:  <!-- Client Control -->
13:05:14:  <fold-anon v='true'/>
13:05:14:
13:05:14:  <!-- Folding Slot Configuration -->
13:05:14:  <gpu v='false'/>
13:05:14:
13:05:14:  <!-- HTTP Server -->
13:05:14:  <allow v='192.168.1.72'/>
13:05:14:
13:05:14:  <!-- Network -->
13:05:14:  <proxy v=':8080'/>
13:05:14:
13:05:14:  <!-- Remote Command Server -->
13:05:14:  <command-address v='192.168.1.75'/>
13:05:14:  <password v='********'/>
13:05:14:
13:05:14:  <!-- User Information -->
13:05:14:  <passkey v='********************************'/>
13:05:14:  <team v='225605'/>
13:05:14:  <user v='Steeven'/>
13:05:14:
13:05:14:  <!-- Web Server -->
13:05:14:  <web-allow v='192.168.1.72'/>
13:05:14:
13:05:14:  <!-- Folding Slots -->
13:05:14:  <slot id='0' type='CPU'/>
13:05:14:  <slot id='1' type='GPU'/>
13:05:14:</config>
13:05:14:Switching to user fahclient
13:05:14:Trying to access database...
13:05:17:Successfully acquired database lock
13:05:17:Enabled folding slot 00: READY cpu:2
13:05:17:Enabled folding slot 01: READY gpu:0:GF106 [GeForce GTS 450]
13:05:17:WU01:FS00:Starting
13:05:17:WU01:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/v7/lin/64bit/avx/Core_a7.fah/FahCore_a7 -dir 01 -suffix 01 -version 704 -lifeline 20254 -checkpoint 15 -np 2
13:05:17:WU01:FS00:Started FahCore on PID 20266
13:05:17:WU01:FS00:Core PID:20270
13:05:17:WU01:FS00:FahCore 0xa7 started
13:05:17:WU00:FS01:Connecting to 65.254.110.245:80
13:05:18:WU01:FS00:0xa7:*********************** Log Started 2020-04-04T13:05:17Z ***********************
13:05:18:WU01:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
13:05:18:WU01:FS00:0xa7:       Type: 0xa7
13:05:18:WU01:FS00:0xa7:       Core: Gromacs
13:05:18:WU01:FS00:0xa7:       Args: -dir 01 -suffix 01 -version 704 -lifeline 20266 -checkpoint 15 -np
13:05:18:WU01:FS00:0xa7:             2
13:05:18:WU01:FS00:0xa7:************************************ CBang *************************************
13:05:18:WU01:FS00:0xa7:       Date: Nov 5 2019
13:05:18:WU01:FS00:0xa7:       Time: 06:06:57
13:05:18:WU01:FS00:0xa7:   Revision: 46c96f1aa8419571d83f3e63f9c99a0d602f6da9
13:05:18:WU01:FS00:0xa7:     Branch: master
13:05:18:WU01:FS00:0xa7:   Compiler: GNU 8.3.0
13:05:18:WU01:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie -fPIC
13:05:18:WU01:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
13:05:18:WU01:FS00:0xa7:       Bits: 64
13:05:18:WU01:FS00:0xa7:       Mode: Release
13:05:18:WU01:FS00:0xa7:************************************ System ************************************
13:05:18:WU01:FS00:0xa7:        CPU: Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz
13:05:18:WU01:FS00:0xa7:     CPU ID: GenuineIntel Family 6 Model 42 Stepping 7
13:05:18:WU01:FS00:0xa7:       CPUs: 4
13:05:18:WU01:FS00:0xa7:     Memory: 15.61GiB
13:05:18:WU01:FS00:0xa7:Free Memory: 519.90MiB
13:05:18:WU01:FS00:0xa7:    Threads: POSIX_THREADS
13:05:18:WU01:FS00:0xa7: OS Version: 4.15
13:05:18:WU01:FS00:0xa7:Has Battery: false
13:05:18:WU01:FS00:0xa7: On Battery: false
13:05:18:WU01:FS00:0xa7: UTC Offset: 2
13:05:18:WU01:FS00:0xa7:        PID: 20270
13:05:18:WU01:FS00:0xa7:        CWD: /var/lib/fahclient/work
13:05:18:WU01:FS00:0xa7:******************************** Build - libFAH ********************************
13:05:18:WU01:FS00:0xa7:    Version: 0.0.18
13:05:18:WU01:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
13:05:18:WU01:FS00:0xa7:  Copyright: 2019 foldingathome.org
13:05:18:WU01:FS00:0xa7:   Homepage: https://foldingathome.org/
13:05:18:WU01:FS00:0xa7:       Date: Nov 5 2019
13:05:18:WU01:FS00:0xa7:       Time: 06:13:26
13:05:18:WU01:FS00:0xa7:   Revision: 490c9aa2957b725af319379424d5c5cb36efb656
13:05:18:WU01:FS00:0xa7:     Branch: master
13:05:18:WU01:FS00:0xa7:   Compiler: GNU 8.3.0
13:05:18:WU01:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie
13:05:18:WU01:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
13:05:18:WU01:FS00:0xa7:       Bits: 64
13:05:18:WU01:FS00:0xa7:       Mode: Release
13:05:18:WU01:FS00:0xa7:************************************ Build *************************************
13:05:18:WU01:FS00:0xa7:       SIMD: avx_256
13:05:18:WU01:FS00:0xa7:********************************************************************************
13:05:18:WU01:FS00:0xa7:Project: 13850 (Run 0, Clone 19381, Gen 14)
13:05:18:WU01:FS00:0xa7:Unit: 0x00000018287234c95e7302b1ccb5608e
13:05:18:WU01:FS00:0xa7:Digital signatures verified
13:05:18:WU01:FS00:0xa7:Calling: mdrun -s frame14.tpr -o frame14.trr -x frame14.xtc -e frame14.edr -cpi state.cpt -cpt 15 -nt 2
13:05:18:WU01:FS00:0xa7:Steps: first=7000000 total=500000
13:05:18:WU01:FS00:0xa7:Completed 359332 out of 500000 steps (71%)
13:05:20:WARNING:WU00:FS01:Failed to get assignment from '65.254.110.245:80': Empty work server assignment
13:05:20:WU00:FS01:Connecting to 18.218.241.186:80
13:05:21:WARNING:WU00:FS01:Failed to get assignment from '18.218.241.186:80': Empty work server assignment
13:05:21:ERROR:WU00:FS01:Exception: Could not get an assignment
13:05:21:WU00:FS01:Connecting to 65.254.110.245:80
13:05:22:WARNING:WU00:FS01:Failed to get assignment from '65.254.110.245:80': Empty work server assignment
13:05:22:WU00:FS01:Connecting to 18.218.241.186:80
13:05:23:WARNING:WU00:FS01:Failed to get assignment from '18.218.241.186:80': Empty work server assignment
13:05:23:ERROR:WU00:FS01:Exception: Could not get an assignment
13:05:46:WU01:FS00:0xa7:Completed 360000 out of 500000 steps (72%)
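
The part of this log that matters for the missing card is the System section, where the client lists the GPUs it detected. A rough sketch of pulling that out of a log file, assuming the line layout shown above (`GPUs: <count>` followed by `GPU <n>: <description>`); the function name `detected_gpus` is made up for illustration.

```python
import re

# "13:05:14:       GPUs: 1" -> number of GPUs the client detected
GPU_COUNT = re.compile(r"^\d{2}:\d{2}:\d{2}:\s*GPUs:\s*(\d+)\s*$")
# "13:05:14:      GPU 0: NVIDIA:2 GF106 [GeForce GTS 450]" -> index and description
GPU_ENTRY = re.compile(r"^\d{2}:\d{2}:\d{2}:\s*GPU (\d+):\s*(.+?)\s*$")


def detected_gpus(log_text):
    """Return (reported_count, {index: description}) from FAHClient log text."""
    count = None
    entries = {}
    for line in log_text.splitlines():
        match = GPU_COUNT.match(line)
        if match:
            count = int(match.group(1))
            continue
        match = GPU_ENTRY.match(line)
        if match:
            entries[int(match.group(1))] = match.group(2)
    return count, entries


# Two lines copied from the log above.
LOG_EXCERPT = """13:05:14:       GPUs: 1
13:05:14:      GPU 0: NVIDIA:2 GF106 [GeForce GTS 450]"""

count, entries = detected_gpus(LOG_EXCERPT)
print(count, entries)
```

If the reported count is lower than the number of cards nvidia-smi shows, the client itself is not seeing the second card, which matches the "GPU not found" error.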
Rel25917
Posts: 303
Joined: Wed Aug 15, 2012 2:31 am

Re: Problem setting up two Nvidia cards

Post by Rel25917 »

It also looks like you are missing OpenCL, which is required to run on GPUs.
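
One quick way to check for this, sketched under the assumption that OpenCL drivers on Linux register themselves as `.icd` files in `/etc/OpenCL/vendors` (the usual location for the ICD loader). The function name `list_opencl_icds` is hypothetical, and finding a file there does not prove the driver actually works.

```python
from pathlib import Path


def list_opencl_icds(vendors_dir="/etc/OpenCL/vendors"):
    """Return the names of .icd files registered in vendors_dir, sorted."""
    directory = Path(vendors_dir)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.glob("*.icd"))


if __name__ == "__main__":
    icds = list_opencl_icds()
    if icds:
        print("OpenCL ICD files found:", ", ".join(icds))
    else:
        print("No OpenCL ICD files found; OpenCL may not be installed.")
```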
steeven9
Posts: 3
Joined: Sat Apr 04, 2020 11:23 am

Re: Problem setting up two Nvidia cards

Post by steeven9 »

After fiddling a bit and multiple restarts, it works! For posterity, I had to:

- uninstall everything NVIDIA-related I already had
- disable the nouveau driver (https://docs.nvidia.com/cuda/cuda-insta ... le-nouveau)
- install the NVIDIA CUDA toolkit (https://developer.nvidia.com/cuda-downloads)
- install the OpenCL ICD loader and headers (sudo apt install ocl-icd-* opencl-headers)
- remove my existing GPU slot in FAHControl, restart it and re-add a slot with index 1 (for the GTX970)
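
For anyone editing config.xml by hand instead of using FAHControl, the last step might correspond to a slot entry like the one built below. This is a sketch under an assumption: that the v7 client accepts a `gpu-index` option as a child element of the slot, written in the same `v='...'` style as the config in the log. Only the option name and the bare slot syntax appear in this thread; the helper `gpu_slot` is made up for illustration.

```python
import xml.etree.ElementTree as ET


def gpu_slot(slot_id, gpu_index):
    """Build a <slot> element for a GPU slot pinned to one gpu-index (assumed layout)."""
    slot = ET.Element("slot", {"id": str(slot_id), "type": "GPU"})
    # Assumption: per-slot options are child elements carrying a 'v' attribute.
    ET.SubElement(slot, "gpu-index", {"v": str(gpu_index)})
    return slot


# Slot 1 bound to GPU index 1 (the GTX970 in this thread).
print(ET.tostring(gpu_slot(1, 1), encoding="unicode"))
```

Stop the client before editing the file, since it may rewrite config.xml while running.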

Now let's see if it actually folds :D