added second GPU .. it keeps erroring out...

It seems that a lot of GPU problems revolve around specific versions of drivers. Though NVidia has their own support structure, you can often learn from information reported by others who fold.

Postby scm2000 » Sat Apr 18, 2020 12:51 am

I have been running fine with one GPU in my system (GeForce GTX 1050 Ti).
I just added a GeForce GTX 1070.

I downloaded and installed the latest NVIDIA driver for both of them.

I did a fresh install of FAH. Both GPUs are configured with default GPU slots.

The first GPU (the 1070) is running fine on its first job.
The second GPU (the 1050) keeps downloading a job, starting it, and then hitting the same error:
FS02:0x22:ERROR:exception: Illegal value for DeviceIndex: 1
It then dumps the job.

Any ideas?


Log follows----
23:37:24:FS02:Paused
23:37:33:Saving configuration to config.xml
23:37:33:<config>
23:37:33:  <!-- HTTP Server -->
23:37:33:  <allow v='127.0.0.1 192.168.1.0/24'/>
23:37:33:
23:37:33:  <!-- Network -->
23:37:33:  <proxy v=':8080'/>
23:37:33:
23:37:33:  <!-- Remote Command Server -->
23:37:33:  <password v='***'/>
23:37:33:
23:37:33:  <!-- User Information -->
23:37:33:  <passkey v='********************************'/>
23:37:33:  <team v='41355'/>
23:37:33:  <user v='scm2000'/>
23:37:33:
23:37:33:  <!-- Folding Slots -->
23:37:33:  <slot id='1' type='GPU'/>
23:37:33:  <slot id='2' type='GPU'>
23:37:33:    <paused v='true'/>
23:37:33:  </slot>
23:37:33:</config>
23:37:34:WU01:FS02:Downloading 7.92MiB
23:37:38:WU01:FS02:Download complete
23:37:38:WU01:FS02:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:11751 run:0 clone:7091 gen:32 core:0x22 unit:0x000000378ca304e75e6bbbfc4457114c
23:38:11:FS02:Unpaused
23:38:12:WU01:FS02:Starting
23:38:12:WU01:FS02:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\steph\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/Core_22.fah/FahCore_22.exe -dir 01 -suffix 01 -version 705 -lifeline 10196 -checkpoint 15 -gpu-vendor nvidia -opencl-platform 1 -opencl-device 1 -cuda-device 1 -gpu 1
23:38:12:WU01:FS02:Started FahCore on PID 10064
23:38:12:WU01:FS02:Core PID:5400
23:38:12:WU01:FS02:FahCore 0x22 started
23:38:12:WU01:FS02:0x22:*********************** Log Started 2020-04-17T23:38:12Z ***********************
23:38:12:WU01:FS02:0x22:*************************** Core22 Folding@home Core ***************************
23:38:12:WU01:FS02:0x22:       Type: 0x22
23:38:12:WU01:FS02:0x22:       Core: Core22
23:38:12:WU01:FS02:0x22:    Website: https://foldingathome.org/
23:38:12:WU01:FS02:0x22:  Copyright: (c) 2009-2018 foldingathome.org
23:38:12:WU01:FS02:0x22:     Author: John Chodera <john.chodera@choderalab.org> and Rafal Wiewiora
23:38:12:WU01:FS02:0x22:             <rafal.wiewiora@choderalab.org>
23:38:12:WU01:FS02:0x22:       Args: -dir 01 -suffix 01 -version 705 -lifeline 10064 -checkpoint 15
23:38:12:WU01:FS02:0x22:             -gpu-vendor nvidia -opencl-platform 1 -opencl-device 1 -cuda-device
23:38:12:WU01:FS02:0x22:             1 -gpu 1
23:38:12:WU01:FS02:0x22:     Config: <none>
23:38:12:WU01:FS02:0x22:************************************ Build *************************************
23:38:12:WU01:FS02:0x22:    Version: 0.0.2
23:38:12:WU01:FS02:0x22:       Date: Dec 6 2019
23:38:12:WU01:FS02:0x22:       Time: 21:30:31
23:38:12:WU01:FS02:0x22: Repository: Git
23:38:12:WU01:FS02:0x22:   Revision: abeb39247cc72df5af0f63723edafadb23d5dfbe
23:38:12:WU01:FS02:0x22:     Branch: HEAD
23:38:12:WU01:FS02:0x22:   Compiler: Visual C++ 2008
23:38:12:WU01:FS02:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
23:38:12:WU01:FS02:0x22:   Platform: win32 10
23:38:12:WU01:FS02:0x22:       Bits: 64
23:38:12:WU01:FS02:0x22:       Mode: Release
23:38:12:WU01:FS02:0x22:************************************ System ************************************
23:38:12:WU01:FS02:0x22:        CPU: Intel(R) Celeron(R) CPU G3930 @ 2.90GHz
23:38:12:WU01:FS02:0x22:     CPU ID: GenuineIntel Family 6 Model 158 Stepping 9
23:38:12:WU01:FS02:0x22:       CPUs: 2
23:38:12:WU01:FS02:0x22:     Memory: 7.70GiB
23:38:12:WU01:FS02:0x22:Free Memory: 4.79GiB
23:38:12:WU01:FS02:0x22:    Threads: WINDOWS_THREADS
23:38:12:WU01:FS02:0x22: OS Version: 6.2
23:38:12:WU01:FS02:0x22:Has Battery: false
23:38:12:WU01:FS02:0x22: On Battery: false
23:38:12:WU01:FS02:0x22: UTC Offset: -4
23:38:12:WU01:FS02:0x22:        PID: 5400
23:38:12:WU01:FS02:0x22:        CWD: C:\Users\steph\AppData\Roaming\FAHClient\work
23:38:12:WU01:FS02:0x22:         OS: Windows 10 Pro
23:38:12:WU01:FS02:0x22:    OS Arch: AMD64
23:38:12:WU01:FS02:0x22:********************************************************************************
23:38:12:WU01:FS02:0x22:Project: 11751 (Run 0, Clone 7091, Gen 32)
23:38:12:WU01:FS02:0x22:Unit: 0x000000378ca304e75e6bbbfc4457114c
23:38:12:WU01:FS02:0x22:Reading tar file core.xml
23:38:12:WU01:FS02:0x22:Reading tar file integrator.xml
23:38:12:WU01:FS02:0x22:Reading tar file state.xml
23:38:14:WU01:FS02:0x22:Reading tar file system.xml
23:38:15:WU01:FS02:0x22:Digital signatures verified
23:38:15:WU01:FS02:0x22:Folding@home GPU Core22 Folding@home Core
23:38:15:WU01:FS02:0x22:Version 0.0.2
23:38:20:WU02:FS02:Upload complete
23:38:20:WU02:FS02:Server responded WORK_ACK (400)
23:38:20:WU02:FS02:Cleaning up
23:38:23:WU01:FS02:0x22:ERROR:exception: Illegal value for DeviceIndex: 1
23:38:23:WU01:FS02:0x22:Saving result file ..\logfile_01.txt
23:38:23:WU01:FS02:0x22:Saving result file science.log
23:38:23:WU01:FS02:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
23:38:23:WARNING:WU01:FS02:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
23:38:23:WU01:FS02:Sending unit results: id:01 state:SEND error:FAULTY project:11751 run:0 clone:7091 gen:32 core:0x22 unit:0x000000378ca304e75e6bbbfc4457114c
23:38:23:WU01:FS02:Uploading 2.62KiB to 140.163.4.231
23:38:23:WU01:FS02:Connecting to 140.163.4.231:8080
23:38:24:WU01:FS02:Upload complete
23:38:24:WU01:FS02:Server responded WORK_ACK (400)
23:38:24:WU02:FS02:Connecting to 65.254.110.245:8080
23:38:24:WU01:FS02:Cleaning up
23:38:24:WARNING:WU02:FS02:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
23:38:24:WU02:FS02:Connecting to 18.218.241.186:80
23:38:25:WARNING:WU02:FS02:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
23:38:25:ERROR:WU02:FS02:Exception: Could not get an assignment
23:38:25:WU02:FS02:Connecting to 65.254.110.245:8080
23:38:26:WARNING:WU02:FS02:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
23:38:26:WU02:FS02:Connecting to 18.218.241.186:80
23:38:26:WARNING:WU02:FS02:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
23:38:26:ERROR:WU02:FS02:Exception: Could not get an assignment
23:38:31:FS02:Paused
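
For anyone comparing their own log against this one, here is a minimal Python sketch that pulls the device-related flags out of the FahCore argument string and the index out of the error line. The two sample strings are copied from the log above. The function names (`device_args`, `illegal_device_index`) are made up for illustration and are not part of FAHClient.

```python
import re

# Device-related flags that appear on the FahCore command line in the log above.
DEVICE_FLAGS = ("opencl-platform", "opencl-device", "cuda-device", "gpu")


def device_args(arg_string):
    """Return the device-related flags found in a FahCore argument string."""
    found = {}
    for flag in DEVICE_FLAGS:
        # Require whitespace (or start of string) before the dash and a number
        # after the flag, so "-gpu" does not match inside "-gpu-vendor".
        match = re.search(r"(?<!\S)-" + re.escape(flag) + r"\s+(\d+)", arg_string)
        if match:
            found[flag] = int(match.group(1))
    return found


def illegal_device_index(log_line):
    """Return the index reported by an 'Illegal value for DeviceIndex' line, or None."""
    match = re.search(r"Illegal value for DeviceIndex: (\d+)", log_line)
    return int(match.group(1)) if match else None


# Argument string and error line copied from the log in this post.
ARGS = (
    "-dir 01 -suffix 01 -version 705 -lifeline 10064 -checkpoint 15 "
    "-gpu-vendor nvidia -opencl-platform 1 -opencl-device 1 -cuda-device 1 -gpu 1"
)
ERROR_LINE = "23:38:23:WU01:FS02:0x22:ERROR:exception: Illegal value for DeviceIndex: 1"

requested = device_args(ARGS)
rejected = illegal_device_index(ERROR_LINE)
print("indices passed to the core:", requested)
print("index the core rejected:", rejected)
```

In this log the rejected index is the same value the client passed for the OpenCL and CUDA device. That fits the idea that the slot is pointed at a device number the core cannot use, though the log alone does not say which flag is the wrong one.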


Mod Edit: Added Code Tags - PantherX

Postby scm2000 » Sat Apr 18, 2020 12:54 am

I have additional information. In Task Manager, the job that FAH thinks is running on the 1070 is actually running on the second GPU, the 1050.
So it looks like the default slot setup has the two cards mixed up.

So do I need to adjust the CUDA and OpenCL indexes for the slots?
The setup says you may need to do that with a mix of GPU types.
But what goes where?
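
As a rough illustration of what an explicit per-slot override could look like, the Python sketch below builds a slot element in the same style as the config.xml dump in the first post. It assumes the v7 client's `opencl-index` and `cuda-index` slot options. The helper name `gpu_slot` is hypothetical, and the index values of 0 are placeholders only: the right numbers depend on how the driver enumerates the two cards on a given machine, which this thread does not establish.

```python
import xml.etree.ElementTree as ET


def gpu_slot(slot_id, opencl_index, cuda_index):
    """Build a <slot> element with explicit device indices (hypothetical helper)."""
    slot = ET.Element("slot", {"id": str(slot_id), "type": "GPU"})
    # Option names are assumed to match the v7 client's slot options;
    # the values passed in are placeholders, not a recommendation.
    ET.SubElement(slot, "opencl-index", {"v": str(opencl_index)})
    ET.SubElement(slot, "cuda-index", {"v": str(cuda_index)})
    return slot


# Placeholder indices for slot 2; adjust to match the actual device order.
slot_xml = ET.tostring(gpu_slot(2, 0, 0), encoding="unicode")
print(slot_xml)
```

This only generates the XML fragment. It does not touch a live config.xml, and any manual change is best made with the client stopped.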

Postby PantherX » Sat Apr 18, 2020 1:04 am

I would suggest that you set both GPU slots to Finish. Once all the WUs are finished and uploaded, uninstall the client, selecting the option to delete the data. Reboot your system, then install the latest version, which is 7.6.9, and see what happens :)

Postby scm2000 » Sat Apr 18, 2020 1:15 am

Did that version just come out? I set up the first GPU with a fresh FAH install on Monday of this past week. I see now that there is a new version.

Postby PantherX » Sat Apr 18, 2020 1:26 am

Yep, it was released less than 24 hours ago :D

Postby MeeLee » Sat Apr 18, 2020 1:28 am

Is it using CUDA?
You usually don't need to install an additional driver. 1 driver should recognize all compatible GPUs.

Postby scm2000 » Sat Apr 18, 2020 1:51 am

MeeLee wrote: Is it using CUDA?
You usually don't need to install an additional driver. 1 driver should recognize all compatible GPUs.

Yes, it's just the one driver, but I had to install it for the new card. It didn't install automatically when I plugged the card in and booted up.