Failed to start core: OpenCL device matching slot 3 not foun

Posted: Mon Dec 10, 2018 9:50 pm
by foldy
When I install FAH 7.5.1 on Ubuntu 18 and run it with NVIDIA GPUs, I get the error message
Failed to start core: OpenCL device matching slot x not found, try setting 'opencl-index' manually
where x is the slot number of each slot.

Initial FAH config, which gives the error message (the slots do not start):

Code:

21:46:16:<config>
21:46:16:  <!-- Folding Slots -->
21:46:16:  <slot id='1' type='GPU'/>
21:46:16:  <slot id='2' type='GPU'/>
21:46:16:  <slot id='3' type='GPU'/>
21:46:16:  <slot id='4' type='GPU'/>
21:46:16:  <slot id='5' type='GPU'/>
21:46:16:  <slot id='6' type='GPU'/>
21:46:16:</config>

When I manually add opencl-index as suggested, GPU folding starts without a problem.

Code:

  <slot id='1' type='GPU'>
    <gpu-index v='1'/>
    <opencl-index v='1'/>
  </slot>
...

Expected behavior: if needed, FAHClient adds the gpu-index and opencl-index automatically.

Do others also have this issue? Is it a bug in FAHClient 7.5.1 on Linux?

Code:

root@C.43272:/etc/init.d$ FAHClient
21:56:02:INFO(1):Read GPUs.txt
21:56:02:************************* Folding@home Client *************************
21:56:02:      Website: https://foldingathome.org/
21:56:02:    Copyright: (c) 2009-2018 foldingathome.org
21:56:02:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
21:56:02:         Args:
21:56:02:       Config: /etc/init.d/config.xml
21:56:02:******************************** Build ********************************
21:56:02:      Version: 7.5.1
21:56:02:         Date: May 11 2018
21:56:02:         Time: 19:59:04
21:56:02:   Repository: Git
21:56:02:     Revision: 4705bf53c635f88b8fe85af7675557e15d491ff0
21:56:02:       Branch: master
21:56:02:     Compiler: GNU 6.3.0 20170516
21:56:02:      Options: -std=gnu++98 -O3 -funroll-loops
21:56:02:     Platform: linux2 4.14.0-3-amd64
21:56:02:         Bits: 64
21:56:02:         Mode: Release
21:56:02:******************************* System ********************************
21:56:02:          CPU: Intel(R) Core(TM) i3-4130 CPU @ 3.40GHz
21:56:02:       CPU ID: GenuineIntel Family 6 Model 60 Stepping 3
21:56:02:         CPUs: 4
21:56:02:       Memory: 7.74GiB
21:56:02:  Free Memory: 4.10GiB
21:56:02:      Threads: POSIX_THREADS
21:56:02:   OS Version: 4.15
21:56:02:  Has Battery: false
21:56:02:   On Battery: false
21:56:02:   UTC Offset: 0
21:56:02:          PID: 294
21:56:02:          CWD: /etc/init.d
21:56:02:           OS: Linux 4.15.0-29-generic x86_64
21:56:02:      OS Arch: AMD64
21:56:02:         GPUs: 6
21:56:02:        GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:7 GP102 [GeForce GTX 1080 Ti] 11380
21:56:02:        GPU 1: Bus:2 Slot:0 Func:0 NVIDIA:7 GP102 [GeForce GTX 1080 Ti] 11380
21:56:02:        GPU 2: Bus:3 Slot:0 Func:0 NVIDIA:7 GP102 [GeForce GTX 1080 Ti] 11380
21:56:02:        GPU 3: Bus:4 Slot:0 Func:0 NVIDIA:7 GP102 [GeForce GTX 1080 Ti] 11380
21:56:02:        GPU 4: Bus:6 Slot:0 Func:0 NVIDIA:7 GP102 [GeForce GTX 1080 Ti] 11380
21:56:02:        GPU 5: Bus:7 Slot:0 Func:0 NVIDIA:7 GP102 [GeForce GTX 1080 Ti] 11380
21:56:02:CUDA Device 0: Platform:0 Device:0 Bus:2 Slot:0 Compute:6.1 Driver:9.0
21:56:02:CUDA Device 1: Platform:0 Device:1 Bus:3 Slot:0 Compute:6.1 Driver:9.0
21:56:02:       OpenCL: Not detected: Failed to open dynamic library 'libOpenCL.so':
21:56:02:               libOpenCL.so: cannot open shared object file: No such file or
21:56:02:               directory
21:56:02:***********************************************************************
21:56:16:<config>
21:56:16:  <!-- Folding Slots -->
21:56:16:  <slot id='1' type='GPU'/>
21:56:16:  <slot id='2' type='GPU'/>
21:56:16:  <slot id='3' type='GPU'/>
21:56:16:  <slot id='4' type='GPU'/>
21:56:16:  <slot id='5' type='GPU'/>
21:56:16:  <slot id='6' type='GPU'/>
21:56:16:</config>
21:56:33:Trying to access database...
21:56:33:Successfully acquired database lock
21:56:33:Enabled folding slot 00: READY cpu:1
21:56:33:Enabled folding slot 01: READY gpu:0:GP102 [GeForce GTX 1080 Ti] 11380
21:56:33:Enabled folding slot 02: READY gpu:1:GP102 [GeForce GTX 1080 Ti] 11380
21:56:33:Enabled folding slot 03: READY gpu:2:GP102 [GeForce GTX 1080 Ti] 11380
21:56:33:Enabled folding slot 04: READY gpu:3:GP102 [GeForce GTX 1080 Ti] 11380
21:56:33:Enabled folding slot 05: READY gpu:4:GP102 [GeForce GTX 1080 Ti] 11380
21:56:33:Enabled folding slot 06: READY gpu:5:GP102 [GeForce GTX 1080 Ti] 11380
[... get work units ...]
21:46:16:WU03:FS03:Starting
21:46:16:ERROR:WU03:FS03:Failed to start core: OpenCL device matching slot 3 not found, try setting 'opencl-index' manually
21:46:16:WU02:FS02:Starting
21:46:16:ERROR:WU02:FS02:Failed to start core: OpenCL device matching slot 2 not found, try setting 'opencl-index' manually
21:46:16:WU04:FS04:Starting
21:46:16:ERROR:WU04:FS04:Failed to start core: OpenCL device matching slot 4 not found, try setting 'opencl-index' manually
21:46:16:WU05:FS05:Starting
21:46:16:ERROR:WU05:FS05:Failed to start core: OpenCL device matching slot 5 not found, try setting 'opencl-index' manually
21:46:16:WU06:FS06:Starting
21:46:16:ERROR:WU06:FS06:Failed to start core: OpenCL device matching slot 6 not found, try setting 'opencl-index' manually
21:46:16:WU01:FS01:Starting
21:46:16:ERROR:WU01:FS01:Failed to start core: OpenCL device matching slot 1 not found, try setting 'opencl-index' manually
21:46:16:WU03:FS03:Starting
21:46:16:ERROR:WU03:FS03:Failed to start core: OpenCL device matching slot 3 not found, try setting 'opencl-index' manually
21:46:16:WU02:FS02:Starting
21:46:16:ERROR:WU02:FS02:Failed to start core: OpenCL device matching slot 2 not found, try setting 'opencl-index' manually
21:46:16:WU04:FS04:Starting
21:46:16:ERROR:WU04:FS04:Failed to start core: OpenCL device matching slot 4 not found, try setting 'opencl-index' manually

Re: Failed to start core: OpenCL device matching slot 3 not

Posted: Tue Dec 11, 2018 2:01 pm
by drf99
foldy,

I get the same results, cycling through 5 different versions of the Nvidia drivers. The results are the same in Linux Mint 18.3 and Mint 19.

Re: Failed to start core: OpenCL device matching slot 3 not

Posted: Tue Dec 11, 2018 3:16 pm
by foldy
So it is not related to the NVIDIA driver version or the Linux version.

There is a workaround: in the FAHClient config file, add the gpu-index and opencl-index manually for each slot:

<slot id='1' type='GPU'>
<gpu-index v='1'/>
<opencl-index v='1'/>
</slot>
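
For a box with several GPUs, typing these entries by hand gets tedious. Here is a minimal sketch that prints the slot entries for a given mapping. The function names are hypothetical, and the mapping itself (slot n uses index n for both gpu-index and opencl-index) is only an assumption copied from the single-slot example above, so adjust it to the GPU numbering your client log reports.

```python
def slot_xml(slot_id, gpu_index, opencl_index):
    """Render one GPU slot with explicit gpu-index and opencl-index."""
    return (
        f"  <slot id='{slot_id}' type='GPU'>\n"
        f"    <gpu-index v='{gpu_index}'/>\n"
        f"    <opencl-index v='{opencl_index}'/>\n"
        f"  </slot>"
    )


def build_config(mapping):
    """Build a <config> fragment from {slot_id: (gpu_index, opencl_index)}."""
    lines = ["<config>"]
    for slot_id, (gpu_index, opencl_index) in sorted(mapping.items()):
        lines.append(slot_xml(slot_id, gpu_index, opencl_index))
    lines.append("</config>")
    return "\n".join(lines)


if __name__ == "__main__":
    # ASSUMPTION: slot n -> index n for both values, as in the example above.
    # Check the indexes against your own client log before using the output.
    mapping = {n: (n, n) for n in range(1, 7)}
    print(build_config(mapping))
```

The output is meant to be pasted into the config file by hand; the sketch does not touch the client's files.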

But if this is reproducible by different users on different Linux distributions with different NVIDIA drivers, then I guess this is a FAHClient 7.5.1 bug?

Re: Failed to start core: OpenCL device matching slot 3 not

Posted: Wed Dec 12, 2018 1:05 pm
by drf99
It appears that I managed to find my own answer in the following post: viewtopic.php?f=74&t=30993&p=303071&hil ... CL#p303071
Specifically, all I needed to do was

Code:

 sudo apt install ocl-icd-opencl-dev

With that one installation the OpenCL device was found, I was able to configure a GPU slot, and that slot was utilized overnight.
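
If you want to check before and after the install whether the library the client complains about can be opened at all, a small sketch like this may help. It just asks the dynamic loader to open the name from the error message; the helper name `can_load` is made up for illustration, and also probing `libOpenCL.so.1` is an assumption on my part, meant to show whether only the unversioned name is missing.

```python
import ctypes


def can_load(library_name="libOpenCL.so"):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the file is missing or cannot be loaded.
        return False
    return True


if __name__ == "__main__":
    # 'libOpenCL.so' is the name from the client's error message;
    # 'libOpenCL.so.1' is probed only for comparison (assumption).
    for name in ("libOpenCL.so", "libOpenCL.so.1"):
        status = "loadable" if can_load(name) else "NOT loadable"
        print(f"{name}: {status}")
```

This only tells you whether the loader finds the library, not whether FAHClient will detect the GPUs, so restart the client and read its startup log as well.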

Hopefully you get similar results.

Re: Failed to start core: OpenCL device matching slot 3 not

Posted: Sat Dec 15, 2018 8:29 pm
by foldy
Thank you, that works. Funny that without ocl-icd-opencl-dev it can still fold on the GPUs, but needs the index flags set manually.

Re: Failed to start core: OpenCL device matching slot 3 not

Posted: Sun Dec 16, 2018 3:02 pm
by toTOW
Yes, something was definitely broken with your OpenCL installation, judging by this message at client startup:
21:56:02: OpenCL: Not detected: Failed to open dynamic library 'libOpenCL.so':
21:56:02: libOpenCL.so: cannot open shared object file: No such file or
21:56:02: directory
It's not the first time I have seen this issue fixed by "sudo apt install ocl-icd-opencl-dev". Something probably changed in recent distributions in how the OS links to libraries.
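
For anyone who wants to check their own log for the same pattern, here is a rough sketch that looks for the two messages quoted in this thread. The regular expressions are written against the log lines posted above and may not match other client versions; the names `diagnose` and `SAMPLE` are just for illustration.

```python
import re

# Patterns are based on the log lines quoted in this thread (assumption:
# other client versions may word these messages differently).
OPENCL_MISSING = re.compile(r"OpenCL: Not detected")
SLOT_ERROR = re.compile(
    r"ERROR:WU\d+:FS(\d+):Failed to start core: OpenCL device matching slot"
)


def diagnose(log_text):
    """Return (opencl_missing, failed_slots) for a FAHClient log excerpt."""
    opencl_missing = bool(OPENCL_MISSING.search(log_text))
    failed_slots = sorted({int(m.group(1)) for m in SLOT_ERROR.finditer(log_text)})
    return opencl_missing, failed_slots


# Excerpt taken from the log in the first post.
SAMPLE = """\
21:56:02:       OpenCL: Not detected: Failed to open dynamic library 'libOpenCL.so':
21:46:16:ERROR:WU03:FS03:Failed to start core: OpenCL device matching slot 3 not found, try setting 'opencl-index' manually
21:46:16:ERROR:WU02:FS02:Failed to start core: OpenCL device matching slot 2 not found, try setting 'opencl-index' manually
"""

if __name__ == "__main__":
    missing, slots = diagnose(SAMPLE)
    print("OpenCL reported as not detected:", missing)
    print("Slots that failed to start:", slots)
```

If the first value is true, the missing library is the more likely cause and the per-slot errors are probably just a consequence of it.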

Re: Failed to start core: OpenCL device matching slot 3 not

Posted: Tue Dec 18, 2018 3:11 am
by bruce
If a suitable link to OpenCL is not found at install time, the client decides there are zero supported GPUs, and that is stored in the install files. There have to be GPUs for it to set up suitable values for opencl-index.

I wonder if the default semi-permanent setting GPUs-"false" should be removed. Re-trying to search for supported GPUs wouldn't add a lot of overhead if it's only done at client-restart time.