OpenCL: Not detected: clGetDeviceIDs() returned -1

It seems that a lot of GPU problems revolve around specific versions of drivers. Though NVidia has their own support structure, you can often learn from information reported by others who fold.

Post by MarcvdM » Mon Apr 06, 2020 9:30 am

Hello,

I run FAHClient on a Lenovo P53 with an NVIDIA Quadro T1000 Mobile GPU, on Fedora 31. NVIDIA driver 440.64 is installed (via DKMS).
The FAH client installer did not come with /var/lib/fahclient/GPUs.txt, so I installed that manually. The T1000 is in the list and is reported in log.txt:

Code:
08:06:02:          CWD: /var/lib/fahclient
08:06:02:           OS: Linux 5.5.13-200.fc31.x86_64 x86_64
08:06:02:      OS Arch: AMD64
08:06:02:         GPUs: 1
08:06:02:        GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:7 TU117GLM [Quadro T1000 Mobile]
08:06:02:CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:7.5 Driver:10.2
08:06:02:       OpenCL: Not detected: clGetDeviceIDs() returned -1
08:06:02:***********************************************************************


I do not understand why clGetDeviceIDs() returns -1 here, when clinfo has no problem with the same call:

Code:
NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  NVIDIA CUDA
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [NV]
  clCreateContext(NULL, ...) [default]            Success [NV]


In this forum I found a post saying that opencl-utils-dev needs to be installed, so I installed that (opencl-utils-devel on Fedora). Installed packages matching opencl:
opencl-filesystem-1.0-10.fc31.noarch
opencl-utils-1-10.svn16.fc31.x86_64
opencl-utils-devel-1-10.svn16.fc31.x86_64
opencl-headers-2.2-5.20190205git49f07d3.fc31.noarch

Any hint on how to enable the GPU for folding would be greatly appreciated.

Thanks,
Marc

Post by MarcvdM » Mon Apr 06, 2020 12:03 pm

OK, one step closer now. I decided to crudely hack /etc/OpenCL/vendors and leave nvidia.icd as the only valid .icd file. Lo and behold, FAHClient finally recognizes the GPU:

Code:
10:58:27:            CWD: /var/lib/fahclient
10:58:27:             OS: Linux 5.5.13-200.fc31.x86_64 x86_64
10:58:27:        OS Arch: AMD64
10:58:27:           GPUs: 1
10:58:27:          GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:7 TU117GLM [Quadro T1000 Mobile]
10:58:27:  CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:7.5 Driver:10.2
10:58:27:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:440.64
10:58:27:***********************************************************************
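
For anyone wanting to repeat the hack, the renaming can be sketched as a small shell function. This is only a sketch: `keep_only_icd` is a hypothetical helper name, and the `.disabled` suffix is an arbitrary choice that assumes the ICD loader only picks up files ending in .icd. It renames rather than deletes, so the change is easy to undo.

```shell
# Hypothetical helper: keep one ICD file active, park all the others.
# Usage: keep_only_icd <vendors-dir> <icd-file-to-keep>
keep_only_icd() {
    dir=$1
    keep=$2
    for f in "$dir"/*.icd; do
        # Skip the literal pattern if the glob matched nothing.
        if [ ! -e "$f" ]; then
            continue
        fi
        # Leave the chosen vendor untouched.
        if [ "$(basename "$f")" = "$keep" ]; then
            continue
        fi
        # Rename instead of delete, so it can be reverted later.
        mv -- "$f" "$f.disabled"
    done
}
```

On a real system this would be run as root, for example `keep_only_icd /etc/OpenCL/vendors nvidia.icd`, followed by a restart of FAHClient so it detects the devices again.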


The only remaining problem is that config.xml is not generated correctly; the GPU is disabled and only a CPU slot is defined:

Code:
10:58:27:  <!-- Folding Slot Configuration -->
10:58:27:  <gpu v='false'/>
10:58:27:
10:58:27:  <!-- Slot Control -->
10:58:27:  <power v='medium'/>
10:58:27:
10:58:27:
10:58:27:  <!-- Folding Slots -->
10:58:27:  <slot id='0' type='CPU'>
10:58:27:    <paused v='true'/>
10:58:27:  </slot>

Post by MarcvdM » Mon Apr 06, 2020 12:22 pm

Success: editing config.xml in /etc did the trick.

Code:
[I] marc@marchost /v/l/fahclient> nvidia-smi
Mon Apr  6 13:18:49 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.64       Driver Version: 440.64       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro T1000        Off  | 00000000:01:00.0  On |                  N/A |
| N/A   60C    P0    34W /  N/A |    420MiB /  3903MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1586      G   /usr/libexec/Xorg                            129MiB |
|    0      3268      G   ...uest-channel-token=10575655998430285561    44MiB |
|    0      3735      C   ...org/v7/lin/64bit/Core_22.fah/FahCore_22   233MiB |
+-----------------------------------------------------------------------------+
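
The exact edit is not shown here. Judging by the generated config, it presumably comes down to two things: `<gpu v='true'/>` instead of `<gpu v='false'/>`, and a GPU slot such as `<slot id='1' type='GPU'/>` next to the CPU slot. Treat both lines, and the slot id in particular, as assumptions. A rough sketch of a check for those two settings follows; `has_gpu_slot` is a hypothetical helper name, and the patterns assume the single-quoted attribute style seen in the log.

```shell
# Hypothetical helper: succeed only if the config enables the GPU
# and defines at least one slot of type GPU.
# Usage: has_gpu_slot <path-to-config.xml>
has_gpu_slot() {
    cfg=$1
    # GPU support switched on, e.g. <gpu v='true'/>
    grep -q "<gpu v='true'" "$cfg" || return 1
    # At least one GPU slot, e.g. <slot id='1' type='GPU'/>
    grep -q "<slot [^>]*type='GPU'" "$cfg" || return 1
    return 0
}
```

Pointed at the config file under /etc, a non-zero exit status means one of the two settings is still missing.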


It is no match for a medium power config running on 4 of the 6 Intel cores though, and screen redraw is really slow now ;-)
Update: no, that was wrong. The single GPU (automatically assisted by one CPU core) yields about 10x as many "points per day" as a CPU-only run.

Post by kostuek » Mon Apr 06, 2020 1:23 pm

So, just out of interest, you had multiple vendors active and FAH was confused, is that about correct?

Post by MarcvdM » Mon Apr 06, 2020 6:06 pm

Yes, that is correct: mesa.icd, pocl.icd and nvidia.icd were all present. I renamed the first two.

clinfo works because it goes through the whole list of vendors. I searched for a supported way to set a preferred or main OpenCL vendor, but could not find one.
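
To see which vendors are registered, it can help to print each ICD file together with the library it names. A minimal sketch, assuming the usual layout in which each .icd file in the vendors directory holds the name or path of one vendor library on its first line; `list_icds` is a hypothetical helper name. For reference, -1 is CL_DEVICE_NOT_FOUND in the OpenCL headers. That fits the idea that the client queried a platform without a usable GPU instead of moving on to the next one, though that reading is a guess and not confirmed from the FAHClient source.

```shell
# Hypothetical helper: print "<icd file>: <library it points to>"
# for every registered OpenCL vendor.
# Usage: list_icds [vendors-dir]   (defaults to /etc/OpenCL/vendors)
list_icds() {
    dir=${1:-/etc/OpenCL/vendors}
    for f in "$dir"/*.icd; do
        # Skip the literal pattern if the glob matched nothing.
        if [ ! -e "$f" ]; then
            continue
        fi
        # The first line of an ICD file names the vendor library.
        printf '%s: %s\n' "$(basename "$f")" "$(head -n 1 "$f")"
    done
}
```

Running `list_icds` before and after the rename shows whether only the NVIDIA entry is left.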