BAD_WORK_UNIT on fresh Ubuntu 18.04 & RX570

Post by tchiers

I'm trying to get a new machine up and folding. The CPU slot is working fine, but I can't get the GPU slot going. This is a fresh Ubuntu 18.04 install with an RX570 and the AMD 18.30 drivers.

OpenCL seems to be OK. clinfo reports:

Number of platforms                               1
  Platform Name                                   AMD Accelerated Parallel Processing
  Platform Vendor                                 Advanced Micro Devices, Inc.
  Platform Version                                OpenCL 2.1 AMD-APP (2671.3)
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd cl_amd_event_callback cl_amd_offline_devices 
  Platform Host timer resolution                  1ns
  Platform Extensions function suffix             AMD

  Platform Name                                   AMD Accelerated Parallel Processing
Number of devices                                 1
  Device Name                                     Ellesmere
  Device Vendor                                   Advanced Micro Devices, Inc.
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 1.2 AMD-APP (2671.3)
  Driver Version                                  2671.3
  Device OpenCL C Version                         OpenCL C 1.2 
  Device Type                                     GPU
  Device Board Name (AMD)                         Radeon RX 570 Series
  Device Topology (AMD)                           PCI-E, 23:00.0
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               32
  SIMD per compute unit (AMD)                     4
  SIMD width (AMD)                                16
  SIMD instruction width (AMD)                    1
  Max clock frequency                             1250MHz
  Graphics IP (AMD)                               8.0
  Device Partition                                (core)
    Max number of sub-devices                     32
    Supported partition types                     None
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x1024
  Max work group size                             256
  Preferred work group size (AMD)                 256
  Max work group size (AMD)                       1024
  Preferred work group size multiple              64
  Wavefront width (AMD)                           64
  Preferred / native vector sizes                 
    char                                                 4 / 4       
    short                                                2 / 2       
    int                                                  1 / 1       
    long                                                 1 / 1       
    half                                                 1 / 1        (cl_khr_fp16)
    float                                                1 / 1       
    double                                               1 / 1        (cl_khr_fp64)
  Half-precision Floating-point support           (cl_khr_fp16)
    Denormals                                     No
    Infinity and NANs                             No
    Round to nearest                              No
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  Yes
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              3304976384 (3.078GiB)
  Global free memory (AMD)                        3205948 (3.057GiB)
  Global memory channels (AMD)                    8
  Global memory banks per channel (AMD)           16
  Global memory bank width (AMD)                  256 bytes
  Error Correction support                        No
  Max memory allocation                           2666968268 (2.484GiB)
  Unified memory for Host and Device              No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       2048 bits (256 bytes)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        16384 (16KiB)
  Global Memory cache line size                   64 bytes
  Image support                                   Yes
    Max number of samplers per kernel             16
    Max size for 1D images from buffer            134217728 pixels
    Max 1D or 2D image array size                 2048 images
    Base address alignment for 2D image buffers   256 bytes
    Pitch alignment for 2D image buffers          256 pixels
    Max 2D image size                             16384x16384 pixels
    Max 3D image size                             2048x2048x2048 pixels
    Max number of read image args                 128
    Max number of write image args                8
  Local memory type                               Local
  Local memory size                               32768 (32KiB)
  Local memory syze per CU (AMD)                  65536 (64KiB)
  Local memory banks (AMD)                        32
  Max number of constant args                     8
  Max constant buffer size                        2666968268 (2.484GiB)
  Preferred constant buffer size (AMD)            16384 (16KiB)
  Max size of kernel argument                     1024
  Queue properties                                
    Out-of-order execution                        No
    Profiling                                     Yes
  Prefer user sync for interop                    Yes
  Profiling timer resolution                      1ns
  Profiling timer offset since Epoch (AMD)        1540266188953430818ns (Mon Oct 22 22:43:08 2018)
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
    Thread trace supported (AMD)                  Yes
    Number of async queues (AMD)                  2
    Max real-time compute queues (AMD)            0
    Max real-time compute units (AMD)             988541568
    SPIR versions                                 1.2
  printf() buffer size                            4194304 (4MiB)
  Built-in kernels                                
  Device Extensions                               cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event 

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   No platform
  clCreateContext(NULL, ...) [default]            No platform
  clCreateContext(NULL, ...) [other]              Success [AMD]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
    Platform Name                                 AMD Accelerated Parallel Processing
    Device Name                                   Ellesmere
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 AMD Accelerated Parallel Processing
    Device Name                                   Ellesmere
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 AMD Accelerated Parallel Processing
    Device Name                                   Ellesmere
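
Only a few fields in that dump matter for this problem. Here is a minimal Python sketch that pulls them out of the clinfo text. The function name summarize_clinfo is made up for illustration, and it assumes the label/value layout shown above (label and value separated by two or more spaces).

```python
import re

def summarize_clinfo(text):
    """Return (platform count, list of devices) parsed from clinfo text output."""
    platforms = None
    devices = []
    for line in text.splitlines():
        stripped = line.strip()
        # The trailing "NULL platform behavior" section repeats device names,
        # so stop before it to avoid counting the same device twice.
        if stripped == "NULL platform behavior":
            break
        # Label and value are separated by a run of two or more spaces.
        parts = re.split(r"\s{2,}", stripped, maxsplit=1)
        if len(parts) != 2:
            continue
        key, value = parts
        if key == "Number of platforms":
            platforms = int(value)
        elif key == "Device Name":
            devices.append({"name": value})
        elif key == "Device Version" and devices:
            devices[-1]["version"] = value
    return platforms, devices

# A few lines copied from the output above, used as sample input.
SAMPLE = """Number of platforms                               1
  Platform Name                                   AMD Accelerated Parallel Processing
Number of devices                                 1
  Device Name                                     Ellesmere
  Device Version                                  OpenCL 1.2 AMD-APP (2671.3)

NULL platform behavior
    Device Name                                   Ellesmere
"""

if __name__ == "__main__":
    print(summarize_clinfo(SAMPLE))
```

The daemon runs as the fahclient user (see "Switching to user fahclient" in the log), so it may be worth running clinfo once as my login user and once as fahclient and comparing the two summaries. That is a guess at where to look, not a confirmed cause.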

fahclient, however, reports at startup that OpenCL is not detected:

03:43:25:      GPU 0: Bus:35 Slot:0 Func:0 AMD:5 Ellesmere XT [Radeon RX 470/480/570/580]
03:43:25:       CUDA: Not detected: Failed to open dynamic library 'libcuda.so':
03:43:25:             libcuda.so: cannot open shared object file: No such file or
03:43:25:             directory
03:43:25:     OpenCL: Not detected: clGetPlatformIDs() returned -1001
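
For reference, -1001 is CL_PLATFORM_NOT_FOUND_KHR from the cl_khr_icd extension: the ICD loader found no usable platform. Below is a small Python sketch that decodes the code and lists the ICD registrations. It assumes the standard Linux ICD directory /etc/OpenCL/vendors. The helper names describe_cl_error and list_icd_files are hypothetical, not part of any FAH tool.

```python
import glob
import os

# A handful of OpenCL status codes; -1001 is defined by the cl_khr_icd extension.
CL_ERRORS = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -32: "CL_INVALID_PLATFORM",
    -1001: "CL_PLATFORM_NOT_FOUND_KHR",
}

def describe_cl_error(code):
    """Map a numeric OpenCL status code to its symbolic name."""
    return CL_ERRORS.get(code, "unknown OpenCL error %d" % code)

def list_icd_files(vendors_dir="/etc/OpenCL/vendors"):
    """Return (path, library, readable) for every *.icd file in vendors_dir."""
    entries = []
    for path in sorted(glob.glob(os.path.join(vendors_dir, "*.icd"))):
        readable = os.access(path, os.R_OK)
        library = ""
        if readable:
            # Each .icd file holds the name or path of a vendor OpenCL library.
            with open(path) as handle:
                library = handle.read().strip()
        entries.append((path, library, readable))
    return entries

if __name__ == "__main__":
    print(describe_cl_error(-1001))
    for path, library, readable in list_icd_files():
        print(path, "->", library if readable else "(not readable by this user)")
```

If this lists the AMD ICD under my own account but shows nothing or an unreadable file when run as the fahclient user, that would point at a permissions or environment difference rather than at the driver itself. I have not confirmed that this is the cause.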

Nonetheless, if I create a GPU slot, it finds the hardware but complains that it can't determine opencl-index. Setting that manually to 0 seems to make it happy, but every Core_21 attempt ends with "FahCore returned: BAD_WORK_UNIT (114 = 0x72)":

*********************** Log Started 2018-10-23T03:43:25Z ***********************
03:43:25:************************* Folding@home Client *************************
03:43:25:    Website: https://foldingathome.org/
03:43:25:  Copyright: (c) 2009-2018 foldingathome.org
03:43:25:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
03:43:25:       Args: --child --lifeline 1518 /etc/fahclient/config.xml --run-as
03:43:25:             fahclient --pid-file=/var/run/fahclient.pid --daemon
03:43:25:     Config: /etc/fahclient/config.xml
03:43:25:******************************** Build ********************************
03:43:25:    Version: 7.5.1
03:43:25:       Date: May 11 2018
03:43:25:       Time: 19:59:04
03:43:25: Repository: Git
03:43:25:   Revision: 4705bf53c635f88b8fe85af7675557e15d491ff0
03:43:25:     Branch: master
03:43:25:   Compiler: GNU 6.3.0 20170516
03:43:25:    Options: -std=gnu++98 -O3 -funroll-loops
03:43:25:   Platform: linux2 4.14.0-3-amd64
03:43:25:       Bits: 64
03:43:25:       Mode: Release
03:43:25:******************************* System ********************************
03:43:25:        CPU: AMD Ryzen 5 2600X Six-Core Processor
03:43:25:     CPU ID: AuthenticAMD Family 23 Model 8 Stepping 2
03:43:25:       CPUs: 12
03:43:25:     Memory: 31.41GiB
03:43:25:Free Memory: 30.37GiB
03:43:25:    Threads: POSIX_THREADS
03:43:25: OS Version: 4.15
03:43:25:Has Battery: false
03:43:25: On Battery: false
03:43:25: UTC Offset: -5
03:43:25:        PID: 1520
03:43:25:        CWD: /var/lib/fahclient
03:43:25:         OS: Linux 4.15.0-36-generic x86_64
03:43:25:    OS Arch: AMD64
03:43:25:       GPUs: 1
03:43:25:      GPU 0: Bus:35 Slot:0 Func:0 AMD:5 Ellesmere XT [Radeon RX 470/480/570/580]
03:43:25:       CUDA: Not detected: Failed to open dynamic library 'libcuda.so':
03:43:25:             libcuda.so: cannot open shared object file: No such file or
03:43:25:             directory
03:43:25:     OpenCL: Not detected: clGetPlatformIDs() returned -1001
03:43:25:***********************************************************************
03:43:25:<config>
03:43:25:  <!-- Client Control -->
03:43:25:  <fold-anon v='true'/>
03:43:25:
03:43:25:  <!-- Network -->
03:43:25:  <proxy v=':8080'/>
03:43:25:
03:43:25:  <!-- User Information -->
03:43:25:  <passkey v='********************************'/>
03:43:25:  <team v='150'/>
03:43:25:  <user v='tchiers'/>
03:43:25:
03:43:25:  <!-- Folding Slots -->
03:43:25:  <slot id='0' type='CPU'/>
03:43:25:</config>
03:43:25:Switching to user fahclient
03:43:25:Trying to access database...
03:43:25:Successfully acquired database lock
03:43:25:Enabled folding slot 00: READY cpu:11
03:43:25:WU00:FS00:Starting
03:43:25:WARNING:WU00:FS00:AS lowered CPUs from 11 to 10
03:43:25:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/Linux/AMD64/AVX/Core_a7.fah/FahCore_a7 -dir 00 -suffix 01 -version 705 -lifeline 1520 -checkpoint 15 -np 10
03:43:25:WU00:FS00:Started FahCore on PID 1531
03:43:25:WU00:FS00:Core PID:1535
03:43:25:WU00:FS00:FahCore 0xa7 started
03:43:26:WU00:FS00:0xa7:*********************** Log Started 2018-10-23T03:43:25Z ***********************
03:43:26:WU00:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
03:43:26:WU00:FS00:0xa7:       Type: 0xa7
03:43:26:WU00:FS00:0xa7:       Core: Gromacs
03:43:26:WU00:FS00:0xa7:    Website: https://foldingathome.org/
03:43:26:WU00:FS00:0xa7:  Copyright: (c) 2009-2018 foldingathome.org
03:43:26:WU00:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
03:43:26:WU00:FS00:0xa7:       Args: -dir 00 -suffix 01 -version 705 -lifeline 1531 -checkpoint 15 -np
03:43:26:WU00:FS00:0xa7:             10
03:43:26:WU00:FS00:0xa7:     Config: <none>
03:43:26:WU00:FS00:0xa7:************************************ Build *************************************
03:43:26:WU00:FS00:0xa7:    Version: 0.0.17
03:43:26:WU00:FS00:0xa7:       Date: Apr 27 2018
03:43:26:WU00:FS00:0xa7:       Time: 19:09:21
03:43:26:WU00:FS00:0xa7: Repository: Git
03:43:26:WU00:FS00:0xa7:   Revision: 21359963583d09ec2063ef946399441c4df4ccd7
03:43:26:WU00:FS00:0xa7:     Branch: master
03:43:26:WU00:FS00:0xa7:   Compiler: GNU 6.3.0 20170516
03:43:26:WU00:FS00:0xa7:    Options: -std=gnu++98 -O3 -funroll-loops
03:43:26:WU00:FS00:0xa7:   Platform: linux2 4.14.0-3-amd64
03:43:26:WU00:FS00:0xa7:       Bits: 64
03:43:26:WU00:FS00:0xa7:       Mode: Release
03:43:26:WU00:FS00:0xa7:       SIMD: avx_256
03:43:26:WU00:FS00:0xa7:************************************ System ************************************
03:43:26:WU00:FS00:0xa7:        CPU: AMD Ryzen 5 2600X Six-Core Processor
03:43:26:WU00:FS00:0xa7:     CPU ID: AuthenticAMD Family 23 Model 8 Stepping 2
03:43:26:WU00:FS00:0xa7:       CPUs: 12
03:43:26:WU00:FS00:0xa7:     Memory: 31.41GiB
03:43:26:WU00:FS00:0xa7:Free Memory: 30.35GiB
03:43:26:WU00:FS00:0xa7:    Threads: POSIX_THREADS
03:43:26:WU00:FS00:0xa7: OS Version: 4.15
03:43:26:WU00:FS00:0xa7:Has Battery: false
03:43:26:WU00:FS00:0xa7: On Battery: false
03:43:26:WU00:FS00:0xa7: UTC Offset: -5
03:43:26:WU00:FS00:0xa7:        PID: 1535
03:43:26:WU00:FS00:0xa7:        CWD: /var/lib/fahclient/work
03:43:26:WU00:FS00:0xa7:         OS: Linux 4.15.0-36-generic x86_64
03:43:26:WU00:FS00:0xa7:    OS Arch: AMD64
03:43:26:WU00:FS00:0xa7:********************************************************************************
03:43:26:WU00:FS00:0xa7:Project: 13789 (Run 23, Clone 28, Gen 16)
03:43:26:WU00:FS00:0xa7:Unit: 0x000000180002894c5b7ad95e67274e9f
03:43:26:WU00:FS00:0xa7:Digital signatures verified
03:43:26:WU00:FS00:0xa7:Calling: mdrun -s frame16.tpr -o frame16.trr -cpi state.cpt -cpt 15 -nt 10
03:43:26:WU00:FS00:0xa7:Steps: first=40000000 total=2500000
03:43:27:WU00:FS00:0xa7:Completed 18892 out of 2500000 steps (0%)
03:43:52:Adding folding slot 01: READY gpu:0:Ellesmere XT [Radeon RX 470/480/570/580]
03:43:52:Saving configuration to /etc/fahclient/config.xml
03:43:52:<config>
03:43:52:  <!-- Client Control -->
03:43:52:  <fold-anon v='true'/>
03:43:52:
03:43:52:  <!-- Network -->
03:43:52:  <proxy v=':8080'/>
03:43:52:
03:43:52:  <!-- User Information -->
03:43:52:  <passkey v='********************************'/>
03:43:52:  <team v='150'/>
03:43:52:  <user v='tchiers'/>
03:43:52:
03:43:52:  <!-- Folding Slots -->
03:43:52:  <slot id='0' type='CPU'/>
03:43:52:  <slot id='1' type='GPU'/>
03:43:52:</config>
03:43:52:FS00:Shutting core down
03:43:52:WU00:FS00:0xa7:Caught signal SIGINT(2) on PID 1535
03:43:52:WU00:FS00:0xa7:Exiting, please wait. . .
03:43:52:WU01:FS01:Connecting to 65.254.110.245:8080
03:43:52:WU00:FS00:0xa7:Folding@home Core Shutdown: INTERRUPTED
03:43:53:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
03:43:53:WU01:FS01:Assigned to work server 140.163.4.231
03:43:53:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:Ellesmere XT [Radeon RX 470/480/570/580] from 140.163.4.231
03:43:53:WU01:FS01:Connecting to 140.163.4.231:8080
03:43:53:WU00:FS00:Starting
03:43:53:WARNING:WU00:FS00:Changed SMP threads from 11 to 10 this can cause some work units to fail
03:43:53:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/Linux/AMD64/AVX/Core_a7.fah/FahCore_a7 -dir 00 -suffix 01 -version 705 -lifeline 1520 -checkpoint 15 -np 10
03:43:53:WU00:FS00:Started FahCore on PID 2224
03:43:53:WU00:FS00:Core PID:2228
03:43:53:WU00:FS00:FahCore 0xa7 started
03:43:53:WU00:FS00:0xa7:*********************** Log Started 2018-10-23T03:43:53Z ***********************
03:43:53:WU00:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
03:43:53:WU00:FS00:0xa7:       Type: 0xa7
03:43:53:WU00:FS00:0xa7:       Core: Gromacs
03:43:53:WU00:FS00:0xa7:    Website: https://foldingathome.org/
03:43:53:WU00:FS00:0xa7:  Copyright: (c) 2009-2018 foldingathome.org
03:43:53:WU00:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
03:43:53:WU00:FS00:0xa7:       Args: -dir 00 -suffix 01 -version 705 -lifeline 2224 -checkpoint 15 -np
03:43:53:WU00:FS00:0xa7:             10
03:43:53:WU00:FS00:0xa7:     Config: <none>
03:43:53:WU00:FS00:0xa7:************************************ Build *************************************
03:43:53:WU00:FS00:0xa7:    Version: 0.0.17
03:43:53:WU00:FS00:0xa7:       Date: Apr 27 2018
03:43:53:WU00:FS00:0xa7:       Time: 19:09:21
03:43:53:WU00:FS00:0xa7: Repository: Git
03:43:53:WU00:FS00:0xa7:   Revision: 21359963583d09ec2063ef946399441c4df4ccd7
03:43:53:WU00:FS00:0xa7:     Branch: master
03:43:53:WU00:FS00:0xa7:   Compiler: GNU 6.3.0 20170516
03:43:53:WU00:FS00:0xa7:    Options: -std=gnu++98 -O3 -funroll-loops
03:43:53:WU00:FS00:0xa7:   Platform: linux2 4.14.0-3-amd64
03:43:53:WU00:FS00:0xa7:       Bits: 64
03:43:53:WU00:FS00:0xa7:       Mode: Release
03:43:53:WU00:FS00:0xa7:       SIMD: avx_256
03:43:53:WU00:FS00:0xa7:************************************ System ************************************
03:43:53:WU00:FS00:0xa7:        CPU: AMD Ryzen 5 2600X Six-Core Processor
03:43:53:WU00:FS00:0xa7:     CPU ID: AuthenticAMD Family 23 Model 8 Stepping 2
03:43:53:WU00:FS00:0xa7:       CPUs: 12
03:43:53:WU00:FS00:0xa7:     Memory: 31.41GiB
03:43:53:WU00:FS00:0xa7:Free Memory: 29.34GiB
03:43:53:WU00:FS00:0xa7:    Threads: POSIX_THREADS
03:43:53:WU00:FS00:0xa7: OS Version: 4.15
03:43:53:WU00:FS00:0xa7:Has Battery: false
03:43:53:WU00:FS00:0xa7: On Battery: false
03:43:53:WU00:FS00:0xa7: UTC Offset: -5
03:43:53:WU00:FS00:0xa7:        PID: 2228
03:43:53:WU00:FS00:0xa7:        CWD: /var/lib/fahclient/work
03:43:53:WU00:FS00:0xa7:         OS: Linux 4.15.0-36-generic x86_64
03:43:53:WU00:FS00:0xa7:    OS Arch: AMD64
03:43:53:WU00:FS00:0xa7:********************************************************************************
03:43:53:WU00:FS00:0xa7:Project: 13789 (Run 23, Clone 28, Gen 16)
03:43:53:WU00:FS00:0xa7:Unit: 0x000000180002894c5b7ad95e67274e9f
03:43:53:WU00:FS00:0xa7:Digital signatures verified
03:43:53:WU00:FS00:0xa7:Calling: mdrun -s frame16.tpr -o frame16.trr -cpi state.cpt -cpt 15 -nt 10
03:43:53:WU00:FS00:0xa7:Steps: first=40000000 total=2500000
03:43:54:WU01:FS01:Downloading 33.30MiB
03:43:55:WU00:FS00:0xa7:Completed 21532 out of 2500000 steps (0%)
03:44:00:WU01:FS01:Download 7.51%
03:44:06:WU01:FS01:Download 35.66%
03:44:12:WU01:FS01:Download 91.40%
03:44:12:WU01:FS01:Download complete
03:44:12:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:11730 run:4 clone:105 gen:18 core:0x21 unit:0x000000148ca304e75bcbe539040ccb8f
03:44:12:WU01:FS01:Downloading core from http://cores.foldingathome.org/Linux/AMD64/ATI/R600/Core_21.fah
03:44:12:WU01:FS01:Connecting to cores.foldingathome.org:80
03:44:13:WU01:FS01:FahCore 21: Downloading 3.23MiB
03:44:13:WU01:FS01:FahCore 21: Download complete
03:44:13:WU01:FS01:Valid core signature
03:44:13:WU01:FS01:Unpacked 7.94MiB to cores/cores.foldingathome.org/Linux/AMD64/ATI/R600/Core_21.fah/FahCore_21
03:44:13:WU01:FS01:Starting
03:44:13:ERROR:WU01:FS01:Failed to start core: OpenCL device matching slot 1 not found, try setting 'opencl-index' manually
03:44:14:WU01:FS01:Starting
03:44:14:ERROR:WU01:FS01:Failed to start core: OpenCL device matching slot 1 not found, try setting 'opencl-index' manually
03:44:26:Saving configuration to /etc/fahclient/config.xml
03:44:26:<config>
03:44:26:  <!-- Client Control -->
03:44:26:  <fold-anon v='true'/>
03:44:26:
03:44:26:  <!-- Network -->
03:44:26:  <proxy v=':8080'/>
03:44:26:
03:44:26:  <!-- User Information -->
03:44:26:  <passkey v='********************************'/>
03:44:26:  <team v='150'/>
03:44:26:  <user v='tchiers'/>
03:44:26:
03:44:26:  <!-- Folding Slots -->
03:44:26:  <slot id='0' type='CPU'/>
03:44:26:  <slot id='1' type='GPU'/>
03:44:26:</config>
03:44:27:WU00:FS00:0xa7:Completed 25000 out of 2500000 steps (1%)
03:45:13:Saving configuration to /etc/fahclient/config.xml
03:45:13:<config>
03:45:13:  <!-- Client Control -->
03:45:13:  <fold-anon v='true'/>
03:45:13:
03:45:13:  <!-- Network -->
03:45:13:  <proxy v=':8080'/>
03:45:13:
03:45:13:  <!-- User Information -->
03:45:13:  <passkey v='********************************'/>
03:45:13:  <team v='150'/>
03:45:13:  <user v='tchiers'/>
03:45:13:
03:45:13:  <!-- Folding Slots -->
03:45:13:  <slot id='0' type='CPU'/>
03:45:13:  <slot id='1' type='GPU'>
03:45:13:    <opencl-index v='0'/>
03:45:13:  </slot>
03:45:13:</config>
03:45:14:WU01:FS01:Starting
03:45:14:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/Linux/AMD64/ATI/R600/Core_21.fah/FahCore_21 -dir 01 -suffix 01 -version 705 -lifeline 1520 -checkpoint 15 -gpu-vendor amd -opencl-device 0 -gpu 0
03:45:14:WU01:FS01:Started FahCore on PID 2313
03:45:14:WU01:FS01:Core PID:2317
03:45:14:WU01:FS01:FahCore 0x21 started
03:45:14:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
03:45:14:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:11730 run:4 clone:105 gen:18 core:0x21 unit:0x000000148ca304e75bcbe539040ccb8f
03:45:14:WU01:FS01:Uploading 5.50KiB to 140.163.4.231
03:45:14:WU01:FS01:Connecting to 140.163.4.231:8080
03:45:14:WU01:FS01:Upload complete
03:45:14:WU01:FS01:Server responded WORK_ACK (400)
03:45:14:WU01:FS01:Cleaning up
03:45:15:WU01:FS01:Connecting to 65.254.110.245:8080
03:45:15:WU01:FS01:Assigned to work server 140.163.4.231
03:45:15:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:Ellesmere XT [Radeon RX 470/480/570/580] from 140.163.4.231
03:45:15:WU01:FS01:Connecting to 140.163.4.231:8080
03:45:16:WU01:FS01:Downloading 33.32MiB
03:45:22:WU01:FS01:Download 38.27%
03:45:27:Saving configuration to /etc/fahclient/config.xml
03:45:27:<config>
03:45:27:  <!-- Client Control -->
03:45:27:  <fold-anon v='true'/>
03:45:27:
03:45:27:  <!-- Network -->
03:45:27:  <proxy v=':8080'/>
03:45:27:
03:45:27:  <!-- User Information -->
03:45:27:  <passkey v='********************************'/>
03:45:27:  <team v='150'/>
03:45:27:  <user v='tchiers'/>
03:45:27:
03:45:27:  <!-- Folding Slots -->
03:45:27:  <slot id='0' type='CPU'/>
03:45:27:  <slot id='1' type='GPU'>
03:45:27:    <opencl-index v='0'/>
03:45:27:  </slot>
03:45:27:</config>
03:45:28:WU01:FS01:Download 81.78%
03:45:29:WU01:FS01:Download complete
03:45:30:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:11730 run:5 clone:354 gen:10 core:0x21 unit:0x0000000b8ca304e75bcd1400b9f7b56e
03:45:30:WU01:FS01:Starting
03:45:30:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/Linux/AMD64/ATI/R600/Core_21.fah/FahCore_21 -dir 01 -suffix 01 -version 705 -lifeline 1520 -checkpoint 15 -gpu-vendor amd -opencl-device 0 -gpu 0
03:45:30:WU01:FS01:Started FahCore on PID 2328
03:45:30:WU01:FS01:Core PID:2332
03:45:30:WU01:FS01:FahCore 0x21 started
03:45:30:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
03:45:30:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:11730 run:5 clone:354 gen:10 core:0x21 unit:0x0000000b8ca304e75bcd1400b9f7b56e
03:45:30:WU01:FS01:Uploading 5.50KiB to 140.163.4.231
03:45:30:WU01:FS01:Connecting to 140.163.4.231:8080
03:45:30:WU02:FS01:Connecting to 65.254.110.245:8080
03:45:31:WU01:FS01:Upload complete
03:45:31:WU01:FS01:Server responded WORK_ACK (400)
03:45:31:WU01:FS01:Cleaning up
03:45:31:WU02:FS01:Assigned to work server 140.163.4.231
03:45:31:WU02:FS01:Requesting new work unit for slot 01: READY gpu:0:Ellesmere XT [Radeon RX 470/480/570/580] from 140.163.4.231
03:45:31:WU02:FS01:Connecting to 140.163.4.231:8080
03:45:31:WU02:FS01:Downloading 33.31MiB
03:45:37:WU02:FS01:Download 41.10%
03:45:43:WU02:FS01:Download 92.89%
03:45:44:WU02:FS01:Download complete
03:45:44:WU02:FS01:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:11730 run:2 clone:314 gen:6 core:0x21 unit:0x000000088ca304e75bcd133e2b4f0830
03:45:44:WU02:FS01:Starting
03:45:44:WU02:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/Linux/AMD64/ATI/R600/Core_21.fah/FahCore_21 -dir 02 -suffix 01 -version 705 -lifeline 1520 -checkpoint 15 -gpu-vendor amd -opencl-device 0 -gpu 0
03:45:44:WU02:FS01:Started FahCore on PID 2336
03:45:44:WU02:FS01:Core PID:2340
03:45:44:WU02:FS01:FahCore 0x21 started
03:45:44:WARNING:WU02:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
03:45:44:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:11730 run:2 clone:314 gen:6 core:0x21 unit:0x000000088ca304e75bcd133e2b4f0830
03:45:44:WU02:FS01:Uploading 5.50KiB to 140.163.4.231
03:45:44:WU02:FS01:Connecting to 140.163.4.231:8080
03:45:45:WU02:FS01:Upload complete
03:45:45:WU02:FS01:Server responded WORK_ACK (400)
03:45:45:WU02:FS01:Cleaning up
03:45:45:WU01:FS01:Connecting to 65.254.110.245:8080
03:45:45:WU01:FS01:Assigned to work server 140.163.4.231
03:45:45:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:Ellesmere XT [Radeon RX 470/480/570/580] from 140.163.4.231
03:45:45:WU01:FS01:Connecting to 140.163.4.231:8080
03:45:46:WU01:FS01:Downloading 33.30MiB
03:45:52:WU01:FS01:Download 9.20%
03:45:58:WU01:FS01:Download 45.98%
03:46:04:WU01:FS01:Download 92.90%
03:46:04:WU01:FS01:Download complete
03:46:04:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:11730 run:4 clone:646 gen:10 core:0x21 unit:0x0000000f8ca304e75bcd373c1b684dd8
03:46:04:WU01:FS01:Starting
03:46:04:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/Linux/AMD64/ATI/R600/Core_21.fah/FahCore_21 -dir 01 -suffix 01 -version 705 -lifeline 1520 -checkpoint 15 -gpu-vendor amd -opencl-device 0 -gpu 0
03:46:04:WU01:FS01:Started FahCore on PID 2749
03:46:04:WU01:FS01:Core PID:2753
03:46:04:WU01:FS01:FahCore 0x21 started
03:46:05:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
03:46:05:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:11730 run:4 clone:646 gen:10 core:0x21 unit:0x0000000f8ca304e75bcd373c1b684dd8
03:46:05:WU01:FS01:Uploading 5.50KiB to 140.163.4.231
03:46:05:WU01:FS01:Connecting to 140.163.4.231:8080
03:46:05:WU01:FS01:Upload complete
03:46:05:WU01:FS01:Server responded WORK_ACK (400)
03:46:05:WU01:FS01:Cleaning up
03:46:05:WU02:FS01:Connecting to 65.254.110.245:8080
03:46:06:WU02:FS01:Assigned to work server 140.163.4.231
03:46:06:WU02:FS01:Requesting new work unit for slot 01: READY gpu:0:Ellesmere XT [Radeon RX 470/480/570/580] from 140.163.4.231
03:46:06:WU02:FS01:Connecting to 140.163.4.231:8080
03:46:06:WU02:FS01:Downloading 33.32MiB
03:46:12:WU02:FS01:Download 45.21%
03:46:17:WU02:FS01:Download complete
03:46:18:WU02:FS01:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:11730 run:5 clone:95 gen:14 core:0x21 unit:0x000000128ca304e75bcbe54b67c14567
03:46:18:WU02:FS01:Starting
03:46:18:WU02:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/Linux/AMD64/ATI/R600/Core_21.fah/FahCore_21 -dir 02 -suffix 01 -version 705 -lifeline 1520 -checkpoint 15 -gpu-vendor amd -opencl-device 0 -gpu 0
03:46:18:WU02:FS01:Started FahCore on PID 2772
03:46:18:WU02:FS01:Core PID:2776
03:46:18:WU02:FS01:FahCore 0x21 started
03:46:18:WARNING:WU02:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
03:46:18:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:11730 run:5 clone:95 gen:14 core:0x21 unit:0x000000128ca304e75bcbe54b67c14567
03:46:18:WU02:FS01:Uploading 5.50KiB to 140.163.4.231
03:46:18:WU02:FS01:Connecting to 140.163.4.231:8080
03:46:18:WU02:FS01:Upload complete
03:46:18:WU02:FS01:Server responded WORK_ACK (400)
03:46:18:WU02:FS01:Cleaning up
03:46:18:WU01:FS01:Connecting to 65.254.110.245:8080
03:46:19:WU01:FS01:Assigned to work server 140.163.4.231
03:46:19:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:Ellesmere XT [Radeon RX 470/480/570/580] from 140.163.4.231
03:46:19:WU01:FS01:Connecting to 140.163.4.231:8080
03:46:19:WU01:FS01:Downloading 14.57MiB
03:46:24:WU01:FS01:Download complete
03:46:24:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:11728 run:0 clone:1346 gen:217 core:0x21 unit:0x0000010b8ca304e75ba0324a41f2083f
03:46:24:WU01:FS01:Starting
03:46:24:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/Linux/AMD64/ATI/R600/Core_21.fah/FahCore_21 -dir 01 -suffix 01 -version 705 -lifeline 1520 -checkpoint 15 -gpu-vendor amd -opencl-device 0 -gpu 0
03:46:24:WU01:FS01:Started FahCore on PID 2779
03:46:24:WU01:FS01:Core PID:2783
03:46:24:WU01:FS01:FahCore 0x21 started
03:46:25:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
03:46:25:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:11728 run:0 clone:1346 gen:217 core:0x21 unit:0x0000010b8ca304e75ba0324a41f2083f
03:46:25:WU01:FS01:Uploading 5.50KiB to 140.163.4.231
03:46:25:WU01:FS01:Connecting to 140.163.4.231:8080
03:46:25:WU01:FS01:Upload complete
03:46:25:WU01:FS01:Server responded WORK_ACK (400)
03:46:25:WU01:FS01:Cleaning up
03:46:25:WU02:FS01:Connecting to 65.254.110.245:8080
03:46:26:WU02:FS01:Assigned to work server 140.163.4.231
03:46:26:WU02:FS01:Requesting new work unit for slot 01: READY gpu:0:Ellesmere XT [Radeon RX 470/480/570/580] from 140.163.4.231
03:46:26:WU02:FS01:Connecting to 140.163.4.231:8080
03:46:26:WU02:FS01:Downloading 14.57MiB
03:46:32:WU02:FS01:Download 18.02%
03:46:38:WU02:FS01:Download 59.21%
03:46:40:WU02:FS01:Download complete
03:46:40:WU02:FS01:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:11728 run:0 clone:661 gen:189 core:0x21 unit:0x000000e58ca304e75b998078fbe5db47
03:46:40:WU02:FS01:Starting
03:46:40:WU02:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/Linux/AMD64/ATI/R600/Core_21.fah/FahCore_21 -dir 02 -suffix 01 -version 705 -lifeline 1520 -checkpoint 15 -gpu-vendor amd -opencl-device 0 -gpu 0
03:46:40:WU02:FS01:Started FahCore on PID 2787
03:46:40:WU02:FS01:Core PID:2791
03:46:40:WU02:FS01:FahCore 0x21 started
03:46:40:WARNING:WU02:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
03:46:40:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:11728 run:0 clone:661 gen:189 core:0x21 unit:0x000000e58ca304e75b998078fbe5db47
03:46:40:WU02:FS01:Uploading 5.50KiB to 140.163.4.231
03:46:40:WU02:FS01:Connecting to 140.163.4.231:8080
03:46:40:WU02:FS01:Upload complete
03:46:40:WU02:FS01:Server responded WORK_ACK (400)
03:46:40:WU02:FS01:Cleaning up
03:46:40:WU01:FS01:Connecting to 65.254.110.245:8080
03:46:41:WU01:FS01:Assigned to work server 140.163.4.231
03:46:41:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:Ellesmere XT [Radeon RX 470/480/570/580] from 140.163.4.231
03:46:41:WU01:FS01:Connecting to 140.163.4.231:8080
03:46:41:WU01:FS01:Downloading 33.22MiB
03:46:47:WU01:FS01:Download 25.59%
03:46:53:WU01:FS01:Download 69.42%
03:46:57:WU01:FS01:Download complete
03:46:57:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:11730 run:0 clone:63 gen:15 core:0x21 unit:0x000000118ca304e75bcbe4d32247ee3c
03:46:57:WU01:FS01:Starting
03:46:57:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/Linux/AMD64/ATI/R600/Core_21.fah/FahCore_21 -dir 01 -suffix 01 -version 705 -lifeline 1520 -checkpoint 15 -gpu-vendor amd -opencl-device 0 -gpu 0
03:46:57:WU01:FS01:Started FahCore on PID 2794
03:46:57:WU01:FS01:Core PID:2798
03:46:57:WU01:FS01:FahCore 0x21 started
03:46:57:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
03:46:57:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:11730 run:0 clone:63 gen:15 core:0x21 unit:0x000000118ca304e75bcbe4d32247ee3c
03:46:57:WU01:FS01:Uploading 5.50KiB to 140.163.4.231
03:46:57:WU01:FS01:Connecting to 140.163.4.231:8080
03:46:58:WU01:FS01:Upload complete
03:46:58:WU01:FS01:Server responded WORK_ACK (400)
03:46:58:WU01:FS01:Cleaning up
03:46:58:WU02:FS01:Connecting to 65.254.110.245:8080
03:46:58:WU02:FS01:Assigned to work server 140.163.4.231
03:46:58:WU02:FS01:Requesting new work unit for slot 01: READY gpu:0:Ellesmere XT [Radeon RX 470/480/570/580] from 140.163.4.231
03:46:58:WU02:FS01:Connecting to 140.163.4.231:8080
03:46:59:WU02:FS01:Downloading 20.08MiB
03:47:05:WU02:FS01:Download 82.81%
03:47:05:WU02:FS01:Download complete
03:47:05:WU02:FS01:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:11719 run:0 clone:1188 gen:134 core:0x21 unit:0x000000b78ca304e75b8990850522f53e
03:47:05:WU02:FS01:Starting
03:47:05:WU02:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/Linux/AMD64/ATI/R600/Core_21.fah/FahCore_21 -dir 02 -suffix 01 -version 705 -lifeline 1520 -checkpoint 15 -gpu-vendor amd -opencl-device 0 -gpu 0
03:47:05:WU02:FS01:Started FahCore on PID 2801
03:47:05:WU02:FS01:Core PID:2805
03:47:05:WU02:FS01:FahCore 0x21 started
03:47:06:WARNING:WU02:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
03:47:06:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:11719 run:0 clone:1188 gen:134 core:0x21 unit:0x000000b78ca304e75b8990850522f53e
03:47:06:WU02:FS01:Uploading 5.50KiB to 140.163.4.231
03:47:06:WU02:FS01:Connecting to 140.163.4.231:8080
03:47:06:WU01:FS01:Connecting to 65.254.110.245:8080
03:47:06:WU02:FS01:Upload complete
03:47:06:WU02:FS01:Server responded WORK_ACK (400)
03:47:06:WU02:FS01:Cleaning up
03:47:07:WU01:FS01:Assigned to work server 140.163.4.231
03:47:07:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:Ellesmere XT [Radeon RX 470/480/570/580] from 140.163.4.231
03:47:07:WU01:FS01:Connecting to 140.163.4.231:8080
03:47:07:WU01:FS01:Downloading 19.84MiB
03:47:13:WU01:FS01:Download 12.92%
03:47:19:WU01:FS01:Download 71.20%
03:47:20:WU01:FS01:Download complete
03:47:20:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:11726 run:0 clone:60 gen:164 core:0x21 unit:0x000000cb8ca304e75b86cd6a2cfd5391
03:47:20:WU01:FS01:Starting
03:47:20:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/Linux/AMD64/ATI/R600/Core_21.fah/FahCore_21 -dir 01 -suffix 01 -version 705 -lifeline 1520 -checkpoint 15 -gpu-vendor amd -opencl-device 0 -gpu 0
03:47:20:WU01:FS01:Started FahCore on PID 2808
03:47:20:WU01:FS01:Core PID:2812
03:47:20:WU01:FS01:FahCore 0x21 started
03:47:21:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
03:47:21:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:11726 run:0 clone:60 gen:164 core:0x21 unit:0x000000cb8ca304e75b86cd6a2cfd5391
03:47:21:WU01:FS01:Uploading 5.50KiB to 140.163.4.231
03:47:21:WU01:FS01:Connecting to 140.163.4.231:8080
03:47:21:WU01:FS01:Upload complete
03:47:21:WU01:FS01:Server responded WORK_ACK (400)
03:47:21:WU01:FS01:Cleaning up
I have tried a few things suggested from searching around here (reinstalling, sudo apt install ocl-icd-opencl-dev) to no avail. Not sure where to go from here.
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: BAD_WORK_UNIT on fresh Ubuntu 18.04 & RX570

Post by bruce »

Unfortunately, "BAD_WORK_UNIT (114 = 0x72)" doesn't really indicate the cause. The first thing to check is whether the GPU is producing accurate calculations (i.e., that it is not overclocked, overheated, etc.), though there are a lot of other possibilities.

In the case of project:11728 run:0 clone:661 gen:189, after your FAULTY result was uploaded, it was reassigned and successfully completed by someone else, so the WU itself really isn't faulty. The other two have probably been reassigned, but they have not yet been returned.

What does FAHBench tell you?
tchiers
Posts: 23
Joined: Tue Oct 23, 2018 4:23 am

Re: BAD_WORK_UNIT on fresh Ubuntu 18.04 & RX570

Post by tchiers »

Thanks Bruce, I had assumed the problem was on my end and not really a bad WU.

FAHBench says

Code: Select all

FAHBench Simulation
-------------------
Plugin directory: "/home/todd/Downloads/FAHBench-2.3.2-Linux/lib/openmm"
Work unit: dhfr
WU Name: Dihydrofolate reductase
WU Description: A common system for benchmarking molecular dynamics
System XML: /home/todd/Downloads/FAHBench-2.3.2-Linux/share/fahbench/workunits/dhfr/system.xml
Integrator XML: /home/todd/Downloads/FAHBench-2.3.2-Linux/share/fahbench/workunits/dhfr/integrator.xml
State XML: /home/todd/Downloads/FAHBench-2.3.2-Linux/share/fahbench/workunits/dhfr/state.xml
Step chunk: 40
Device ID 0; Platform OpenCL; Platform ID 0
Run length: 60s

Loading plugins from plugin directory
Number of registered plugins: 3
Deserializing input files: system
Deserializing input files: state
Deserializing input files: integrator
Creating context (may take several minutes)
Checking accuracy against reference code
Creating reference context (may take several minutes)
Comparing forces and energy
Starting Benchmark
                                                                                
Benchmarking finished
Final score:   63.0839
Scaled score:  63.0839 (23558 atoms)
tchiers
Posts: 23
Joined: Tue Oct 23, 2018 4:23 am

Re: BAD_WORK_UNIT on fresh Ubuntu 18.04 & RX570

Post by tchiers »

But this got me thinking. What about sudo -u fahclient ./FAHBench-cmd?

Code: Select all

Something went wrong:
Error initializing context: clGetPlatformIDs (-1001)
Now we're getting somewhere...
tchiers
Posts: 23
Joined: Tue Oct 23, 2018 4:23 am

Re: BAD_WORK_UNIT on fresh Ubuntu 18.04 & RX570

Post by tchiers »

User fahclient can't access OpenCL resources, even after trying this trick:
https://wiki.tiker.net/OpenCLHowTo#Runn ... thout_sudo

fahclient can run xclock, so I don't think it's an X permissions issue. I'm hoping someone can point me in the right direction on what to look at next.
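For reference, -1001 is CL_PLATFORM_NOT_FOUND_KHR, meaning the OpenCL ICD loader found no usable platform for that user. One thing that could be checked is whether the user can open the GPU device nodes at all. A minimal sketch, where report_inaccessible is a made-up helper name and the device paths in the example call are guesses that vary by driver and kernel:

```shell
#!/bin/sh
# Print every given path that the current user cannot both read and write.
# Paths that do not exist are reported too.
report_inaccessible() {
    for path in "$@"; do
        if [ ! -r "$path" ] || [ ! -w "$path" ]; then
            echo "no access: $path"
        fi
    done
}

# Example call (paths are assumptions, adjust to what "ls /dev/dri" shows):
#   report_inaccessible /dev/dri/card0 /dev/dri/renderD128 /dev/kfd
```

To test a specific account, the script would need to be run as that user (for example through sudo -u fahclient sh). No output means every listed node was readable and writable.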
tchiers
Posts: 23
Joined: Tue Oct 23, 2018 4:23 am

Re: BAD_WORK_UNIT on fresh Ubuntu 18.04 & RX570

Post by tchiers »

adduser fahclient video

fahclient can now run clinfo and FAHBench... but still can't fold on the GPU.
tchiers
Posts: 23
Joined: Tue Oct 23, 2018 4:23 am

Re: BAD_WORK_UNIT on fresh Ubuntu 18.04 & RX570

Post by tchiers »

And if I kill the FAHClient daemon process and re-start it with

sudo -u fahclient /usr/bin/FAHClient /etc/fahclient/config.xml --run-as fahclient --pid-file=/var/run/fahclient.pid

We are in business! So something about the environment at boot time is not right when the daemon is started and there is no OpenCL love until later.

I guess I'd better table this and get some actual work done today.
tchiers
Posts: 23
Joined: Tue Oct 23, 2018 4:23 am

Re: BAD_WORK_UNIT on fresh Ubuntu 18.04 & RX570

Post by tchiers »

Back to looking at this.

FAHClient run as a daemon can't access OpenCL: the "clGetPlatformIDs returned -1001" error.

But if I stop the daemon and do "sudo -u fahclient FAHClient ..." all works fine.

Any ideas on this?
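One difference that might be worth ruling out (a guess, not something confirmed in this thread) is the supplementary group list of the running daemon. sudo -u picks up the target user's groups from /etc/group, while a process that drops privileges on its own may not. The kernel exposes the list on the Groups: line of /proc/<pid>/status. A rough sketch, where has_gid is a hypothetical helper and the pid file path is the one used earlier in this thread:

```shell
#!/bin/sh
# Read a /proc/<pid>/status dump on stdin and print "yes" if the given
# numeric group id appears on its "Groups:" line, otherwise "no".
has_gid() {
    awk -v gid="$1" '
        /^Groups:/ { for (i = 2; i <= NF; i++) if ($i == gid) found = 1 }
        END { print (found ? "yes" : "no") }
    '
}

# Example (assumes the daemon wrote its pid to /var/run/fahclient.pid):
#   gid=$(getent group video | cut -d: -f3)
#   has_gid "$gid" < "/proc/$(cat /var/run/fahclient.pid)/status"
```

If this prints "no" for the daemon but "yes" for a client started with sudo -u fahclient, the video group is not reaching the daemon process.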
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: BAD_WORK_UNIT on fresh Ubuntu 18.04 & RX570

Post by bruce »

I vaguely remember that there's an environment variable that has to be set so OpenCL can be found which apparently isn't being set in some cases. Maybe there's a way to have that variable initialized earlier -- before the daemon starts.
tchiers
Posts: 23
Joined: Tue Oct 23, 2018 4:23 am

Re: BAD_WORK_UNIT on fresh Ubuntu 18.04 & RX570

Post by tchiers »

Any idea what the environment variable might be?

sudo -u fahclient env -i FAHClient ... works just fine
tchiers
Posts: 23
Joined: Tue Oct 23, 2018 4:23 am

Re: BAD_WORK_UNIT on fresh Ubuntu 18.04 & RX570

Post by tchiers »

OK, I think I've finally figured it out. For whatever reason, systemd (or at least the version on Ubuntu 18.04) does not get along with FAHClient's --run-as option.

Instead of the auto-imported FAHClient.service, I wrote a fah.service

Code: Select all

[Unit]
Description=Folding@Home for Systemd
After=remote-fs.target
After=network-online.target
After=graphical.target
Wants=network-online.target graphical.target


[Service]
Type=simple
WorkingDirectory=/var/lib/fahclient
User=fahclient
ExecStart=/usr/bin/FAHClient /etc/fahclient/config.xml --pid-file=/var/run/fahclient.pid

[Install]
WantedBy=graphical.target
Then disabled the former and enabled the latter.

Now FAHClient starts automatically at boot with access to OpenCL for GPU folding. I hope this is useful for anyone else who runs into this problem.
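For anyone following along, the disable/enable step can be sketched roughly like this. It assumes the unit above was saved as /etc/systemd/system/fah.service (the location is an assumption) and that the auto-imported unit is named FAHClient.service as mentioned above. The run wrapper and the DRY_RUN variable are only there so the commands can be previewed first; they are not part of systemd.

```shell
#!/bin/sh
# Echo the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Swap the auto-imported unit for the hand-written one.
switch_units() {
    run systemctl disable FAHClient.service   # stop starting the old unit at boot
    run systemctl daemon-reload               # make systemd pick up fah.service
    run systemctl enable fah.service          # start the new unit at boot
}

# Preview:          DRY_RUN=1; switch_units
# Apply (as root):  switch_units
```

Enabling only affects the next boot, so the old daemon keeps running until it is stopped or the machine is restarted.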
Last edited by tchiers on Wed Jan 23, 2019 2:23 pm, edited 1 time in total.
JimF
Posts: 652
Joined: Thu Jan 21, 2010 2:03 pm

Re: BAD_WORK_UNIT on fresh Ubuntu 18.04 & RX570

Post by JimF »

Very interesting. I currently have an RX 570 on order, though not necessarily for Folding.
What sort of PPD do you get on Ubuntu?
tchiers
Posts: 23
Joined: Tue Oct 23, 2018 4:23 am

Re: BAD_WORK_UNIT on fresh Ubuntu 18.04 & RX570

Post by tchiers »

~350K PPD for just the GPU.
JimF
Posts: 652
Joined: Thu Jan 21, 2010 2:03 pm

Re: BAD_WORK_UNIT on fresh Ubuntu 18.04 & RX570

Post by JimF »

Thanks, that is nice. I will be using mine mainly on Win7, but could put it on Ubuntu if the need arises. It gives me an option I didn't know I had.
csallen1204
Posts: 1
Joined: Mon Nov 26, 2018 12:06 am

Re: BAD_WORK_UNIT on fresh Ubuntu 18.04 & RX570

Post by csallen1204 »

I had this issue on both 18.04 and 16.04. Ultimately I fixed it by modifying the /etc/init.d/FAHClient script, changing the user from fahclient to root. The Linux amdgpu documentation says that any OpenCL user needs to be added to the video group in /etc/group to be granted access, but even after doing that I still got errors saying the client couldn't get access to the OpenCL devices.

Has anyone seen this issue before and been able to tweak the fahclient user's privileges to fix it without resorting to running the client as root?