troubles folding with 5700xt on archlinux

It seems that a lot of GPU problems revolve around specific versions of drivers. Though AMD has their own support structure, you can often learn from information reported by others who fold.

Moderators: Site Moderators, FAHC Science Team

bboutkov
Posts: 2
Joined: Sat Jul 25, 2020 2:34 am

troubles folding with 5700xt on archlinux

Post by bboutkov »

Hello folders,

As the topic says, I'm having trouble getting set up with the new AMD GPU; the CPU works fine. Searching the forums leads me to believe that this card is supported at this time, but it's not quite clear to me whether it's in beta or fully supported now. I have tried setting the gpu and opencl index to 0 instead of the default -1, and have also been experimenting with some of the client-type settings I have spotted in various forum posts - namely, I tried both "beta" and "advanced" and neither was successful, but I'm not sure if that setting is still needed. Currently, at index 0/0 with client-type beta, I get a lot of bad work unit errors on the GPU, and also: 02:45:13:WU01:FS01:0x22:ERROR:126: Bad platformId size, which I suspect is the main problem.

Additional info: default mesa drivers and opencl-amd (please let me know if you need to know anything else):


02:45:12:WU01:FS01:FahCore 0x22 started
02:45:12:WU01:FS01:0x22:*********************** Log Started 2020-07-25T02:45:12Z ***********************
02:45:12:WU01:FS01:0x22:*************************** Core22 Folding@home Core ***************************
02:45:12:WU01:FS01:0x22:       Core: Core22
02:45:12:WU01:FS01:0x22:       Type: 0x22
02:45:12:WU01:FS01:0x22:    Version: 0.0.11
02:45:12:WU01:FS01:0x22:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
02:45:12:WU01:FS01:0x22:  Copyright: 2020 foldingathome.org
02:45:12:WU01:FS01:0x22:   Homepage: https://foldingathome.org/
02:45:12:WU01:FS01:0x22:       Date: Jun 27 2020
02:45:12:WU01:FS01:0x22:       Time: 22:50:00
02:45:12:WU01:FS01:0x22:   Revision: cfc2940c5dd1aa80f60daa6e28d4a2a417f74edb
02:45:12:WU01:FS01:0x22:     Branch: core22-0.0.11
02:45:12:WU01:FS01:0x22:   Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
02:45:12:WU01:FS01:0x22:    Options: -std=c++11 -fsigned-char -ffunction-sections -fdata-sections -O3
02:45:12:WU01:FS01:0x22:             -funroll-loops
02:45:12:WU01:FS01:0x22:   Platform: linux2 4.19.76-linuxkit
02:45:12:WU01:FS01:0x22:       Bits: 64
02:45:12:WU01:FS01:0x22:       Mode: Release
02:45:12:WU01:FS01:0x22:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
02:45:12:WU01:FS01:0x22:             <peastman@stanford.edu>
02:45:12:WU01:FS01:0x22:       Args: -dir 01 -suffix 01 -version 706 -lifeline 44495 -checkpoint 30
02:45:12:WU01:FS01:0x22:             -gpu-vendor amd -opencl-device 0 -gpu 0
02:45:12:WU01:FS01:0x22:************************************ libFAH ************************************
02:45:12:WU01:FS01:0x22:       Date: Jun 27 2020
02:45:12:WU01:FS01:0x22:       Time: 22:11:04
02:45:12:WU01:FS01:0x22:   Revision: 2b383f4f04f38511dff592885d7c0400e72bdf43
02:45:12:WU01:FS01:0x22:     Branch: HEAD
02:45:12:WU01:FS01:0x22:   Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
02:45:12:WU01:FS01:0x22:    Options: -std=c++11 -fsigned-char -ffunction-sections -fdata-sections -O3
02:45:12:WU01:FS01:0x22:             -funroll-loops
02:45:12:WU01:FS01:0x22:   Platform: linux2 4.19.76-linuxkit
02:45:12:WU01:FS01:0x22:       Bits: 64
02:45:12:WU01:FS01:0x22:       Mode: Release
02:45:12:WU01:FS01:0x22:************************************ CBang *************************************
02:45:12:WU01:FS01:0x22:       Date: Jun 27 2020
02:45:12:WU01:FS01:0x22:       Time: 22:10:11
02:45:12:WU01:FS01:0x22:   Revision: f8529962055b0e7bde23e429f5072ff758089dee
02:45:12:WU01:FS01:0x22:     Branch: HEAD
02:45:12:WU01:FS01:0x22:   Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
02:45:12:WU01:FS01:0x22:    Options: -std=c++11 -fsigned-char -ffunction-sections -fdata-sections -O3
02:45:12:WU01:FS01:0x22:             -funroll-loops -fPIC
02:45:12:WU01:FS01:0x22:   Platform: linux2 4.19.76-linuxkit
02:45:12:WU01:FS01:0x22:       Bits: 64
02:45:12:WU01:FS01:0x22:       Mode: Release
02:45:12:WU01:FS01:0x22:************************************ System ************************************
02:45:12:WU01:FS01:0x22:        CPU: AMD Ryzen 7 1800X Eight-Core Processor
02:45:12:WU01:FS01:0x22:     CPU ID: AuthenticAMD Family 23 Model 1 Stepping 1
02:45:12:WU01:FS01:0x22:       CPUs: 16
02:45:12:WU01:FS01:0x22:     Memory: 62.81GiB
02:45:12:WU01:FS01:0x22:Free Memory: 53.01GiB
02:45:12:WU01:FS01:0x22:    Threads: POSIX_THREADS
02:45:12:WU01:FS01:0x22: OS Version: 5.7
02:45:12:WU01:FS01:0x22:Has Battery: false
02:45:12:WU01:FS01:0x22: On Battery: false
02:45:12:WU01:FS01:0x22: UTC Offset: -4
02:45:12:WU01:FS01:0x22:        PID: 44499
02:45:12:WU01:FS01:0x22:        CWD: /var/lib/private/fah/work
02:45:12:WU01:FS01:0x22:********************************************************************************
02:45:12:WU01:FS01:0x22:Project: 16600 (Run 0, Clone 692, Gen 128)
02:45:12:WU01:FS01:0x22:Unit: 0x0000008d8f59f36f5ec36911ee8b859f
02:45:12:WU01:FS01:0x22:Reading tar file core.xml
02:45:12:WU01:FS01:0x22:Reading tar file integrator.xml
02:45:12:WU01:FS01:0x22:Reading tar file state.xml
02:45:13:WU01:FS01:0x22:Reading tar file system.xml
02:45:13:WU01:FS01:0x22:Digital signatures verified
02:45:13:WU01:FS01:0x22:Folding@home GPU Core22 Folding@home Core
02:45:13:WU01:FS01:0x22:Version 0.0.11
02:45:13:WU01:FS01:0x22:  Checkpoint write interval: 25000 steps (5%) [20 total]
02:45:13:WU01:FS01:0x22:  JSON viewer frame write interval: 5000 steps (1%) [100 total]
02:45:13:WU01:FS01:0x22:  XTC frame write interval: 20000 steps (4%) [25 total]
02:45:13:WU01:FS01:0x22:  Global context and integrator variables write interval: disabled
02:45:13:WU01:FS01:0x22:ERROR:126: Bad platformId size.
02:45:13:WU01:FS01:0x22:Saving result file ../logfile_01.txt
02:45:13:WU01:FS01:0x22:Saving result file science.log
02:45:13:WU01:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
Thanks for any help you can provide!
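
For a long log like the one above, it can help to filter out everything except the failure lines. The following is a minimal sketch in Python, assuming the line layout seen in this log (time, work unit, slot, core, then the message, separated by colons); the function name find_problems and the embedded sample are illustrative and not part of any Folding@home tool.

```python
import re

# Layout assumed from the log above: HH:MM:SS:WUnn:FSnn:0xNN:message
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):"
    r"(?P<wu>WU\d+):(?P<slot>FS\d+):(?P<core>0x[0-9a-fA-F]+):"
    r"(?P<msg>.*)$"
)


def find_problems(log_text):
    """Return (time, slot, message) tuples for error and core shutdown lines."""
    problems = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not follow the assumed layout
        msg = match.group("msg")
        if msg.startswith("ERROR:") or "Core Shutdown:" in msg:
            problems.append((match.group("time"), match.group("slot"), msg))
    return problems


# A few lines copied from the log above, used as sample input.
sample = """\
02:45:13:WU01:FS01:0x22:Digital signatures verified
02:45:13:WU01:FS01:0x22:ERROR:126: Bad platformId size.
02:45:13:WU01:FS01:0x22:Saving result file science.log
02:45:13:WU01:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
"""

if __name__ == "__main__":
    for time, slot, msg in find_problems(sample):
        print(time, slot, msg)
```

Run against the sample, this keeps only the ERROR:126 line and the BAD_WORK_UNIT shutdown line, which are the two lines worth quoting when asking for help.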
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: troubles folding with 5700xt on archlinux

Post by bruce »

GPUs are not placed in "beta"; Navi GPUs are fully supported.

When a new project is developed, it is expected to spend some time in "beta" until it has a history of stability. Then it is moved to "Advanced", which is essentially a release candidate status, and then to Full FAH (released to everybody).

The "Bad platformId size" error suggests that maybe you do not have the AMD drivers or the OpenCL support properly installed. I'm not a Linux expert, but somebody will come by soon and help you.
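
One quick way to check the OpenCL side on Linux is to look at which vendor runtimes are registered. The OpenCL ICD loader conventionally discovers them through small .icd files in /etc/OpenCL/vendors, each containing the name or path of a vendor library. The Python sketch below simply lists those files; the function name list_opencl_icds is illustrative, and whether a missing or wrong entry is what produces "Bad platformId size" in this case is an assumption, not something the log confirms.

```python
import glob
import os


def list_opencl_icds(vendor_dir="/etc/OpenCL/vendors"):
    """Map each *.icd file name in vendor_dir to the library it points to."""
    icds = {}
    for path in sorted(glob.glob(os.path.join(vendor_dir, "*.icd"))):
        with open(path) as handle:
            # Each .icd file holds one line: the vendor library name or path.
            icds[os.path.basename(path)] = handle.read().strip()
    return icds


if __name__ == "__main__":
    found = list_opencl_icds()
    if not found:
        print("No OpenCL ICD files found; no vendor runtime is registered.")
    for name, library in found.items():
        print(f"{name}: {library}")
```

If the listing is empty, or shows only a Mesa entry, the folding core probably has no AMD OpenCL platform to attach to. This only checks registration, not whether the library loads or the GPU is usable.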
Joe_H
Site Admin
Posts: 7856
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: troubles folding with 5700xt on archlinux

Post by Joe_H »

You mentioned the Mesa drivers; they do not work for folding. You will need to use either the AMD proprietary drivers or the ROCm drivers. From what some have posted, the ROCm drivers take a bit of work to get installed and configured properly.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
bboutkov
Posts: 2
Joined: Sat Jul 25, 2020 2:34 am

Re: troubles folding with 5700xt on archlinux

Post by bboutkov »

Thanks for the initial info, bruce and Joe_H.

I guess the overall situation is still poor. I tried using the opencl-amd proprietary drivers and immediately ran into errors such as "Forces are blowing up" and ":ERROR:Force RMSE error". After that I spent a few hours compiling the rocm-opencl-runtime drivers. This seemed promising, as my first 2 work units seemed to complete - but then I started getting lots of the forces blowing up / RMSE errors again, and I haven't been able to get another successful WU since the errors started, even after restarting the client or re-adding the GPU slot config.

I wish I could find the logs, but /var/lib/fahclient, as written on your website, is empty. I guess this may have been my fault, as the successful ones were cleared via FAHControl while I was trying to diagnose the second round of errors. You mention that the mesa drivers don't work, but they were installed for those first two successful work units, and I guess I don't understand how the OpenGL drivers would affect the OpenCL drivers. They seem to be different things to me, and AFAICT the OpenGL mesa drivers wouldn't be used for compute, only for display purposes - but please tell me if I'm being slow or misunderstanding something here.

If you can point me to what someone did to get this card running with "a bit of work", I'd love to try further, but as it stands there seems to be something fundamentally amiss with the support of this card still.

Thanks again in advance for any further info you can provide.