FahCore returned: FAILED_3


frankkkkk
Posts: 3
Joined: Sun Mar 05, 2023 7:16 pm

FahCore returned: FAILED_3

Post by frankkkkk »

Hello everyone,
I'm finally setting up Folding@home (FAH) on my Kubernetes cluster. Everything works reasonably well with the `linuxserver/foldingathome` image, but since some of my servers have GPUs, I have to use the `foldingathome/fah-gpu` image, and I can't get that one to run.

Here are the full logs:


23:21:40:Read GPUs.txt
23:21:40:******************************* libFAH ********************************
23:21:40:           Date: Oct 20 2020
23:21:40:           Time: 20:36:39
23:21:40:       Revision: 5ca109d295a6245e2a2f590b3d0085ad5e567aeb
23:21:40:         Branch: master
23:21:40:       Compiler: GNU 8.3.0
23:21:40:        Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
23:21:40:                 -fdata-sections -O3 -funroll-loops -fno-pie
23:21:40:       Platform: linux2 5.8.0-1-amd64
23:21:40:           Bits: 64
23:21:40:           Mode: Release
23:21:40:****************************** FAHClient ******************************
23:21:40:        Version: 7.6.21
23:21:40:         Author: Joseph Coffland <joseph@cauldrondevelopment.com>
23:21:40:      Copyright: 2020 foldingathome.org
23:21:40:       Homepage: https://foldingathome.org/
23:21:40:           Date: Oct 20 2020
23:21:40:           Time: 20:39:00
23:21:40:       Revision: 6efbf0e138e22d3963e6a291f78dcb9c6422a278
23:21:40:         Branch: master
23:21:40:       Compiler: GNU 8.3.0
23:21:40:        Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
23:21:40:                 -fdata-sections -O3 -funroll-loops -fno-pie
23:21:40:       Platform: linux2 5.8.0-1-amd64
23:21:40:           Bits: 64
23:21:40:           Mode: Release
23:21:40:           Args: --chdir /fah
23:21:40:         Config: /fah/config.xml
23:21:40:******************************** CBang ********************************
23:21:40:           Date: Oct 20 2020
23:21:40:           Time: 18:37:59
23:21:40:       Revision: 7e4ce85225d7eaeb775e87c31740181ca603de60
23:21:40:         Branch: master
23:21:40:       Compiler: GNU 8.3.0
23:21:40:        Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
23:21:40:                 -fdata-sections -O3 -funroll-loops -fno-pie -fPIC
23:21:40:       Platform: linux2 5.8.0-1-amd64
23:21:40:           Bits: 64
23:21:40:           Mode: Release
23:21:40:******************************* System ********************************
23:21:40:            CPU: Intel(R) Xeon(R) CPU E31265L @ 2.40GHz
23:21:40:         CPU ID: GenuineIntel Family 6 Model 42 Stepping 7
23:21:40:           CPUs: 8
23:21:40:         Memory: 23.45GiB
23:21:40:    Free Memory: 22.45GiB
23:21:40:        Threads: POSIX_THREADS
23:21:40:     OS Version: 5.15
23:21:40:    Has Battery: false
23:21:40:     On Battery: false
23:21:40:     UTC Offset: 0
23:21:40:            PID: 23
23:21:40:            CWD: /fah
23:21:40:             OS: Linux 5.15.0-67-generic x86_64
23:21:40:        OS Arch: AMD64
23:21:40:           GPUs: 1
23:21:40:          GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:7 GP104 [GeForce GTX 1070] 6463
23:21:40:  CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:6.1 Driver:12.0
23:21:40:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:3.0 Driver:525.85
23:21:40:***********************************************************************
23:21:40:<config>
23:21:40:  <!-- Client Control -->
23:21:40:  <exit-when-done v='true'/>
23:21:40:
23:21:40:  <!-- GUI -->
23:21:40:  <gui-enabled v='false'/>
23:21:40:
23:21:40:  <!-- Slot Control -->
23:21:40:  <power v='full'/>
23:21:40:
23:21:40:  <!-- User Information -->
23:21:40:  <passkey v='*****'/>
23:21:40:  <user v='frankkkkk'/>
23:21:40:
23:21:40:  <!-- Folding Slots -->
23:21:40:  <slot id='0' type='GPU'/>
23:21:40:  <slot id='1' type='CPU'/>
23:21:40:</config>
23:21:40:Trying to access database...
23:21:40:Successfully acquired database lock
23:21:40:FS00:Initialized folding slot 00: gpu:1:0 GP104 [GeForce GTX 1070] 6463
23:21:40:FS01:Initialized folding slot 01: cpu:7
23:21:40:WU00:FS00:Downloading core from http://cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah
23:21:40:WU00:FS00:Connecting to cores.foldingathome.org:80
23:21:40:WU02:FS01:Starting
23:21:40:WU02:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /fah/cores/cores.foldingathome.org/lin/64bit-avx-256/a8-0.0.12/Core_a8.fah/FahCore_a8 -dir 02 -suffix 01 -version 706 -lifeline 23 -checkpoint 15 -np 7
23:21:40:WU02:FS01:Started FahCore on PID 33
23:21:40:WU02:FS01:Core PID:37
23:21:40:WU02:FS01:FahCore 0xa8 started
23:21:41:WARNING:WU02:FS01:FahCore returned: FAILED_3 (255 = 0xff)
23:21:41:WARNING:WU02:FS01:Too many errors, failing
23:21:41:WU02:FS01:Sending unit results: id:02 state:SEND error:FAILED project:19003 run:23 clone:11 gen:2 core:0xa8 unit:0x0000000b0000000200004a3b00000017
23:21:41:WU02:FS01:Connecting to 128.174.73.74:8080
23:21:41:WU01:FS01:Connecting to assign1.foldingathome.org:80
23:21:41:WU00:FS00:FahCore 22: Downloading 166.22MiB
23:21:42:WU02:FS01:Server responded WORK_ACK (400)
23:21:42:WU02:FS01:Cleaning up
23:21:42:WU01:FS01:Assigned to work server 129.32.209.205
23:21:42:WU01:FS01:Requesting new work unit for slot 01: cpu:7 from 129.32.209.205
23:21:42:WU01:FS01:Connecting to 129.32.209.205:8080
23:21:43:WU01:FS01:Downloading 1.47MiB
23:21:45:WU01:FS01:Download complete
23:21:45:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:16996 run:13 clone:154 gen:191 core:0xa8 unit:0x0000009a000000bf000042640000000d
23:21:45:WU01:FS01:Starting
23:21:45:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /fah/cores/cores.foldingathome.org/lin/64bit-avx-256/a8-0.0.12/Core_a8.fah/FahCore_a8 -dir 01 -suffix 01 -version 706 -lifeline 23 -checkpoint 15 -np 7
23:21:45:WU01:FS01:Started FahCore on PID 38
23:21:45:WU01:FS01:Core PID:42
23:21:45:WU01:FS01:FahCore 0xa8 started
23:21:45:WARNING:WU01:FS01:FahCore returned: FAILED_3 (255 = 0xff)
23:21:45:WU01:FS01:Starting
23:21:45:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /fah/cores/cores.foldingathome.org/lin/64bit-avx-256/a8-0.0.12/Core_a8.fah/FahCore_a8 -dir 01 -suffix 01 -version 706 -lifeline 23 -checkpoint 15 -np 7
23:21:45:WU01:FS01:Started FahCore on PID 43
23:21:45:WU01:FS01:Core PID:47
23:21:45:WU01:FS01:FahCore 0xa8 started
23:21:46:WARNING:WU01:FS01:FahCore returned: FAILED_3 (255 = 0xff)
23:21:47:WU00:FS00:FahCore 22: 6.24%
23:21:53:WU00:FS00:FahCore 22: 13.80%
23:21:59:WU00:FS00:FahCore 22: 26.32%
23:22:05:WU00:FS00:FahCore 22: 36.10%
23:22:11:WU00:FS00:FahCore 22: 43.84%
23:22:17:WU00:FS00:FahCore 22: 50.72%
23:22:23:WU00:FS00:FahCore 22: 56.74%
23:22:29:WU00:FS00:FahCore 22: 61.59%
23:22:35:WU00:FS00:FahCore 22: 66.14%
23:22:41:WU00:FS00:FahCore 22: 70.16%
23:22:41:Saving configuration to config.xml
23:22:41:<config>
23:22:41:  <!-- Client Control -->
23:22:41:  <exit-when-done v='true'/>
23:22:41:
23:22:41:  <!-- GUI -->
23:22:41:  <gui-enabled v='false'/>
23:22:41:
23:22:41:  <!-- Slot Control -->
23:22:41:  <power v='full'/>
23:22:41:
23:22:41:  <!-- User Information -->
23:22:41:  <passkey v='*****'/>
23:22:41:  <user v='frankkkkk'/>
23:22:41:
23:22:41:  <!-- Folding Slots -->
23:22:41:  <slot id='0' type='GPU'>
23:22:41:    <pci-bus v='1'/>
23:22:41:    <pci-slot v='0'/>
23:22:41:  </slot>
23:22:41:  <slot id='1' type='CPU'/>
23:22:41:</config>
23:22:45:WU01:FS01:Starting
23:22:45:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /fah/cores/cores.foldingathome.org/lin/64bit-avx-256/a8-0.0.12/Core_a8.fah/FahCore_a8 -dir 01 -suffix 01 -version 706 -lifeline 23 -checkpoint 15 -np 7
23:22:45:WU01:FS01:Started FahCore on PID 48
23:22:45:WU01:FS01:Core PID:52
23:22:45:WU01:FS01:FahCore 0xa8 started
23:22:46:WARNING:WU01:FS01:FahCore returned: FAILED_3 (255 = 0xff)
23:22:47:WU00:FS00:FahCore 22: 74.75%
23:22:53:WU00:FS00:FahCore 22: 79.98%
23:22:59:WU00:FS00:FahCore 22: 85.24%
23:23:05:WU00:FS00:FahCore 22: 90.96%
23:23:11:WU00:FS00:FahCore 22: 99.61%
23:23:11:WU00:FS00:FahCore 22: Download complete
23:23:11:WU00:FS00:Valid core signature
23:23:11:WU00:FS00:Unpacked 4.84MiB to cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/FahCore_22
23:23:11:WU00:FS00:Unpacked 181.31MiB to cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/libcufft.so.10
23:23:11:WU00:FS00:Unpacked 2.44MiB to cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/libfftw3f.so.3
23:23:11:WU00:FS00:Unpacked 40.58KiB to cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/libfftw3f_threads.so.3
23:23:11:WU00:FS00:Unpacked 5.84MiB to cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/libnvrtc-builtins.so.11.2
23:23:11:WU00:FS00:Unpacked 41.93MiB to cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/libnvrtc.so.11.2
23:23:11:WU00:FS00:Unpacked 42.06KiB to cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/libOpenCL.so.1
23:23:11:WU00:FS00:Unpacked 1.01MiB to cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/libOpenMMAmoebaCUDA.so
23:23:11:WU00:FS00:Unpacked 928.48KiB to cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/libOpenMMAmoebaOpenCL.so
23:23:11:WU00:FS00:Unpacked 604.82KiB to cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/libOpenMMAmoebaReference.so
23:23:11:WU00:FS00:Unpacked 448.30KiB to cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/libOpenMMAmoeba.so
23:23:11:WU00:FS00:Unpacked 642.85KiB to cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/libOpenMMCPU.so
23:23:11:WU00:FS00:Unpacked 42.98KiB to cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/libOpenMMCudaCompiler.so
23:23:11:WU00:FS00:Unpacked 2.42MiB to cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/libOpenMMCUDA.so
23:23:11:WU00:FS00:Unpacked 105.38KiB to cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/libOpenMMDrudeCUDA.so
23:23:11:WU00:FS00:Unpacked 110.77KiB to cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/libOpenMMDrudeOpenCL.so
23:23:11:WU00:FS00:Unpacked 65.15KiB to cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/libOpenMMDrudeReference.so
23:23:11:WU00:FS00:Unpacked 120.21KiB to cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/libOpenMMDrude.so
23:23:11:WU00:FS00:Unpacked 2.46MiB to cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/libOpenMMOpenCL.so
23:23:11:WU00:FS00:Unpacked 74.52KiB to cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/libOpenMMPME.so
23:23:11:WU00:FS00:Unpacked 119.95KiB to cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/libOpenMMRPMDCUDA.so
23:23:11:WU00:FS00:Unpacked 121.34KiB to cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/libOpenMMRPMDOpenCL.so
23:23:11:WU00:FS00:Unpacked 50.95KiB to cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/libOpenMMRPMDReference.so
23:23:11:WU00:FS00:Unpacked 90.55KiB to cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/libOpenMMRPMD.so
23:23:11:WU00:FS00:Unpacked 4.29MiB to cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/libOpenMM.so.7.7
23:23:11:WU00:FS00:Unpacked 11.49MiB to cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/libstdc++.so.6
23:23:11:WU00:FS00:Starting
23:23:11:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /fah/cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/FahCore_22 -dir 00 -suffix 01 -version 706 -lifeline 23 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
23:23:11:WU00:FS00:Started FahCore on PID 53
23:23:11:WU00:FS00:Core PID:57
23:23:11:WU00:FS00:FahCore 0x22 started
23:23:12:WARNING:WU00:FS00:FahCore returned: FAILED_3 (255 = 0xff)
23:23:12:WU00:FS00:Starting
23:23:12:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /fah/cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/FahCore_22 -dir 00 -suffix 01 -version 706 -lifeline 23 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
23:23:12:WU00:FS00:Started FahCore on PID 58
23:23:12:WU00:FS00:Core PID:62
23:23:12:WU00:FS00:FahCore 0x22 started
23:23:12:WARNING:WU00:FS00:FahCore returned: FAILED_3 (255 = 0xff)
23:23:45:WU01:FS01:Starting
23:23:45:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /fah/cores/cores.foldingathome.org/lin/64bit-avx-256/a8-0.0.12/Core_a8.fah/FahCore_a8 -dir 01 -suffix 01 -version 706 -lifeline 23 -checkpoint 15 -np 7
23:23:45:WU01:FS01:Started FahCore on PID 63
23:23:45:WU01:FS01:Core PID:67
23:23:45:WU01:FS01:FahCore 0xa8 started
23:23:46:WARNING:WU01:FS01:FahCore returned: FAILED_3 (255 = 0xff)
23:24:12:WU00:FS00:Starting
23:24:12:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /fah/cores/cores.foldingathome.org/lin/64bit/22-0.0.20/Core_22.fah/FahCore_22 -dir 00 -suffix 01 -version 706 -lifeline 23 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
23:24:12:WU00:FS00:Started FahCore on PID 68
23:24:12:WU00:FS00:Core PID:72
23:24:12:WU00:FS00:FahCore 0x22 started
23:24:12:WARNING:WU00:FS00:FahCore returned: FAILED_3 (255 = 0xff)
23:24:45:WU01:FS01:Starting
23:24:45:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /fah/cores/cores.foldingathome.org/lin/64bit-avx-256/a8-0.0.12/Core_a8.fah/FahCore_a8 -dir 01 -suffix 01 -version 706 -lifeline 23 -checkpoint 15 -np 7
23:24:45:WU01:FS01:Started FahCore on PID 73
23:24:45:WU01:FS01:Core PID:77
23:24:45:WU01:FS01:FahCore 0xa8 started
23:24:46:WARNING:WU01:FS01:FahCore returned: FAILED_3 (255 = 0xff)
23:24:46:WARNING:WU01:FS01:Too many errors, failing
23:24:46:WU01:FS01:Sending unit results: id:01 state:SEND error:FAILED project:16996 run:13 clone:154 gen:191 core:0xa8 unit:0x0000009a000000bf000042640000000d
23:24:46:WU01:FS01:Connecting to 129.32.209.205:8080
23:24:46:WU01:FS01:Server responded WORK_ACK (400)
23:24:46:WU01:FS01:Cleaning up
23:24:46:WU02:FS01:Connecting to assign1.foldingathome.org:80
23:24:47:WU02:FS01:Assigned to work server 129.32.209.205
23:24:47:WU02:FS01:Requesting new work unit for slot 01: cpu:7 from 129.32.209.205
23:24:47:WU02:FS01:Connecting to 129.32.209.205:8080
23:24:47:WU02:FS01:Downloading 1.47MiB
23:24:49:WU02:FS01:Download complete
23:24:49:WU02:FS01:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:16959 run:19 clone:420 gen:349 core:0xa8 unit:0x000001a40000015d0000423f00000013
23:24:49:WU02:FS01:Starting
23:24:49:WU02:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /fah/cores/cores.foldingathome.org/lin/64bit-avx-256/a8-0.0.12/Core_a8.fah/FahCore_a8 -dir 02 -suffix 01 -version 706 -lifeline 23 -checkpoint 15 -np 7
23:24:49:WU02:FS01:Started FahCore on PID 78
23:24:49:WU02:FS01:Core PID:82
23:24:49:WU02:FS01:FahCore 0xa8 started
23:24:49:WARNING:WU02:FS01:FahCore returned: FAILED_3 (255 = 0xff)
23:24:50:WU02:FS01:Starting
23:24:50:WU02:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /fah/cores/cores.foldingathome.org/lin/64bit-avx-256/a8-0.0.12/Core_a8.fah/FahCore_a8 -dir 02 -suffix 01 -version 706 -lifeline 23 -checkpoint 15 -np 7
23:24:50:WU02:FS01:Started FahCore on PID 83
23:24:50:WU02:FS01:Core PID:87
23:24:50:WU02:FS01:FahCore 0xa8 started
23:24:50:WARNING:WU02:FS01:FahCore returned: FAILED_3 (255 = 0xff)
-- etc etc --
Do you have any idea what it could be? Sadly, the `--verbosity` flag doesn't really change anything.

Many thanks!
Frank
frankkkkk
Posts: 3
Joined: Sun Mar 05, 2023 7:16 pm

Re: FahCore returned: FAILED_3

Post by frankkkkk »

Okay, so I think this is because the mounted `/fah` directory has the `noexec` flag, while the cores downloaded into it need to be executed.
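
One way to check this hypothesis from inside the container is to look up the mount that holds `/fah` in `/proc/mounts` and see whether its options include `noexec`. The following is only a minimal sketch: the helper names `mount_options_for` and `is_noexec` are made up for illustration, and it assumes a Linux system with a readable `/proc/mounts` and mount points without escaped characters (spaces appear there as `\040`).

```python
import os


def mount_options_for(path, mounts_text):
    """Return (mount_point, options) for the deepest mount containing path.

    mounts_text is the content of /proc/mounts. Hypothetical helper,
    not part of FAHClient.
    """
    path = os.path.abspath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip malformed lines
        mount_point, options = fields[1], fields[3].split(",")
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the longest mount point; on ties, later entries win
            # because later mounts shadow earlier ones.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, options)
    return best


def is_noexec(path, mounts_text):
    """True if the mount that contains path carries the noexec option."""
    found = mount_options_for(path, mounts_text)
    return found is not None and "noexec" in found[1]


if __name__ == "__main__" and os.path.exists("/proc/mounts"):
    with open("/proc/mounts") as handle:
        mounts = handle.read()
    # /fah is the working directory shown in the log above.
    print("/fah mounted noexec:", is_noexec("/fah", mounts))
```

If this reports `noexec` for `/fah`, the kernel will refuse to execute the `FahCore_a8` and `FahCore_22` binaries that the client unpacks under `/fah/cores`, which would be consistent with every core exiting within a second.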
frankkkkk
Posts: 3
Joined: Sun Mar 05, 2023 7:16 pm

Re: FahCore returned: FAILED_3

Post by frankkkkk »

Yes, that was the problem: the mount point had the `noexec` flag, so the cores could not start. The error message is not very helpful for this case. I'll try to make a PR someday :-)
Thanks!
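
For anyone hitting the same `FAILED_3` loop, the diagnosis can also be confirmed empirically by trying to run a tiny script from the data directory. This is a rough sketch under stated assumptions: `can_execute_in` is a hypothetical helper, `/bin/sh` is assumed to exist in the container, and the directory is assumed to be writable. On a `noexec` mount the kernel rejects the `execve` call with `EACCES`, which Python raises as `PermissionError`.

```python
import os
import stat
import subprocess
import tempfile


def can_execute_in(directory):
    """Return True if a freshly written script can be executed from directory.

    Hypothetical helper for diagnosing noexec mounts; it creates a small
    temporary shell script and always removes it afterwards.
    """
    fd, script = tempfile.mkstemp(dir=directory, suffix=".sh")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write("#!/bin/sh\nexit 0\n")
        os.chmod(script, stat.S_IRWXU)  # rwx for the owner only
        try:
            return subprocess.run([script]).returncode == 0
        except PermissionError:
            # Typical result on a filesystem mounted with noexec.
            return False
    finally:
        os.remove(script)


if __name__ == "__main__" and os.path.isdir("/fah"):
    # /fah is the working directory shown in the log above.
    print("can execute from /fah:", can_execute_in("/fah"))
```

A `False` result for `/fah` points at the mount options of the volume rather than at the client or the work units.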