140.163.4.200 FahCore returned: FAILED_3 (255 = 0xff)

JimF
Posts: 652
Joined: Thu Jan 21, 2010 2:03 pm

Post by JimF »

05:52:44:WU01:FS01:Started FahCore on PID 435073
05:52:44:WU01:FS01:Core PID:435077
05:52:44:WU01:FS01:FahCore 0x22 started
05:52:44:WU01:FS01:0x22:*********************** Log Started 2021-04-07T05:52:44Z ***********************
05:52:44:WU01:FS01:0x22:*************************** Core22 Folding@home Core ***************************
05:52:44:WU01:FS01:0x22:       Core: Core22
05:52:44:WU01:FS01:0x22:       Type: 0x22
05:52:44:WU01:FS01:0x22:    Version: 0.0.13
05:52:44:WU01:FS01:0x22:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
05:52:44:WU01:FS01:0x22:  Copyright: 2020 foldingathome.org
05:52:44:WU01:FS01:0x22:   Homepage: https://foldingathome.org/
05:52:44:WU01:FS01:0x22:       Date: Sep 19 2020
05:52:44:WU01:FS01:0x22:       Time: 01:10:35
05:52:44:WU01:FS01:0x22:   Revision: 571cf95de6de2c592c7c3ed48fcfb2e33e9ea7d3
05:52:44:WU01:FS01:0x22:     Branch: core22-0.0.13
05:52:44:WU01:FS01:0x22:   Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
05:52:44:WU01:FS01:0x22:    Options: -std=c++11 -fsigned-char -ffunction-sections -fdata-sections -O3
05:52:44:WU01:FS01:0x22:             -funroll-loops -DOPENMM_GIT_HASH="\"189320d0\""
05:52:44:WU01:FS01:0x22:   Platform: linux2 4.19.76-linuxkit
05:52:44:WU01:FS01:0x22:       Bits: 64
05:52:44:WU01:FS01:0x22:       Mode: Release
05:52:44:WU01:FS01:0x22:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
05:52:44:WU01:FS01:0x22:             <peastman@stanford.edu>
05:52:44:WU01:FS01:0x22:       Args: -dir 01 -suffix 01 -version 706 -lifeline 435073 -checkpoint 15
05:52:44:WU01:FS01:0x22:             -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor
05:52:44:WU01:FS01:0x22:             nvidia -gpu 0 -gpu-usage 100
05:52:44:WU01:FS01:0x22:************************************ libFAH ************************************
05:52:44:WU01:FS01:0x22:       Date: Sep 15 2020
05:52:44:WU01:FS01:0x22:       Time: 05:14:43
05:52:44:WU01:FS01:0x22:   Revision: 44301ed97b996b63fe736bb8073f22209cb2b603
05:52:44:WU01:FS01:0x22:     Branch: HEAD
05:52:44:WU01:FS01:0x22:   Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
05:52:44:WU01:FS01:0x22:    Options: -std=c++11 -fsigned-char -ffunction-sections -fdata-sections -O3
05:52:44:WU01:FS01:0x22:             -funroll-loops
05:52:44:WU01:FS01:0x22:   Platform: linux2 4.19.76-linuxkit
05:52:44:WU01:FS01:0x22:       Bits: 64
05:52:44:WU01:FS01:0x22:       Mode: Release
05:52:44:WU01:FS01:0x22:************************************ CBang *************************************
05:52:44:WU01:FS01:0x22:       Date: Sep 15 2020
05:52:44:WU01:FS01:0x22:       Time: 05:11:04
05:52:44:WU01:FS01:0x22:   Revision: 33fcfc2b3ed2195a423606a264718e31e6b3903f
05:52:44:WU01:FS01:0x22:     Branch: HEAD
05:52:44:WU01:FS01:0x22:   Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
05:52:44:WU01:FS01:0x22:    Options: -std=c++11 -fsigned-char -ffunction-sections -fdata-sections -O3
05:52:44:WU01:FS01:0x22:             -funroll-loops -fPIC
05:52:44:WU01:FS01:0x22:   Platform: linux2 4.19.76-linuxkit
05:52:44:WU01:FS01:0x22:       Bits: 64
05:52:44:WU01:FS01:0x22:       Mode: Release
05:52:44:WU01:FS01:0x22:************************************ System ************************************
05:52:44:WU01:FS01:0x22:        CPU: Intel(R) Core(TM) i9-10900F CPU @ 2.80GHz
05:52:44:WU01:FS01:0x22:     CPU ID: GenuineIntel Family 6 Model 165 Stepping 5
05:52:44:WU01:FS01:0x22:       CPUs: 20
05:52:44:WU01:FS01:0x22:     Memory: 31.27GiB
05:52:44:WU01:FS01:0x22:Free Memory: 23.78GiB
05:52:44:WU01:FS01:0x22:    Threads: POSIX_THREADS
05:52:44:WU01:FS01:0x22: OS Version: 5.8
05:52:44:WU01:FS01:0x22:Has Battery: false
05:52:44:WU01:FS01:0x22: On Battery: false
05:52:44:WU01:FS01:0x22: UTC Offset: -4
05:52:44:WU01:FS01:0x22:        PID: 435077
05:52:44:WU01:FS01:0x22:        CWD: /var/snap/folding-at-home-fcole90/common/work
05:52:44:WU01:FS01:0x22:************************************ OpenMM ************************************
05:52:44:WU01:FS01:0x22:   Revision: 189320d0
05:52:44:WU01:FS01:0x22:********************************************************************************
05:52:44:WU01:FS01:0x22:Project: 17800 (Run 24, Clone 179, Gen 87)
05:52:44:WU01:FS01:0x22:Unit: 0x00000000000000000000000000000000
05:52:44:WU01:FS01:0x22:Reading tar file core.xml
05:52:44:WU01:FS01:0x22:Reading tar file integrator.xml.bz2
05:52:44:WU01:FS01:0x22:Reading tar file state.xml.bz2
05:52:44:WU01:FS01:0x22:Reading tar file system.xml.bz2
05:52:44:WU01:FS01:0x22:Digital signatures verified
05:52:44:WU01:FS01:0x22:Folding@home GPU Core22 Folding@home Core
05:52:44:WU01:FS01:0x22:Version 0.0.13
05:52:44:WU01:FS01:0x22:  Checkpoint write interval: 250000 steps (5%) [20 total]
05:52:44:WU01:FS01:0x22:  JSON viewer frame write interval: 50000 steps (1%) [100 total]
05:52:44:WU01:FS01:0x22:  XTC frame write interval: 25000 steps (0.5%) [200 total]
05:52:44:WU01:FS01:0x22:  Global context and integrator variables write interval: disabled
05:52:44:WU01:FS01:0x22:There are 4 platforms available.
05:52:44:WU01:FS01:0x22:Platform 0: Reference
05:52:44:WU01:FS01:0x22:Platform 1: CPU
05:52:44:WU01:FS01:0x22:Platform 2: OpenCL
05:52:44:WU01:FS01:0x22:  opencl-device 0 specified
05:52:44:WU01:FS01:0x22:Platform 3: CUDA
05:52:44:WU01:FS01:0x22:  cuda-device 0 specified
05:52:45:WU00:FS01:Upload complete
05:52:45:WU00:FS01:Server responded WORK_ACK (400)
05:52:45:WU00:FS01:Final credit estimate, 71297.00 points
05:52:45:WU00:FS01:Cleaning up
05:52:45:WU01:FS01:0x22:Attempting to create CUDA context:
05:52:45:WU01:FS01:0x22:  Configuring platform CUDA
05:52:46:WU01:FS01:0x22:  Using CUDA and gpu 0
05:52:46:WU01:FS01:0x22:Completed 0 out of 5000000 steps (0%)

...............................................................................
08:07:25:WU01:FS01:Connecting to assign1.foldingathome.org:80
08:07:26:WU01:FS01:Assigned to work server 140.163.4.200
08:07:26:WU01:FS01:Requesting new work unit for slot 01: gpu:1:0 GP104 [GeForce GTX 1070] 6463 from 140.163.4.200
08:07:26:WU01:FS01:Connecting to 140.163.4.200:8080
08:07:26:WU01:FS01:Downloading 20.25MiB
08:07:28:WU01:FS01:Download complete
08:07:28:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:17340 run:0 clone:450 gen:96 core:0x22 unit:0x000001c200000060000043bc00000000
08:08:01:WU00:FS01:0x22:Completed 1250000 out of 1250000 steps (100%)
08:08:01:WU00:FS01:0x22:Average performance: 46.9565 ns/day
08:08:01:WU00:FS01:0x22:Checkpoint completed at step 1250000
08:08:03:WU00:FS01:0x22:Saving result file ../logfile_01.txt
08:08:03:WU00:FS01:0x22:Saving result file checkpointIntegrator.xml
08:08:03:WU00:FS01:0x22:Saving result file checkpointState.xml
08:08:03:WU00:FS01:0x22:Saving result file positions.xtc
08:08:04:WU00:FS01:0x22:Saving result file science.log
08:08:04:WU00:FS01:0x22:Folding@home Core Shutdown: FINISHED_UNIT
08:08:04:WU00:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)
08:08:04:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:17737 run:49 clone:3 gen:77 core:0x22 unit:0x000000030000004d0000454900000031
08:08:04:WU00:FS01:Uploading 8.54MiB to 128.174.73.74
08:08:04:WU00:FS01:Connecting to 128.174.73.74:8080
08:08:04:WU01:FS01:Starting
08:08:04:WU01:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 01 -suffix 01 -version 706 -lifeline 995 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
08:08:04:WU01:FS01:Started FahCore on PID 797560
08:08:04:WU01:FS01:FahCore 0x22 started
08:08:04:WARNING:WU01:FS01:FahCore returned: FAILED_3 (255 = 0xff)
08:08:05:WU01:FS01:Starting
08:08:05:WU01:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 01 -suffix 01 -version 706 -lifeline 995 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
08:08:05:WU01:FS01:Started FahCore on PID 797562
08:08:05:WU01:FS01:FahCore 0x22 started
08:08:05:WARNING:WU01:FS01:FahCore returned: FAILED_3 (255 = 0xff)
08:08:10:WU00:FS01:Upload complete
08:08:10:WU00:FS01:Server responded WORK_ACK (400)
08:08:10:WU00:FS01:Final credit estimate, 50552.00 points
08:08:10:WU00:FS01:Cleaning up
08:09:05:WU01:FS01:Starting
08:09:05:WU01:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 01 -suffix 01 -version 706 -lifeline 995 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
08:09:05:WU01:FS01:Started FahCore on PID 800233
08:09:05:WU01:FS01:FahCore 0x22 started
08:09:05:WARNING:WU01:FS01:FahCore returned: FAILED_3 (255 = 0xff)
08:10:05:WU01:FS01:Starting
08:10:05:WU01:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 01 -suffix 01 -version 706 -lifeline 995 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
08:10:05:WU01:FS01:Started FahCore on PID 802906
08:10:05:WU01:FS01:FahCore 0x22 started
08:10:05:WARNING:WU01:FS01:FahCore returned: FAILED_3 (255 = 0xff)
08:11:05:WU01:FS01:Starting
08:11:05:WU01:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 01 -suffix 01 -version 706 -lifeline 995 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
08:11:05:WU01:FS01:Started FahCore on PID 805577
08:11:05:WU01:FS01:FahCore 0x22 started
08:11:05:WARNING:WU01:FS01:FahCore returned: FAILED_3 (255 = 0xff)
08:11:05:WARNING:WU01:FS01:Too many errors, failing
08:11:05:WU01:FS01:Sending unit results: id:01 state:SEND error:FAILED project:17340 run:0 clone:450 gen:96 core:0x22 unit:0x000001c200000060000043bc00000000
08:11:05:WU01:FS01:Connecting to 140.163.4.200:8080
08:11:05:WU01:FS01:Server responded WORK_ACK (400)
08:11:05:WU01:FS01:Cleaning up
08:11:05:WU00:FS01:Connecting to assign1.foldingathome.org:80
08:11:06:WU00:FS01:Assigned to work server 140.163.4.200
08:11:06:WU00:FS01:Requesting new work unit for slot 01: gpu:1:0 GP104 [GeForce GTX 1070] 6463 from 140.163.4.200
08:11:06:WU00:FS01:Connecting to 140.163.4.200:8080
08:11:06:WU00:FS01:Downloading 20.24MiB
08:11:08:WU00:FS01:Download complete
08:11:08:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:17340 run:13 clone:150 gen:96 core:0x22 unit:0x0000009600000060000043bc0000000d
08:11:08:WU00:FS01:Starting
08:11:08:WU00:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 00 -suffix 01 -version 706 -lifeline 995 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
08:11:08:WU00:FS01:Started FahCore on PID 805812
08:11:08:WU00:FS01:FahCore 0x22 started
08:11:09:WARNING:WU00:FS01:FahCore returned: FAILED_3 (255 = 0xff)
08:11:09:WU00:FS01:Starting
08:11:09:WU00:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 00 -suffix 01 -version 706 -lifeline 995 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
08:11:09:WU00:FS01:Started FahCore on PID 805814
08:11:09:WU00:FS01:FahCore 0x22 started
08:11:09:WARNING:WU00:FS01:FahCore returned: FAILED_3 (255 = 0xff)
08:12:09:WU00:FS01:Starting
08:12:09:WU00:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 00 -suffix 01 -version 706 -lifeline 995 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
08:12:09:WU00:FS01:Started FahCore on PID 808485
08:12:09:WU00:FS01:FahCore 0x22 started
08:12:09:WARNING:WU00:FS01:FahCore returned: FAILED_3 (255 = 0xff)
08:13:09:WU00:FS01:Starting
08:13:09:WU00:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 00 -suffix 01 -version 706 -lifeline 995 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
08:13:09:WU00:FS01:Started FahCore on PID 811156
08:13:09:WU00:FS01:FahCore 0x22 started
08:13:09:WARNING:WU00:FS01:FahCore returned: FAILED_3 (255 = 0xff)
08:14:09:WU00:FS01:Starting
08:14:09:WU00:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 00 -suffix 01 -version 706 -lifeline 995 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
08:14:09:WU00:FS01:Started FahCore on PID 813827
08:14:09:WU00:FS01:FahCore 0x22 started
08:14:09:WARNING:WU00:FS01:FahCore returned: FAILED_3 (255 = 0xff)
08:14:09:WARNING:WU00:FS01:Too many errors, failing
08:14:09:WU00:FS01:Sending unit results: id:00 state:SEND error:FAILED project:17340 run:13 clone:150 gen:96 core:0x22 unit:0x0000009600000060000043bc0000000d
08:14:09:WU00:FS01:Connecting to 140.163.4.200:8080
08:14:09:WU00:FS01:Server responded WORK_ACK (400)
08:14:09:WU00:FS01:Cleaning up
After a reboot, it is working OK now.
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: 140.163.4.200 FahCore returned: FAILED_3 (255 = 0xff)

Post by bruce »

This log makes no sense. You've somehow gotten two FAHCores working on two WUs in the same slot.

140.163.4.200 is the IP address of a work server from which a new WU can be downloaded. The WU itself is assigned to an empty slot and a FAHCore is started; the WU specifies which FAHCore to start. If something is wrong, the FAHCore detects the processing error and issues the failure message. The server at that IP address doesn't detect processing errors.
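To see which WU and slot each failure message belongs to, without reading the whole log by eye, something like the following could help. It is only a rough sketch: the regular expression is written against the "FahCore returned" lines shown in this thread, and the function name `tally_returns` is made up for illustration, not part of any FAH tool.

```python
import re
from collections import Counter

# Matches client lines such as:
#   08:08:04:WARNING:WU01:FS01:FahCore returned: FAILED_3 (255 = 0xff)
#   08:08:04:WU00:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)
# The pattern is an assumption based only on the lines quoted in this thread.
RETURN_RE = re.compile(
    r"^\d{2}:\d{2}:\d{2}:(?:WARNING:)?(WU\d+):(FS\d+):"
    r"FahCore returned: (\w+) \((\d+) = 0x[0-9a-f]+\)"
)


def tally_returns(lines):
    """Count FahCore return statuses per (WU, slot, status)."""
    counts = Counter()
    for line in lines:
        match = RETURN_RE.match(line.strip())
        if match:
            wu, slot, status, _code = match.groups()
            counts[(wu, slot, status)] += 1
    return counts


# A few lines copied from the log above.
sample = [
    "08:08:04:WU00:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)",
    "08:08:04:WARNING:WU01:FS01:FahCore returned: FAILED_3 (255 = 0xff)",
    "08:08:05:WARNING:WU01:FS01:FahCore returned: FAILED_3 (255 = 0xff)",
    "08:11:05:WU01:FS01:Connecting to 140.163.4.200:8080",
]

for key, count in sorted(tally_returns(sample).items()):
    print(key, count)
```

The tally is keyed by WU and slot rather than by server address, since the return status comes from the FAHCore and not from the work server.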

05:52:46:WU01:FS01:0x22:Completed 0 out of 5000000 steps (0%)

...............................................................................
08:07:25:WU01:FS01:Connecting to assign1.foldingathome.org:80
This suggests that WU01 in slot 01 was being processed by FAHCore_22 and, while that was still going on, FAHClient decided that FS01 was empty and downloaded a new WU into that slot. Somehow you started a second copy of FAHClient without the first one ending.
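The same check can be run mechanically over a full log. The sketch below flags any request to the assignment server logged for a WU/slot pair that last reported step progress and has not since logged a "FahCore returned" or "Cleaning up" line. The line format is assumed from the excerpts in this thread, and `find_overlaps` is a hypothetical helper name. Lines left out of a pasted log (such as the row of dots above) can hide the return message, so a hit is only a reason to look at the complete log.

```python
import re

# Splits a client log line into timestamp, WU id, slot id and message.
# The format is an assumption based on the lines quoted in this thread.
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}):(?:WARNING:)?(WU\d+):(FS\d+):(.*)$")


def find_overlaps(lines):
    """Return (timestamp, wu, slot) for each assignment request that is
    logged while the same WU/slot pair still looks busy."""
    busy = set()      # (wu, slot) pairs that last reported step progress
    flagged = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        timestamp, wu, slot, message = match.groups()
        key = (wu, slot)
        if "Completed " in message and " steps" in message:
            busy.add(key)
        elif message.startswith("FahCore returned:") or message == "Cleaning up":
            busy.discard(key)
        elif message.startswith("Connecting to assign") and key in busy:
            flagged.append((timestamp, wu, slot))
    return flagged


# The two lines quoted above, with nothing in between.
quoted = [
    "05:52:46:WU01:FS01:0x22:Completed 0 out of 5000000 steps (0%)",
    "08:07:25:WU01:FS01:Connecting to assign1.foldingathome.org:80",
]
print(find_overlaps(quoted))
```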