WU 13414 hangs forever on RX5700XT
Posted: Wed Jul 01, 2020 1:56 pm
Every project 13414 work unit I've received has locked up my RX5700XT card. In the log below, for example, the core logs nothing between 00:25:05 and 13:12:12, almost 13 hours. The only way I can get things working again is to dump the WU manually.
00:25:00:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:13414 run:3534 clone:50 gen:1 core:0x22 unit:0x0000000512bc7d9a5ef50d675ea5a9cf
00:25:04:WU00:FS01:FahCore 0x22 started
00:25:04:WU00:FS01:0x22:*********************** Log Started 2020-06-29T00:25:04Z ***********************
00:25:04:WU00:FS01:0x22:*************************** Core22 Folding@home Core ***************************
00:25:04:WU00:FS01:0x22: Core: Core22
00:25:04:WU00:FS01:0x22: Type: 0x22
00:25:04:WU00:FS01:0x22: Version: 0.0.10
00:25:04:WU00:FS01:0x22: Author: Joseph Coffland <joseph@cauldrondevelopment.com>
00:25:04:WU00:FS01:0x22: Copyright: 2020 foldingathome.org
00:25:04:WU00:FS01:0x22: Homepage: https://foldingathome.org/
00:25:04:WU00:FS01:0x22: Date: Jun 16 2020
00:25:04:WU00:FS01:0x22: Time: 15:55:31
00:25:04:WU00:FS01:0x22: Revision: 147051aad40bcbec7d4b25105bbedfab425f1dc2
00:25:04:WU00:FS01:0x22: Branch: core22-0.0.10
00:25:04:WU00:FS01:0x22: Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
00:25:04:WU00:FS01:0x22: Options: -std=c++11 -ffunction-sections -fdata-sections -O3 -funroll-loops
00:25:04:WU00:FS01:0x22: Platform: linux2 4.19.76-linuxkit
00:25:04:WU00:FS01:0x22: Bits: 64
00:25:04:WU00:FS01:0x22: Mode: Release
00:25:04:WU00:FS01:0x22:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
00:25:04:WU00:FS01:0x22: <peastman@stanford.edu>
00:25:04:WU00:FS01:0x22: Args: -dir 00 -suffix 01 -version 706 -lifeline 25434 -checkpoint 15
00:25:04:WU00:FS01:0x22: -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
00:25:04:WU00:FS01:0x22:************************************ libFAH ************************************
00:25:04:WU00:FS01:0x22: Date: Jun 2 2020
00:25:04:WU00:FS01:0x22: Time: 00:07:31
00:25:04:WU00:FS01:0x22: Revision: 2b383f4f04f38511dff592885d7c0400e72bdf43
00:25:04:WU00:FS01:0x22: Branch: HEAD
00:25:04:WU00:FS01:0x22: Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
00:25:04:WU00:FS01:0x22: Options: -std=c++11 -ffunction-sections -fdata-sections -O3 -funroll-loops
00:25:04:WU00:FS01:0x22: Platform: linux2 4.19.76-linuxkit
00:25:04:WU00:FS01:0x22: Bits: 64
00:25:04:WU00:FS01:0x22: Mode: Release
00:25:04:WU00:FS01:0x22:************************************ CBang *************************************
00:25:04:WU00:FS01:0x22: Date: May 31 2020
00:25:04:WU00:FS01:0x22: Time: 20:16:34
00:25:04:WU00:FS01:0x22: Revision: 75fcee0b8e713cb47f5191a3689d5f4f07244c7f
00:25:04:WU00:FS01:0x22: Branch: HEAD
00:25:04:WU00:FS01:0x22: Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
00:25:04:WU00:FS01:0x22: Options: -std=c++11 -ffunction-sections -fdata-sections -O3 -funroll-loops
00:25:04:WU00:FS01:0x22: -fPIC
00:25:04:WU00:FS01:0x22: Platform: linux2 4.19.76-linuxkit
00:25:04:WU00:FS01:0x22: Bits: 64
00:25:04:WU00:FS01:0x22: Mode: Release
00:25:04:WU00:FS01:0x22:************************************ System ************************************
00:25:04:WU00:FS01:0x22: CPU: AMD Ryzen 9 3900X 12-Core Processor
00:25:04:WU00:FS01:0x22: CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
00:25:04:WU00:FS01:0x22: CPUs: 24
00:25:04:WU00:FS01:0x22: Memory: 31.37GiB
00:25:04:WU00:FS01:0x22:Free Memory: 24.47GiB
00:25:04:WU00:FS01:0x22: Threads: POSIX_THREADS
00:25:04:WU00:FS01:0x22: OS Version: 5.4
00:25:04:WU00:FS01:0x22:Has Battery: false
00:25:04:WU00:FS01:0x22: On Battery: false
00:25:04:WU00:FS01:0x22: UTC Offset: -4
00:25:04:WU00:FS01:0x22: PID: 25438
00:25:04:WU00:FS01:0x22: CWD: /var/lib/fahclient/work
00:25:04:WU00:FS01:0x22:********************************************************************************
00:25:04:WU00:FS01:0x22:Project: 13414 (Run 3534, Clone 50, Gen 1)
00:25:04:WU00:FS01:0x22:Unit: 0x0000000512bc7d9a5ef50d675ea5a9cf
00:25:04:WU00:FS01:0x22:Reading tar file core.xml
00:25:04:WU00:FS01:0x22:Reading tar file integrator.xml
00:25:04:WU00:FS01:0x22:Reading tar file state.xml
00:25:04:WU00:FS01:0x22:Reading tar file system.xml
00:25:04:WU00:FS01:0x22:Digital signatures verified
00:25:04:WU00:FS01:0x22:Folding@home GPU Core22 Folding@home Core
00:25:04:WU00:FS01:0x22:Version 0.0.10
00:25:05:WU00:FS01:0x22: Checkpoint write interval: 50000 steps (5%) [20 total]
00:25:05:WU00:FS01:0x22: JSON viewer frame write interval: 10000 steps (1%) [100 total]
00:25:05:WU00:FS01:0x22: XTC frame write interval: 250000 steps (25%) [4 total]
00:25:05:WU00:FS01:0x22: Global context and integrator variables write interval: 250 steps (0.025%) [4000 total]
13:12:12:WU00:FS01:0x22:Caught signal SIGINT(2) on PID 25438
13:12:12:WU00:FS01:0x22:Exiting, please wait. . .
13:13:14:WU00:Sending unit results: id:00 state:SEND error:DUMPED project:13414 run:3534 clone:50 gen:1 core:0x22 unit:0x0000000512bc7d9a5ef50d675ea5a9cf
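In case it helps anyone else spot the same kind of hang, here is a minimal sketch of how a silent gap like the one above could be found in a log automatically. It assumes every log line starts with an HH:MM:SS timestamp, as in the excerpt, and that at most one midnight rollover occurs between two consecutive lines. The function name `find_stalls` and the 30-minute threshold are my own choices for illustration, not anything provided by FAHClient.

```python
import re
from datetime import timedelta

# Log lines in the excerpt start with HH:MM:SS followed by a colon.
TIMESTAMP = re.compile(r"^(\d{2}):(\d{2}):(\d{2}):")


def find_stalls(lines, threshold=timedelta(minutes=30)):
    """Return (previous_line, next_line, gap) for each silent gap >= threshold.

    Hypothetical helper: the threshold is arbitrary, and lines without a
    leading timestamp are skipped.
    """
    stalls = []
    prev_time = None
    prev_line = None
    for raw in lines:
        line = raw.strip()
        match = TIMESTAMP.match(line)
        if not match:
            continue
        hours, minutes, seconds = map(int, match.groups())
        now = timedelta(hours=hours, minutes=minutes, seconds=seconds)
        if prev_time is not None:
            gap = now - prev_time
            if gap < timedelta(0):
                # Assumption: the clock wrapped past midnight exactly once.
                gap += timedelta(days=1)
            if gap >= threshold:
                stalls.append((prev_line, line, gap))
        prev_time, prev_line = now, line
    return stalls


# Three lines copied from the log above, around the point where it went quiet.
sample = [
    "00:25:05:WU00:FS01:0x22: Global context and integrator variables write interval: 250 steps (0.025%) [4000 total]",
    "13:12:12:WU00:FS01:0x22:Caught signal SIGINT(2) on PID 25438",
    "13:12:12:WU00:FS01:0x22:Exiting, please wait. . .",
]

for before, after, gap in find_stalls(sample):
    print(f"no output for {gap} after: {before[:40]}...")
```

On a full log, the same function could be fed the file's lines directly, for example `find_stalls(open(path))`, where `path` is wherever your client writes its log. A long gap is only a hint: a healthy WU would normally be expected to log progress regularly, but the right threshold depends on the project and the card.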