BAD_WORK_UNIT on p14464, p14456, p14465

Postby uyaem » Fri Jul 03, 2020 7:33 pm

Overnight, I downloaded three WUs, and all of them aborted immediately. In each case, they were processed by OPENMM_22 version 0.0.10 (not the very latest, 0.0.11).
I was asked to provide this info on the forums.

Project 14464 (Run 0, Clone 502, Gen 49)
Project 14456 (Run 0, Clone 708, Gen 32)
Project 14465 (Run 0, Clone 1770, Gen 14)

The error was the same each time:
ERROR:exception: Called setPositions() on a Context with the wrong number of positions

The hardware is a four-week-old GTX 1660 Super.

Here are the logs for reference:

07:36:11:WU02:FS01:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:14464 run:0 clone:502 gen:49 core:0x22 unit:0x0000004f03854c135eb98532e1975fa2
07:36:11:WU02:FS01:Starting
07:36:11:WU02:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\X\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/Core_22.fah/FahCore_22.exe -dir 02 -suffix 01 -version 706 -lifeline 11704 -checkpoint 15 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
07:36:11:WU02:FS01:Started FahCore on PID 9700
07:36:11:WU02:FS01:Core PID:13420
07:36:11:WU02:FS01:FahCore 0x22 started
07:36:11:WU02:FS01:0x22:*********************** Log Started 2020-07-01T07:36:11Z ***********************
07:36:11:WU02:FS01:0x22:*************************** Core22 Folding@home Core ***************************
07:36:11:WU02:FS01:0x22:       Core: Core22
07:36:11:WU02:FS01:0x22:       Type: 0x22
07:36:11:WU02:FS01:0x22:    Version: 0.0.10
07:36:11:WU02:FS01:0x22:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
07:36:11:WU02:FS01:0x22:  Copyright: 2020 foldingathome.org
07:36:11:WU02:FS01:0x22:   Homepage: https://foldingathome.org/
07:36:11:WU02:FS01:0x22:       Date: Jun 16 2020
07:36:11:WU02:FS01:0x22:       Time: 14:33:22
07:36:11:WU02:FS01:0x22:   Revision: 147051aad40bcbec7d4b25105bbedfab425f1dc2
07:36:11:WU02:FS01:0x22:     Branch: core22-0.0.10
07:36:11:WU02:FS01:0x22:   Compiler: Visual C++ 2015
07:36:11:WU02:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
07:36:11:WU02:FS01:0x22:   Platform: win32 10
07:36:11:WU02:FS01:0x22:       Bits: 64
07:36:11:WU02:FS01:0x22:       Mode: Release
07:36:11:WU02:FS01:0x22:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
07:36:11:WU02:FS01:0x22:             <peastman@stanford.edu>
07:36:11:WU02:FS01:0x22:       Args: -dir 02 -suffix 01 -version 706 -lifeline 9700 -checkpoint 15
07:36:11:WU02:FS01:0x22:             -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device
07:36:11:WU02:FS01:0x22:             0 -gpu 0
07:36:11:WU02:FS01:0x22:************************************ libFAH ************************************
07:36:11:WU02:FS01:0x22:       Date: Jun 15 2020
07:36:11:WU02:FS01:0x22:       Time: 18:05:04
07:36:11:WU02:FS01:0x22:   Revision: 2b383f4f04f38511dff592885d7c0400e72bdf43
07:36:11:WU02:FS01:0x22:     Branch: HEAD
07:36:11:WU02:FS01:0x22:   Compiler: Visual C++ 2015
07:36:11:WU02:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
07:36:11:WU02:FS01:0x22:   Platform: win32 10
07:36:11:WU02:FS01:0x22:       Bits: 64
07:36:11:WU02:FS01:0x22:       Mode: Release
07:36:11:WU02:FS01:0x22:************************************ CBang *************************************
07:36:11:WU02:FS01:0x22:       Date: Jun 16 2020
07:36:11:WU02:FS01:0x22:       Time: 14:31:33
07:36:11:WU02:FS01:0x22:   Revision: 75fcee0b8e713cb47f5191a3689d5f4f07244c7f
07:36:11:WU02:FS01:0x22:     Branch: HEAD
07:36:11:WU02:FS01:0x22:   Compiler: Visual C++ 2015
07:36:11:WU02:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
07:36:11:WU02:FS01:0x22:   Platform: win32 10
07:36:11:WU02:FS01:0x22:       Bits: 64
07:36:11:WU02:FS01:0x22:       Mode: Release
07:36:11:WU02:FS01:0x22:************************************ System ************************************
07:36:11:WU02:FS01:0x22:        CPU: AMD Ryzen 9 3900X 12-Core Processor
07:36:11:WU02:FS01:0x22:     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
07:36:11:WU02:FS01:0x22:       CPUs: 24
07:36:11:WU02:FS01:0x22:     Memory: 31.95GiB
07:36:11:WU02:FS01:0x22:Free Memory: 24.47GiB
07:36:11:WU02:FS01:0x22:    Threads: WINDOWS_THREADS
07:36:11:WU02:FS01:0x22: OS Version: 6.2
07:36:11:WU02:FS01:0x22:Has Battery: false
07:36:11:WU02:FS01:0x22: On Battery: false
07:36:11:WU02:FS01:0x22: UTC Offset: 2
07:36:11:WU02:FS01:0x22:        PID: 13420
07:36:11:WU02:FS01:0x22:        CWD: C:\Users\X\AppData\Roaming\FAHClient\work
07:36:11:WU02:FS01:0x22:********************************************************************************
07:36:11:WU02:FS01:0x22:Project: 14464 (Run 0, Clone 502, Gen 49)
07:36:11:WU02:FS01:0x22:Unit: 0x0000004f03854c135eb98532e1975fa2
07:36:11:WU02:FS01:0x22:Reading tar file core.xml
07:36:11:WU02:FS01:0x22:Reading tar file integrator.xml
07:36:11:WU02:FS01:0x22:Reading tar file state.xml
07:36:12:WU02:FS01:0x22:Reading tar file system.xml
07:36:13:WU02:FS01:0x22:Digital signatures verified
07:36:13:WU02:FS01:0x22:Folding@home GPU Core22 Folding@home Core
07:36:13:WU02:FS01:0x22:Version 0.0.10
07:36:13:WU02:FS01:0x22:  Checkpoint write interval: 100000 steps (5%) [20 total]
07:36:13:WU02:FS01:0x22:  JSON viewer frame write interval: 20000 steps (1%) [100 total]
07:36:13:WU02:FS01:0x22:  XTC frame write interval: 20000 steps (1%) [100 total]
07:36:13:WU02:FS01:0x22:  Global context and integrator variables write interval: disabled
07:36:31:WU02:FS01:0x22:ERROR:exception: Called setPositions() on a Context with the wrong number of positions
07:36:31:WU02:FS01:0x22:Saving result file ..\logfile_01.txt
07:36:31:WU02:FS01:0x22:Saving result file science.log
07:36:31:WU02:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
07:36:32:WARNING:WU02:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
07:36:32:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:14464 run:0 clone:502 gen:49 core:0x22 unit:0x0000004f03854c135eb98532e1975fa2


15:01:39:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:14456 run:0 clone:708 gen:32 core:0x22 unit:0x0000003b03854c135eb39a2f311d0c95
15:01:39:WU00:FS01:Starting
15:01:39:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\X\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/Core_22.fah/FahCore_22.exe -dir 00 -suffix 01 -version 706 -lifeline 11704 -checkpoint 15 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
15:01:39:WU00:FS01:Started FahCore on PID 15924
15:01:39:WU00:FS01:Core PID:10188
15:01:39:WU00:FS01:FahCore 0x22 started
15:01:40:WU00:FS01:0x22:*********************** Log Started 2020-07-01T15:01:39Z ***********************
15:01:40:WU00:FS01:0x22:*************************** Core22 Folding@home Core ***************************
15:01:40:WU00:FS01:0x22:       Core: Core22
15:01:40:WU00:FS01:0x22:       Type: 0x22
15:01:40:WU00:FS01:0x22:    Version: 0.0.10
15:01:40:WU00:FS01:0x22:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
15:01:40:WU00:FS01:0x22:  Copyright: 2020 foldingathome.org
15:01:40:WU00:FS01:0x22:   Homepage: https://foldingathome.org/
15:01:40:WU00:FS01:0x22:       Date: Jun 16 2020
15:01:40:WU00:FS01:0x22:       Time: 14:33:22
15:01:40:WU00:FS01:0x22:   Revision: 147051aad40bcbec7d4b25105bbedfab425f1dc2
15:01:40:WU00:FS01:0x22:     Branch: core22-0.0.10
15:01:40:WU00:FS01:0x22:   Compiler: Visual C++ 2015
15:01:40:WU00:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
15:01:40:WU00:FS01:0x22:   Platform: win32 10
15:01:40:WU00:FS01:0x22:       Bits: 64
15:01:40:WU00:FS01:0x22:       Mode: Release
15:01:40:WU00:FS01:0x22:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
15:01:40:WU00:FS01:0x22:             <peastman@stanford.edu>
15:01:40:WU00:FS01:0x22:       Args: -dir 00 -suffix 01 -version 706 -lifeline 15924 -checkpoint 15
15:01:40:WU00:FS01:0x22:             -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device
15:01:40:WU00:FS01:0x22:             0 -gpu 0
15:01:40:WU00:FS01:0x22:************************************ libFAH ************************************
15:01:40:WU00:FS01:0x22:       Date: Jun 15 2020
15:01:40:WU00:FS01:0x22:       Time: 18:05:04
15:01:40:WU00:FS01:0x22:   Revision: 2b383f4f04f38511dff592885d7c0400e72bdf43
15:01:40:WU00:FS01:0x22:     Branch: HEAD
15:01:40:WU00:FS01:0x22:   Compiler: Visual C++ 2015
15:01:40:WU00:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
15:01:40:WU00:FS01:0x22:   Platform: win32 10
15:01:40:WU00:FS01:0x22:       Bits: 64
15:01:40:WU00:FS01:0x22:       Mode: Release
15:01:40:WU00:FS01:0x22:************************************ CBang *************************************
15:01:40:WU00:FS01:0x22:       Date: Jun 16 2020
15:01:40:WU00:FS01:0x22:       Time: 14:31:33
15:01:40:WU00:FS01:0x22:   Revision: 75fcee0b8e713cb47f5191a3689d5f4f07244c7f
15:01:40:WU00:FS01:0x22:     Branch: HEAD
15:01:40:WU00:FS01:0x22:   Compiler: Visual C++ 2015
15:01:40:WU00:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
15:01:40:WU00:FS01:0x22:   Platform: win32 10
15:01:40:WU00:FS01:0x22:       Bits: 64
15:01:40:WU00:FS01:0x22:       Mode: Release
15:01:40:WU00:FS01:0x22:************************************ System ************************************
15:01:40:WU00:FS01:0x22:        CPU: AMD Ryzen 9 3900X 12-Core Processor
15:01:40:WU00:FS01:0x22:     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
15:01:40:WU00:FS01:0x22:       CPUs: 24
15:01:40:WU00:FS01:0x22:     Memory: 31.95GiB
15:01:40:WU00:FS01:0x22:Free Memory: 22.96GiB
15:01:40:WU00:FS01:0x22:    Threads: WINDOWS_THREADS
15:01:40:WU00:FS01:0x22: OS Version: 6.2
15:01:40:WU00:FS01:0x22:Has Battery: false
15:01:40:WU00:FS01:0x22: On Battery: false
15:01:40:WU00:FS01:0x22: UTC Offset: 2
15:01:40:WU00:FS01:0x22:        PID: 10188
15:01:40:WU00:FS01:0x22:        CWD: C:\Users\X\AppData\Roaming\FAHClient\work
15:01:40:WU00:FS01:0x22:********************************************************************************
15:01:40:WU00:FS01:0x22:Project: 14456 (Run 0, Clone 708, Gen 32)
15:01:40:WU00:FS01:0x22:Unit: 0x0000003b03854c135eb39a2f311d0c95
15:01:40:WU00:FS01:0x22:Reading tar file core.xml
15:01:40:WU00:FS01:0x22:Reading tar file integrator.xml
15:01:40:WU00:FS01:0x22:Reading tar file state.xml
15:01:41:WU00:FS01:0x22:Reading tar file system.xml
15:01:42:WU00:FS01:0x22:Digital signatures verified
15:01:42:WU00:FS01:0x22:Folding@home GPU Core22 Folding@home Core
15:01:42:WU00:FS01:0x22:Version 0.0.10
15:01:43:WU00:FS01:0x22:  Checkpoint write interval: 100000 steps (5%) [20 total]
15:01:43:WU00:FS01:0x22:  JSON viewer frame write interval: 20000 steps (1%) [100 total]
15:01:43:WU00:FS01:0x22:  XTC frame write interval: 20000 steps (1%) [100 total]
15:01:43:WU00:FS01:0x22:  Global context and integrator variables write interval: disabled
15:02:01:WU00:FS01:0x22:ERROR:exception: Called setPositions() on a Context with the wrong number of positions
15:02:01:WU00:FS01:0x22:Saving result file ..\logfile_01.txt
15:02:01:WU00:FS01:0x22:Saving result file science.log
15:02:01:WU00:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
15:02:01:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
15:02:01:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:14456 run:0 clone:708 gen:32 core:0x22 unit:0x0000003b03854c135eb39a2f311d0c95


17:33:04:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:14465 run:0 clone:1770 gen:14 core:0x22 unit:0x0000001403854c135eb98531a758b58c
17:33:04:WU01:FS01:Starting
17:33:04:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\X\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/Core_22.fah/FahCore_22.exe -dir 01 -suffix 01 -version 706 -lifeline 11704 -checkpoint 15 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
17:33:04:WU01:FS01:Started FahCore on PID 10144
17:33:04:WU01:FS01:Core PID:7956
17:33:04:WU01:FS01:FahCore 0x22 started
17:33:04:WU01:FS01:0x22:*********************** Log Started 2020-07-01T17:33:04Z ***********************
17:33:04:WU01:FS01:0x22:*************************** Core22 Folding@home Core ***************************
17:33:04:WU01:FS01:0x22:       Core: Core22
17:33:04:WU01:FS01:0x22:       Type: 0x22
17:33:04:WU01:FS01:0x22:    Version: 0.0.10
17:33:04:WU01:FS01:0x22:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
17:33:04:WU01:FS01:0x22:  Copyright: 2020 foldingathome.org
17:33:04:WU01:FS01:0x22:   Homepage: https://foldingathome.org/
17:33:04:WU01:FS01:0x22:       Date: Jun 16 2020
17:33:04:WU01:FS01:0x22:       Time: 14:33:22
17:33:04:WU01:FS01:0x22:   Revision: 147051aad40bcbec7d4b25105bbedfab425f1dc2
17:33:04:WU01:FS01:0x22:     Branch: core22-0.0.10
17:33:04:WU01:FS01:0x22:   Compiler: Visual C++ 2015
17:33:04:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
17:33:04:WU01:FS01:0x22:   Platform: win32 10
17:33:04:WU01:FS01:0x22:       Bits: 64
17:33:04:WU01:FS01:0x22:       Mode: Release
17:33:04:WU01:FS01:0x22:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
17:33:04:WU01:FS01:0x22:             <peastman@stanford.edu>
17:33:04:WU01:FS01:0x22:       Args: -dir 01 -suffix 01 -version 706 -lifeline 10144 -checkpoint 15
17:33:04:WU01:FS01:0x22:             -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device
17:33:04:WU01:FS01:0x22:             0 -gpu 0
17:33:04:WU01:FS01:0x22:************************************ libFAH ************************************
17:33:04:WU01:FS01:0x22:       Date: Jun 15 2020
17:33:04:WU01:FS01:0x22:       Time: 18:05:04
17:33:04:WU01:FS01:0x22:   Revision: 2b383f4f04f38511dff592885d7c0400e72bdf43
17:33:04:WU01:FS01:0x22:     Branch: HEAD
17:33:04:WU01:FS01:0x22:   Compiler: Visual C++ 2015
17:33:04:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
17:33:04:WU01:FS01:0x22:   Platform: win32 10
17:33:04:WU01:FS01:0x22:       Bits: 64
17:33:04:WU01:FS01:0x22:       Mode: Release
17:33:04:WU01:FS01:0x22:************************************ CBang *************************************
17:33:04:WU01:FS01:0x22:       Date: Jun 16 2020
17:33:04:WU01:FS01:0x22:       Time: 14:31:33
17:33:04:WU01:FS01:0x22:   Revision: 75fcee0b8e713cb47f5191a3689d5f4f07244c7f
17:33:04:WU01:FS01:0x22:     Branch: HEAD
17:33:04:WU01:FS01:0x22:   Compiler: Visual C++ 2015
17:33:04:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
17:33:04:WU01:FS01:0x22:   Platform: win32 10
17:33:04:WU01:FS01:0x22:       Bits: 64
17:33:04:WU01:FS01:0x22:       Mode: Release
17:33:04:WU01:FS01:0x22:************************************ System ************************************
17:33:04:WU01:FS01:0x22:        CPU: AMD Ryzen 9 3900X 12-Core Processor
17:33:04:WU01:FS01:0x22:     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
17:33:04:WU01:FS01:0x22:       CPUs: 24
17:33:04:WU01:FS01:0x22:     Memory: 31.95GiB
17:33:04:WU01:FS01:0x22:Free Memory: 22.50GiB
17:33:04:WU01:FS01:0x22:    Threads: WINDOWS_THREADS
17:33:04:WU01:FS01:0x22: OS Version: 6.2
17:33:04:WU01:FS01:0x22:Has Battery: false
17:33:04:WU01:FS01:0x22: On Battery: false
17:33:04:WU01:FS01:0x22: UTC Offset: 2
17:33:04:WU01:FS01:0x22:        PID: 7956
17:33:04:WU01:FS01:0x22:        CWD: C:\Users\X\AppData\Roaming\FAHClient\work
17:33:04:WU01:FS01:0x22:********************************************************************************
17:33:04:WU01:FS01:0x22:Project: 14465 (Run 0, Clone 1770, Gen 14)
17:33:04:WU01:FS01:0x22:Unit: 0x0000001403854c135eb98531a758b58c
17:33:04:WU01:FS01:0x22:Reading tar file core.xml
17:33:04:WU01:FS01:0x22:Reading tar file integrator.xml
17:33:04:WU01:FS01:0x22:Reading tar file state.xml
17:33:05:WU01:FS01:0x22:Reading tar file system.xml
17:33:07:WU01:FS01:0x22:Digital signatures verified
17:33:07:WU01:FS01:0x22:Folding@home GPU Core22 Folding@home Core
17:33:07:WU01:FS01:0x22:Version 0.0.10
17:33:07:WU01:FS01:0x22:  Checkpoint write interval: 100000 steps (5%) [20 total]
17:33:07:WU01:FS01:0x22:  JSON viewer frame write interval: 20000 steps (1%) [100 total]
17:33:07:WU01:FS01:0x22:  XTC frame write interval: 20000 steps (1%) [100 total]
17:33:07:WU01:FS01:0x22:  Global context and integrator variables write interval: disabled
17:33:24:WU01:FS01:0x22:ERROR:exception: Called setPositions() on a Context with the wrong number of positions
17:33:24:WU01:FS01:0x22:Saving result file ..\logfile_01.txt
17:33:24:WU01:FS01:0x22:Saving result file science.log
17:33:24:WU01:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
17:33:25:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
17:33:25:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:14465 run:0 clone:1770 gen:14 core:0x22 unit:0x0000001403854c135eb98531a758b58c
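
When several WUs fail the same way, it can help to boil a long log down to one line per failed unit before posting. The following is only a rough sketch: the regular expressions are modelled on the log lines quoted above and may not match other client or core versions, and `summarize_failures` is a made-up helper name, not part of FAHClient.

```python
import re

# Patterns modelled on the log excerpts in this thread (assumption: other
# client/core versions may format these lines differently).
PROJECT_RE = re.compile(
    r":(WU\d+):FS\d+:0x22:Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)"
)
ERROR_RE = re.compile(r":(WU\d+):FS\d+:0x22:ERROR:(.*)$")
RETURN_RE = re.compile(
    r":WARNING:(WU\d+):FS\d+:FahCore returned: (\w+) \((\d+) = (0x[0-9a-fA-F]+)\)"
)


def summarize_failures(log_text):
    """Return one dict per WU whose core return was logged as a WARNING."""
    units = {}     # WU slot id (e.g. "WU02") -> info on the unit in that slot
    failures = []
    for line in log_text.splitlines():
        m = PROJECT_RE.search(line)
        if m:
            wu, project, run, clone, gen = m.groups()
            units[wu] = {
                "project": int(project),
                "run": int(run),
                "clone": int(clone),
                "gen": int(gen),
                "error": None,
            }
            continue
        m = ERROR_RE.search(line)
        if m and m.group(1) in units:
            units[m.group(1)]["error"] = m.group(2).strip()
            continue
        m = RETURN_RE.search(line)
        if m and m.group(1) in units:
            info = units.pop(m.group(1))
            info["status"] = m.group(2)
            info["code"] = int(m.group(3))
            failures.append(info)
    return failures


# Three lines copied from the first log above.
SAMPLE = """\
07:36:11:WU02:FS01:0x22:Project: 14464 (Run 0, Clone 502, Gen 49)
07:36:31:WU02:FS01:0x22:ERROR:exception: Called setPositions() on a Context with the wrong number of positions
07:36:32:WARNING:WU02:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
"""

for failure in summarize_failures(SAMPLE):
    print(failure)
```

Keying on the WU slot id (WU00, WU01, WU02) keeps interleaved log lines from different units apart.
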
CPU: Ryzen 9 3900X (1x21 CPUs) ~ GPU: nVidia GeForce GTX 1660 Super (Asus)

Re: BAD_WORK_UNIT on p14464, p14456, p14465

Postby JohnChodera » Fri Jul 03, 2020 9:17 pm

Hi folks!

We have a theory of what happened here: an incorrect file path to system.xml was specified, likely due to a transiently duplicated project.xml that didn't have an updated project-id, and that caused the wrong state.xml to be packaged in WUs for a while.
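
A mismatch like this can in principle be spotted before a simulation starts, by comparing the number of particles in system.xml with the number of positions in state.xml. The sketch below uses only the Python standard library and assumes the serialized files contain a `<Particles>` list of `<Particle>` elements and a `<Positions>` list of `<Position>` elements directly under the root; that layout is an assumption about OpenMM's XML serialization, and the helper names are hypothetical. It is not how the core or the work server performs the check.

```python
import xml.etree.ElementTree as ET


def count_particles(system_xml):
    """Count <Particle> entries under <Particles> (assumed System layout)."""
    root = ET.fromstring(system_xml)
    return len(root.findall("./Particles/Particle"))


def count_positions(state_xml):
    """Count <Position> entries under <Positions> (assumed State layout)."""
    root = ET.fromstring(state_xml)
    return len(root.findall("./Positions/Position"))


def check_work_unit(system_xml, state_xml):
    """Return (consistent, n_particles, n_positions) for a system/state pair."""
    n_particles = count_particles(system_xml)
    n_positions = count_positions(state_xml)
    return n_particles == n_positions, n_particles, n_positions


# Toy inputs for illustration only; real files hold many thousands of entries.
TOY_SYSTEM = (
    "<System><Particles>"
    '<Particle mass="1.0"/><Particle mass="1.0"/><Particle mass="16.0"/>'
    "</Particles></System>"
)
TOY_STATE = (
    "<State><Positions>"
    '<Position x="0" y="0" z="0"/><Position x="1" y="0" z="0"/>'
    "</Positions></State>"
)

ok, n_particles, n_positions = check_work_unit(TOY_SYSTEM, TOY_STATE)
if not ok:
    print(f"mismatch: {n_particles} particles vs {n_positions} positions")
```

With a state.xml taken from a different project, the two counts would generally differ, which is the condition the setPositions() exception reports.
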

This appears to be corrected now for the 144xx projects. Please do let us know if you find any new examples from the last few hours!

~ John Chodera // MSKCC
