SOLVED: CPU stuck at Ready, waiting for FahCore Run

Moderators: Site Moderators, FAHC Science Team

Postby Crunchtimer » Thu Jun 11, 2020 6:36 am

Hi all!
As of last night, CPU folding is not working for me; the 2 GPUs are working just fine.
I'm running Ubuntu and have been trying to kill only the FahCore_a7 process, but without any results.

Please advise (log example of the error pasted below).
As a result, my disk is filling up, because the following line gets printed over and over:
12441:04:30:55:WU02:FS00:0xa7:Caught signal SIGSEGV(11) on PID 3167
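To see how much of a log is taken up by one repeated message, consecutive duplicates can be collapsed into counts. This is only a minimal sketch: the names `message` and `collapse_repeats` are made up for illustration, and it assumes each line looks like the paste below, with an optional leading line number followed by an `HH:MM:SS:` timestamp.

```python
import re
from itertools import groupby

# Optional paste line number ("12441:"), then the HH:MM:SS: timestamp.
PREFIX = re.compile(r"^(?:\d+:)?\d{2}:\d{2}:\d{2}:")


def message(line):
    """Return the line without its optional line number and timestamp."""
    return PREFIX.sub("", line.strip(), count=1)


def collapse_repeats(lines):
    """Return (message, count) pairs for runs of consecutive identical messages."""
    return [(msg, sum(1 for _ in run)) for msg, run in groupby(map(message, lines))]


sample = [
    "454:04:30:55:WU02:FS00:0xa7:Caught signal SIGSEGV(11) on PID 3167",
    "455:04:30:55:WU02:FS00:0xa7:Caught signal SIGSEGV(11) on PID 3167",
    "12448:04:30:56:WU02:FS00:FahCore returned: INTERRUPTED (102 = 0x66)",
]

for msg, count in collapse_repeats(sample):
    print(f"{count:>6}x {msg}")
```

Feeding it the lines of the real log file instead of `sample` would show each SIGSEGV burst as a single entry with its repeat count.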

Thanks!

Code:
381:04:30:50:WU02:FS00:Core PID:3167
382:04:30:50:WU02:FS00:FahCore 0xa7 started
383:04:30:51:WU02:FS00:0xa7:*********************** Log Started 2020-06-11T04:30:50Z ***********************
384:04:30:51:WU02:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
385:04:30:51:WU02:FS00:0xa7:       Type: 0xa7
386:04:30:51:WU02:FS00:0xa7:       Core: Gromacs
387:04:30:51:WU02:FS00:0xa7:       Args: -dir 02 -suffix 01 -version 706 -lifeline 3163 -checkpoint 15 -np
388:04:30:51:WU02:FS00:0xa7:             22
389:04:30:51:WU02:FS00:0xa7:************************************ CBang *************************************
390:04:30:51:WU02:FS00:0xa7:       Date: Nov 5 2019
391:04:30:51:WU02:FS00:0xa7:       Time: 06:06:57
392:04:30:51:WU02:FS00:0xa7:   Revision: 46c96f1aa8419571d83f3e63f9c99a0d602f6da9
393:04:30:51:WU02:FS00:0xa7:     Branch: master
394:04:30:51:WU02:FS00:0xa7:   Compiler: GNU 8.3.0
395:04:30:51:WU02:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie -fPIC
396:04:30:51:WU02:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
397:04:30:51:WU02:FS00:0xa7:       Bits: 64
398:04:30:51:WU02:FS00:0xa7:       Mode: Release
399:04:30:51:WU02:FS00:0xa7:************************************ System ************************************
400:04:30:51:WU02:FS00:0xa7:        CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
401:04:30:51:WU02:FS00:0xa7:     CPU ID: GenuineIntel Family 6 Model 63 Stepping 2
402:04:30:51:WU02:FS00:0xa7:       CPUs: 24
403:04:30:51:WU02:FS00:0xa7:     Memory: 15.54GiB
404:04:30:51:WU02:FS00:0xa7:Free Memory: 10.66GiB
405:04:30:51:WU02:FS00:0xa7:    Threads: POSIX_THREADS
406:04:30:51:WU02:FS00:0xa7: OS Version: 5.4
407:04:30:51:WU02:FS00:0xa7:Has Battery: false
408:04:30:51:WU02:FS00:0xa7: On Battery: false
409:04:30:51:WU02:FS00:0xa7: UTC Offset: 2
410:04:30:51:WU02:FS00:0xa7:        PID: 3167
411:04:30:51:WU02:FS00:0xa7:        CWD: /var/lib/fahclient/work
412:04:30:51:WU02:FS00:0xa7:******************************** Build - libFAH ********************************
413:04:30:51:WU02:FS00:0xa7:    Version: 0.0.18
414:04:30:51:WU02:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
415:04:30:51:WU02:FS00:0xa7:  Copyright: 2019 foldingathome.org
416:04:30:51:WU02:FS00:0xa7:   Homepage: https://foldingathome.org/
417:04:30:51:WU02:FS00:0xa7:       Date: Nov 5 2019
418:04:30:51:WU02:FS00:0xa7:       Time: 06:13:26
419:04:30:51:WU02:FS00:0xa7:   Revision: 490c9aa2957b725af319379424d5c5cb36efb656
420:04:30:51:WU02:FS00:0xa7:     Branch: master
421:04:30:51:WU02:FS00:0xa7:   Compiler: GNU 8.3.0
422:04:30:51:WU02:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie
423:04:30:51:WU02:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
424:04:30:51:WU02:FS00:0xa7:       Bits: 64
425:04:30:51:WU02:FS00:0xa7:       Mode: Release
426:04:30:51:WU02:FS00:0xa7:************************************ Build *************************************
427:04:30:51:WU02:FS00:0xa7:       SIMD: avx_256
428:04:30:51:WU02:FS00:0xa7:********************************************************************************
429:04:30:51:WU02:FS00:0xa7:Project: 14524 (Run 706, Clone 5, Gen 17)
430:04:30:51:WU02:FS00:0xa7:Unit: 0x0000001f80fccb0a5e781bc15bdeaaff
431:04:30:51:WU02:FS00:0xa7:Reading tar file core.xml
432:04:30:51:WU02:FS00:0xa7:Reading tar file frame17.tpr
433:04:30:51:WU02:FS00:0xa7:Digital signatures verified
434:04:30:51:WU02:FS00:0xa7:Reducing thread count from 22 to 21 to avoid domain decomposition with large prime factor 11
435:04:30:51:WU02:FS00:0xa7:Calling: mdrun -s frame17.tpr -o frame17.trr -x frame17.xtc -cpt 15 -nt 21
436:04:30:51:WU02:FS00:0xa7:Steps: first=4250000 total=250000
437:04:30:51:WU02:FS00:0xa7:ERROR:
438:04:30:51:WU02:FS00:0xa7:ERROR:-------------------------------------------------------
439:04:30:51:WU02:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
440:04:30:51:WU02:FS00:0xa7:ERROR:Source code file: /host/debian-stable-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/domdec.c, line: 6902
441:04:30:51:WU02:FS00:0xa7:ERROR:
442:04:30:51:WU02:FS00:0xa7:ERROR:Fatal error:
443:04:30:51:WU02:FS00:0xa7:ERROR:There is no domain decomposition for 16 ranks that is compatible with the given box and a minimum cell size of 1.4227 nm
444:04:30:51:WU02:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
445:04:30:51:WU02:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
446:04:30:51:WU02:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
447:04:30:51:WU02:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
448:04:30:51:WU02:FS00:0xa7:ERROR:-------------------------------------------------------
449:04:30:55:WU02:FS00:0xa7:WARNING:Unexpected exit() call
450:04:30:55:WU02:FS00:0xa7:WARNING:Unexpected exit from science code
451:04:30:55:WU02:FS00:0xa7:Saving result file ../logfile_01.txt
452:04:30:55:WU02:FS00:0xa7:Saving result file md.log
453:04:30:55:WU02:FS00:0xa7:Saving result file science.log
454:04:30:55:WU02:FS00:0xa7:Caught signal SIGSEGV(11) on PID 3167
455:04:30:55:WU02:FS00:0xa7:Caught signal SIGSEGV(11) on PID 3167
...
2340:04:30:55:WU02:FS00:0xa7:Caught signal SIGSEGV(11) on PID 3167
2341:04:30:55:WU02:FS00:0xa7:Caught signal SIGSEGV(11) on PID 3167
2342:04:30:55:WU02:FS00:0xa7:Caught signal SIGSEGV(11) on PID 3167
2343:04:30:55:WU02:FS00:0xa7:Caught signal SIGSEGV(11) on PID 3167
2344:04:30:55:WU02:FS00:0xa7:Caught signal SIGSEGV(11) on PID 3167
2345:04:30:55:WU02:FS00:0xa7:Caught signal SIGSEGV(11) on PID 3167
2346:04:30:55:WU02:FS00:0xa7:Caught signal SIGSEGV(11) on PID 3167
...
2789:04:30:55:WU02:FS00:0xa7:Caught signal SIGSEGV(11) on PID 3167
2790:04:30:55:WU02:FS00:0xa7:Caught signal SIGSEGV(11) on PID 3167
...
12445:04:30:55:WU02:FS00:0xa7:Caught signal SIGSEGV(11) on PID 3167
12446:04:30:55:WU02:FS00:0xa7:Caught signal SIGSEGV(11) on PID 3167
12447:04:30:55:WU02:FS00:0xa7:Caught signal SIGSEGV(11) on PID 3167
12448:04:30:56:WU02:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
Crunchtimer
 
Posts: 43
Joined: Tue May 05, 2020 6:34 am

Re: SOLVED: CPU stuck at Ready, waiting for FahCore Run

Postby Crunchtimer » Thu Jun 11, 2020 6:49 am

What I couldn't manage from the terminal, I solved by simply opening the GUI of the lovely and powerful FAHControl.
In there I:
1: Removed the CPU slot
2: Saved
3: Added a CPU slot
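For a headless box, the same three steps might be scripted against the client's local command interface instead of the GUI. The sketch below rests on assumptions that should be checked against the client's own `help` output before use: that FAHClient 7.x listens on 127.0.0.1 port 36330 without a password, and that `slot-delete <id>`, `slot-add cpu` and `exit` are accepted there. The function names are made up for illustration.

```python
import socket


def recreate_cpu_slot_commands(slot_id):
    """Commands mirroring the GUI steps: remove the slot, then add a CPU slot.

    ASSUMPTION: 'slot-delete' and 'slot-add' exist with this syntax in the
    installed client; verify with the interface's 'help' command first.
    """
    return [f"slot-delete {slot_id}", "slot-add cpu"]


def send_commands(commands, host="127.0.0.1", port=36330, timeout=5.0):
    """Send commands to the (assumed) local command port and return the reply text."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        for command in commands:
            conn.sendall((command + "\n").encode("ascii"))
        conn.sendall(b"exit\n")  # ASSUMPTION: 'exit' closes the session
        while True:
            try:
                data = conn.recv(4096)
            except socket.timeout:
                break  # no more output within the timeout
            if not data:
                break  # client closed the connection
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


# Only builds and shows the commands; nothing is sent until send_commands() is called.
print(recreate_cpu_slot_commands(0))
```

Calling `send_commands(recreate_cpu_slot_commands(0))` would then attempt the change on the local client. Slot 0 is the CPU slot in the config shown in this thread; the other two slots are the GPUs and are left alone.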

Voila!

Happy folding everyone!

Re: SOLVED: CPU stuck at Ready, waiting for FahCore Run

Postby Neil-B » Thu Jun 11, 2020 9:06 am

Just as a thought … If you aren't GPU folding then setting your thread count to 24 should avoid future decomposition issues … CPU scheduling is well managed by OS's as a rule and with FAH at lowest priority you should find running all threads on it doesn't impact other processes noticeably … could up the temps a little bit but probably not much given you are already running most threads … will also get you higher points / speed up the science.
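The log line "Reducing thread count from 22 to 21 to avoid domain decomposition with large prime factor 11" comes down to simple arithmetic on the thread count, which the sketch below reproduces. It is only an illustration: `prime_factors` is a made-up helper, not the core's actual logic, and whether GROMACS finds a decomposition also depends on the box size and on how it splits the ranks (the error above mentions 16 ranks), so small factors alone do not guarantee success.

```python
def prime_factors(n):
    """Return the prime factors of n in ascending order, with repeats."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        factors.append(n)  # whatever is left is itself prime
    return factors


# 24 = all threads, 22 = the configured CPU slot, 21 = what the core fell back to.
for threads in (24, 22, 21):
    print(threads, prime_factors(threads))
```

This prints `[2, 2, 2, 3]` for 24, `[2, 11]` for 22 and `[3, 7]` for 21, so 24 splits into much smaller factors than either 22 or the 21 the core actually ran with.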
1: 2x Xeon E5-2697v3@2.60GHz, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent, Quadro K420 1GB, FAH 7.6.13
2: Xeon E3-1505Mv5@2.80GHz, 32GB DDR4, NVME, Win10 Pro, Quadro M1000M 2GB, FAH 7.6.13
3: i7-960@3.20GHz, 12GB DDR3, SSD, Win10 Pro, GTX 750Ti 2GB, FAH 7.6.13
Neil-B
 
Posts: 1105
Joined: Sun Mar 22, 2020 6:52 pm
Location: UK

Re: SOLVED: CPU stuck at Ready, waiting for FahCore Run

Postby PantherX » Thu Jun 11, 2020 9:41 am

Can you please post the section of the log where the CPU Slot requests the Servers to get a WU and it was assigned Project: 14524 (Run 706, Clone 5, Gen 17)? I am keen to see what the client asked for and what the Server provided.
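One way to pull that section out of a long log is to keep only the client-level lines for the work unit and slot, dropping the core's own output (the lines carrying a `0xa7:` style prefix). This is a rough sketch based only on the line format visible in this thread; `client_lines` is a made-up name, and the log location `/var/lib/fahclient/log.txt` mentioned in the comment is an assumption.

```python
import re

# Core output has a hex core id right after the WU/slot tag, e.g. "0xa7:".
CORE_PREFIX = re.compile(r"0x[0-9a-fA-F]+:")


def client_lines(lines, wu="WU02", slot="FS00"):
    """Return the client-level log lines for one work unit and slot."""
    tag = f":{wu}:{slot}:"
    kept = []
    for line in lines:
        line = line.rstrip("\n")
        if tag not in line:
            continue  # other slots, or client-wide messages
        rest = line.split(tag, 1)[1]
        if CORE_PREFIX.match(rest):
            continue  # output from the FahCore itself
        kept.append(line)
    return kept


sample = [
    "05:39:01:WU02:FS00:Starting",
    "05:39:01:WU03:FS01:Starting",
    "05:39:02:WU02:FS00:0xa7:Project: 14524 (Run 706, Clone 5, Gen 17)",
    "05:39:09:WU02:FS00:FahCore returned: INTERRUPTED (102 = 0x66)",
]

# For a real log, pass the opened file instead of the sample list,
# e.g. open("/var/lib/fahclient/log.txt") (path is an assumption).
for line in client_lines(sample):
    print(line)
```

The request and assignment messages for the work unit would be among the lines this keeps, provided the log still reaches back to when the unit was downloaded.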
PantherX
Site Moderator
 
Posts: 6315
Joined: Wed Dec 23, 2009 10:33 am
Location: Land Of The Long White Cloud

Re: SOLVED: CPU stuck at Ready, waiting for FahCore Run

Postby Joe_H » Thu Jun 11, 2020 3:07 pm

Also, right-clicking on a folding slot in FAHControl gives you a menu where you can Pause that slot without pausing the other folding slots on your system.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Joe_H
Site Admin
 
Posts: 6345
Joined: Tue Apr 21, 2009 5:41 pm
Location: W. MA

Re: SOLVED: CPU stuck at Ready, waiting for FahCore Run

Postby Crunchtimer » Fri Jun 12, 2020 8:10 pm

Neil-B wrote: Just as a thought … If you aren't GPU folding then setting your thread count to 24 should avoid future decomposition issues … CPU scheduling is well managed by OS's as a rule and with FAH at lowest priority you should find running all threads on it doesn't impact other processes noticeably … could up the temps a little bit but probably not much given you are already running most threads … will also get you higher points / speed up the science.


Well, I think I'm using 2 of the 24 threads for the two GTX 1070s, which leaves 22 threads for CPU folding? I believe everything is maxed out at 1.7 MPPD, but I'm all ears for more :)

Re: SOLVED: CPU stuck at Ready, waiting for FahCore Run

Postby Crunchtimer » Fri Jun 12, 2020 8:34 pm

PantherX wrote: Can you please post the section of the log where the CPU Slot requests the Servers to get a WU and it was assigned Project: 14524 (Run 706, Clone 5, Gen 17)? I am keen to see what the client asked for and what the Server provided.


I'm not sure this is what you are looking for, but this is what I have:


Code:
*********************** Log Started 2020-06-11T05:37:46Z ***********************
05:37:46:Trying to access database...
05:37:46:Successfully acquired database lock
05:37:46:Read GPUs.txt
05:37:46:Enabled folding slot 00: PAUSED cpu:22 (by user)
05:37:50:Enabled folding slot 01: PAUSED gpu:0:GP104 [GeForce GTX 1070] 6463 (by user)
05:37:50:Enabled folding slot 02: PAUSED gpu:1:GP104 [GeForce GTX 1070] 6463 (by user)
05:37:50:****************************** FAHClient ******************************
05:37:50:        Version: 7.6.13
05:37:50:         Author: Joseph Coffland <joseph@cauldrondevelopment.com>
05:37:50:      Copyright: 2020 foldingathome.org
05:37:50:       Homepage: https://foldingathome.org/
05:37:50:           Date: Apr 28 2020
05:37:50:           Time: 04:20:16
05:37:50:       Revision: 5a652817f46116b6e135503af97f18e094414e3b
05:37:50:         Branch: master
05:37:50:       Compiler: GNU 8.3.0
05:37:50:        Options: -std=c++11 -ffunction-sections -fdata-sections -O3
05:37:50:                 -funroll-loops -fno-pie
05:37:50:       Platform: linux2 4.19.0-5-amd64
05:37:50:           Bits: 64
05:37:50:           Mode: Release
05:37:50:           Args: --child /etc/fahclient/config.xml --run-as fahclient
05:37:50:                 --pid-file=/var/run/fahclient.pid --daemon
05:37:50:         Config: /etc/fahclient/config.xml
05:37:50:******************************** CBang ********************************
05:37:50:           Date: Apr 25 2020
05:37:50:           Time: 00:07:53
05:37:50:       Revision: ea081a3b3b0f4a37c4d0440b4f1bc184197c7797
05:37:50:         Branch: master
05:37:50:       Compiler: GNU 8.3.0
05:37:50:        Options: -std=c++11 -ffunction-sections -fdata-sections -O3
05:37:50:                 -funroll-loops -fno-pie -fPIC
05:37:50:       Platform: linux2 4.19.0-5-amd64
05:37:50:           Bits: 64
05:37:50:           Mode: Release
05:37:50:******************************* System ********************************
05:37:50:            CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
05:37:50:         CPU ID: GenuineIntel Family 6 Model 63 Stepping 2
05:37:50:           CPUs: 24
05:37:50:         Memory: 15.54GiB
05:37:50:    Free Memory: 14.74GiB
05:37:50:        Threads: POSIX_THREADS
05:37:50:     OS Version: 5.4
05:37:50:    Has Battery: false
05:37:50:     On Battery: false
05:37:50:     UTC Offset: 2
05:37:50:            PID: 1411
05:37:50:            CWD: /var/lib/fahclient
05:37:50:             OS: Linux 5.4.0-37-generic x86_64
05:37:50:        OS Arch: AMD64
05:37:50:           GPUs: 2
05:37:50:          GPU 0: Bus:2 Slot:0 Func:0 NVIDIA:7 GP104 [GeForce GTX 1070] 6463
05:37:50:          GPU 1: Bus:3 Slot:0 Func:0 NVIDIA:7 GP104 [GeForce GTX 1070] 6463
05:37:50:  CUDA Device 0: Platform:0 Device:0 Bus:2 Slot:0 Compute:6.1 Driver:10.2
05:37:50:  CUDA Device 1: Platform:0 Device:1 Bus:3 Slot:0 Compute:6.1 Driver:10.2
05:37:50:OpenCL Device 0: Platform:0 Device:0 Bus:2 Slot:0 Compute:1.2 Driver:440.64
05:37:50:OpenCL Device 1: Platform:0 Device:1 Bus:3 Slot:0 Compute:1.2 Driver:440.64
05:37:50:******************************* libFAH ********************************
05:37:50:           Date: Apr 15 2020
05:37:50:           Time: 21:43:24
05:37:50:       Revision: 216968bc7025029c841ed6e36e81a03a316890d3
05:37:50:         Branch: master
05:37:50:       Compiler: GNU 8.3.0
05:37:50:        Options: -std=c++11 -ffunction-sections -fdata-sections -O3
05:37:50:                 -funroll-loops -fno-pie
05:37:50:       Platform: linux2 4.19.0-5-amd64
05:37:50:           Bits: 64
05:37:50:           Mode: Release
05:37:50:***********************************************************************
05:37:50:<config>
05:37:50:  <!-- Client Control -->
05:37:50:  <fold-anon v='true'/>
05:37:50:
05:37:50:  <!-- Slot Control -->
05:37:50:  <power v='full'/>
05:37:50:
05:37:50:  <!-- User Information -->
05:37:50:  <passkey v='*****'/>
05:37:50:  <team v='264651'/>
05:37:50:  <user v='AWS3'/>
05:37:50:
05:37:50:  <!-- Folding Slots -->
05:37:50:  <slot id='0' type='CPU'>
05:37:50:    <paused v='true'/>
05:37:50:  </slot>
05:37:50:  <slot id='1' type='GPU'>
05:37:50:    <paused v='true'/>
05:37:50:  </slot>
05:37:50:  <slot id='2' type='GPU'>
05:37:50:    <paused v='true'/>
05:37:50:  </slot>
05:37:50:</config>
05:39:01:FS00:Unpaused
05:39:01:FS01:Unpaused
05:39:01:FS02:Unpaused
05:39:01:WU02:FS00:Starting
05:39:01:WU02:FS00:Removing old file 'work/02/logfile_01-20200611-044650.txt'
05:39:01:WU02:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/v7/lin/64bit/avx/Core_a7.fah/FahCore_a7 -dir 02 -suffix 01 -version 706 -lifeline 1411 -checkpoint 15 -np 22
05:39:01:WU02:FS00:Started FahCore on PID 2579
05:39:01:WU02:FS00:Core PID:2583
05:39:01:WU02:FS00:FahCore 0xa7 started
05:39:01:WU03:FS01:Starting
05:39:01:WU03:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/v7/lin/64bit/Core_22.fah/FahCore_22 -dir 03 -suffix 01 -version 706 -lifeline 1411 -checkpoint 15 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
05:39:01:WU03:FS01:Started FahCore on PID 2584
05:39:01:WU03:FS01:Core PID:2588
05:39:01:WU03:FS01:FahCore 0x22 started
05:39:01:WU01:FS02:Starting
05:39:01:WU01:FS02:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/v7/lin/64bit/Core_22.fah/FahCore_22 -dir 01 -suffix 01 -version 706 -lifeline 1411 -checkpoint 15 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 1 -cuda-device 1 -gpu 1
05:39:01:WU01:FS02:Started FahCore on PID 2589
05:39:01:WU01:FS02:Core PID:2593
05:39:01:WU01:FS02:FahCore 0x22 started
05:39:02:WU03:FS01:0x22:*********************** Log Started 2020-06-11T05:39:02Z ***********************
05:39:02:WU03:FS01:0x22:*************************** Core22 Folding@home Core ***************************
05:39:02:WU03:FS01:0x22:       Type: 0x22
05:39:02:WU03:FS01:0x22:       Core: Core22
05:39:02:WU03:FS01:0x22:    Website: https://foldingathome.org/
05:39:02:WU03:FS01:0x22:  Copyright: (c) 2009-2018 foldingathome.org
05:39:02:WU03:FS01:0x22:     Author: John Chodera <john.chodera@choderalab.org> and Rafal Wiewiora
05:39:02:WU03:FS01:0x22:             <rafal.wiewiora@choderalab.org>
05:39:02:WU03:FS01:0x22:       Args: -dir 03 -suffix 01 -version 706 -lifeline 2584 -checkpoint 15
05:39:02:WU03:FS01:0x22:             -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device
05:39:02:WU03:FS01:0x22:             0 -gpu 0
05:39:02:WU03:FS01:0x22:     Config: <none>
05:39:02:WU03:FS01:0x22:************************************ Build *************************************
05:39:02:WU03:FS01:0x22:    Version: 0.0.5
05:39:02:WU03:FS01:0x22:       Date: Apr 22 2020
05:39:02:WU03:FS01:0x22:       Time: 03:57:11
05:39:02:WU03:FS01:0x22: Repository: Git
05:39:02:WU03:FS01:0x22:   Revision: 2d69202c898bd9bb3e093f51cd32bf411c2a0388
05:39:02:WU03:FS01:0x22:     Branch: HEAD
05:39:02:WU03:FS01:0x22:   Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
05:39:02:WU03:FS01:0x22:    Options: -std=c++11 -O3 -funroll-loops
05:39:02:WU03:FS01:0x22:   Platform: linux2 4.19.76-linuxkit
05:39:02:WU03:FS01:0x22:       Bits: 64
05:39:02:WU03:FS01:0x22:       Mode: Release
05:39:02:WU03:FS01:0x22:************************************ System ************************************
05:39:02:WU03:FS01:0x22:        CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
05:39:02:WU03:FS01:0x22:     CPU ID: GenuineIntel Family 6 Model 63 Stepping 2
05:39:02:WU03:FS01:0x22:       CPUs: 24
05:39:02:WU03:FS01:0x22:     Memory: 15.54GiB
05:39:02:WU03:FS01:0x22:Free Memory: 13.25GiB
05:39:02:WU03:FS01:0x22:    Threads: POSIX_THREADS
05:39:02:WU03:FS01:0x22: OS Version: 5.4
05:39:02:WU03:FS01:0x22:Has Battery: false
05:39:02:WU03:FS01:0x22: On Battery: false
05:39:02:WU03:FS01:0x22: UTC Offset: 2
05:39:02:WU03:FS01:0x22:        PID: 2588
05:39:02:WU03:FS01:0x22:        CWD: /var/lib/fahclient/work
05:39:02:WU03:FS01:0x22:         OS: Linux 5.4.0-37-generic x86_64
05:39:02:WU03:FS01:0x22:    OS Arch: AMD64
05:39:02:WU03:FS01:0x22:********************************************************************************
05:39:02:WU03:FS01:0x22:Project: 14415 (Run 0, Clone 1950, Gen 102)
05:39:02:WU03:FS01:0x22:Unit: 0x000000a10d5262775e839e59a7745741
05:39:02:WU03:FS01:0x22:Digital signatures verified
05:39:02:WU03:FS01:0x22:Folding@home GPU Core22 Folding@home Core
05:39:02:WU03:FS01:0x22:Version 0.0.5
05:39:02:WU03:FS01:0x22:  Found a checkpoint file
05:39:02:WU02:FS00:0xa7:*********************** Log Started 2020-06-11T05:39:01Z ***********************
05:39:02:WU02:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
05:39:02:WU02:FS00:0xa7:       Type: 0xa7
05:39:02:WU02:FS00:0xa7:       Core: Gromacs
05:39:02:WU02:FS00:0xa7:       Args: -dir 02 -suffix 01 -version 706 -lifeline 2579 -checkpoint 15 -np
05:39:02:WU02:FS00:0xa7:             22
05:39:02:WU02:FS00:0xa7:************************************ CBang *************************************
05:39:02:WU02:FS00:0xa7:       Date: Nov 5 2019
05:39:02:WU02:FS00:0xa7:       Time: 06:06:57
05:39:02:WU02:FS00:0xa7:   Revision: 46c96f1aa8419571d83f3e63f9c99a0d602f6da9
05:39:02:WU02:FS00:0xa7:     Branch: master
05:39:02:WU02:FS00:0xa7:   Compiler: GNU 8.3.0
05:39:02:WU02:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie -fPIC
05:39:02:WU02:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
05:39:02:WU02:FS00:0xa7:       Bits: 64
05:39:02:WU02:FS00:0xa7:       Mode: Release
05:39:02:WU02:FS00:0xa7:************************************ System ************************************
05:39:02:WU02:FS00:0xa7:        CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
05:39:02:WU02:FS00:0xa7:     CPU ID: GenuineIntel Family 6 Model 63 Stepping 2
05:39:02:WU02:FS00:0xa7:       CPUs: 24
05:39:02:WU02:FS00:0xa7:     Memory: 15.54GiB
05:39:02:WU02:FS00:0xa7:Free Memory: 13.25GiB
05:39:02:WU02:FS00:0xa7:    Threads: POSIX_THREADS
05:39:02:WU02:FS00:0xa7: OS Version: 5.4
05:39:02:WU02:FS00:0xa7:Has Battery: false
05:39:02:WU02:FS00:0xa7: On Battery: false
05:39:02:WU02:FS00:0xa7: UTC Offset: 2
05:39:02:WU02:FS00:0xa7:        PID: 2583
05:39:02:WU02:FS00:0xa7:        CWD: /var/lib/fahclient/work
05:39:02:WU02:FS00:0xa7:******************************** Build - libFAH ********************************
05:39:02:WU02:FS00:0xa7:    Version: 0.0.18
05:39:02:WU02:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
05:39:02:WU02:FS00:0xa7:  Copyright: 2019 foldingathome.org
05:39:02:WU02:FS00:0xa7:   Homepage: https://foldingathome.org/
05:39:02:WU02:FS00:0xa7:       Date: Nov 5 2019
05:39:02:WU02:FS00:0xa7:       Time: 06:13:26
05:39:02:WU02:FS00:0xa7:   Revision: 490c9aa2957b725af319379424d5c5cb36efb656
05:39:02:WU02:FS00:0xa7:     Branch: master
05:39:02:WU02:FS00:0xa7:   Compiler: GNU 8.3.0
05:39:02:WU02:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie
05:39:02:WU02:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
05:39:02:WU02:FS00:0xa7:       Bits: 64
05:39:02:WU02:FS00:0xa7:       Mode: Release
05:39:02:WU02:FS00:0xa7:************************************ Build *************************************
05:39:02:WU02:FS00:0xa7:       SIMD: avx_256
05:39:02:WU02:FS00:0xa7:********************************************************************************
05:39:02:WU02:FS00:0xa7:Project: 14524 (Run 706, Clone 5, Gen 17)
05:39:02:WU02:FS00:0xa7:Unit: 0x0000001f80fccb0a5e781bc15bdeaaff
05:39:02:WU02:FS00:0xa7:Reading tar file core.xml
05:39:02:WU02:FS00:0xa7:Reading tar file frame17.tpr
05:39:02:WU02:FS00:0xa7:Digital signatures verified
05:39:02:WU02:FS00:0xa7:Reducing thread count from 22 to 21 to avoid domain decomposition with large prime factor 11
05:39:02:WU02:FS00:0xa7:Calling: mdrun -s frame17.tpr -o frame17.trr -x frame17.xtc -cpt 15 -nt 21
05:39:02:WU01:FS02:0x22:*********************** Log Started 2020-06-11T05:39:02Z ***********************
05:39:02:WU01:FS02:0x22:*************************** Core22 Folding@home Core ***************************
05:39:02:WU01:FS02:0x22:       Type: 0x22
05:39:02:WU01:FS02:0x22:       Core: Core22
05:39:02:WU01:FS02:0x22:    Website: https://foldingathome.org/
05:39:02:WU01:FS02:0x22:  Copyright: (c) 2009-2018 foldingathome.org
05:39:02:WU01:FS02:0x22:     Author: John Chodera <john.chodera@choderalab.org> and Rafal Wiewiora
05:39:02:WU01:FS02:0x22:             <rafal.wiewiora@choderalab.org>
05:39:02:WU01:FS02:0x22:       Args: -dir 01 -suffix 01 -version 706 -lifeline 2589 -checkpoint 15
05:39:02:WU01:FS02:0x22:             -gpu-vendor nvidia -opencl-platform 0 -opencl-device 1 -cuda-device
05:39:02:WU01:FS02:0x22:             1 -gpu 1
05:39:02:WU01:FS02:0x22:     Config: <none>
05:39:02:WU01:FS02:0x22:************************************ Build *************************************
05:39:02:WU01:FS02:0x22:    Version: 0.0.5
05:39:02:WU01:FS02:0x22:       Date: Apr 22 2020
05:39:02:WU01:FS02:0x22:       Time: 03:57:11
05:39:02:WU01:FS02:0x22: Repository: Git
05:39:02:WU01:FS02:0x22:   Revision: 2d69202c898bd9bb3e093f51cd32bf411c2a0388
05:39:02:WU01:FS02:0x22:     Branch: HEAD
05:39:02:WU01:FS02:0x22:   Compiler: GNU 4.8.2 20140120 (Red Hat 4.8.2-15)
05:39:02:WU01:FS02:0x22:    Options: -std=c++11 -O3 -funroll-loops
05:39:02:WU01:FS02:0x22:   Platform: linux2 4.19.76-linuxkit
05:39:02:WU01:FS02:0x22:       Bits: 64
05:39:02:WU01:FS02:0x22:       Mode: Release
05:39:02:WU01:FS02:0x22:************************************ System ************************************
05:39:02:WU01:FS02:0x22:        CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
05:39:02:WU01:FS02:0x22:     CPU ID: GenuineIntel Family 6 Model 63 Stepping 2
05:39:02:WU01:FS02:0x22:       CPUs: 24
05:39:02:WU01:FS02:0x22:     Memory: 15.54GiB
05:39:02:WU01:FS02:0x22:Free Memory: 13.25GiB
05:39:02:WU01:FS02:0x22:    Threads: POSIX_THREADS
05:39:02:WU01:FS02:0x22: OS Version: 5.4
05:39:02:WU01:FS02:0x22:Has Battery: false
05:39:02:WU01:FS02:0x22: On Battery: false
05:39:02:WU01:FS02:0x22: UTC Offset: 2
05:39:02:WU01:FS02:0x22:        PID: 2593
05:39:02:WU01:FS02:0x22:        CWD: /var/lib/fahclient/work
05:39:02:WU01:FS02:0x22:         OS: Linux 5.4.0-37-generic x86_64
05:39:02:WU01:FS02:0x22:    OS Arch: AMD64
05:39:02:WU01:FS02:0x22:********************************************************************************
05:39:02:WU01:FS02:0x22:Project: 11759 (Run 0, Clone 7670, Gen 5)
05:39:02:WU01:FS02:0x22:Unit: 0x0000001080fccb0a5e6ea235de28d377
05:39:02:WU01:FS02:0x22:Digital signatures verified
05:39:02:WU01:FS02:0x22:Folding@home GPU Core22 Folding@home Core
05:39:02:WU01:FS02:0x22:Version 0.0.5
05:39:02:WU01:FS02:0x22:  Found a checkpoint file
05:39:03:WU02:FS00:0xa7:Steps: first=4250000 total=250000
05:39:03:WU02:FS00:0xa7:ERROR:
05:39:03:WU02:FS00:0xa7:ERROR:-------------------------------------------------------
05:39:03:WU02:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
05:39:03:WU02:FS00:0xa7:ERROR:Source code file: /host/debian-stable-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/domdec.c, line: 6902
05:39:03:WU02:FS00:0xa7:ERROR:
05:39:03:WU02:FS00:0xa7:ERROR:Fatal error:
05:39:03:WU02:FS00:0xa7:ERROR:There is no domain decomposition for 16 ranks that is compatible with the given box and a minimum cell size of 1.4227 nm
05:39:03:WU02:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
05:39:03:WU02:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
05:39:03:WU02:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
05:39:03:WU02:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
05:39:03:WU02:FS00:0xa7:ERROR:-------------------------------------------------------
05:39:08:WU02:FS00:0xa7:WARNING:Unexpected exit() call
05:39:08:WU02:FS00:0xa7:WARNING:Unexpected exit from science code
05:39:08:WU02:FS00:0xa7:Saving result file ../logfile_01.txt
05:39:08:WU02:FS00:0xa7:Saving result file md.log
05:39:08:WU02:FS00:0xa7:Saving result file science.log
05:39:09:WU02:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
05:39:10:WU02:FS00:Starting
05:39:10:WU02:FS00:Removing old file 'work/02/logfile_01-20200611-044751.txt'
05:39:10:WU02:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/v7/lin/64bit/avx/Core_a7.fah/FahCore_a7 -dir 02 -suffix 01 -version 706 -lifeline 1411 -checkpoint 15 -np 22
05:39:10:WU02:FS00:Started FahCore on PID 2660
05:39:10:WU02:FS00:Core PID:2664
05:39:10:WU02:FS00:FahCore 0xa7 started
05:39:10:WU02:FS00:0xa7:*********************** Log Started 2020-06-11T05:39:10Z ***********************
05:39:10:WU02:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
05:39:10:WU02:FS00:0xa7:       Type: 0xa7
05:39:10:WU02:FS00:0xa7:       Core: Gromacs
05:39:10:WU02:FS00:0xa7:       Args: -dir 02 -suffix 01 -version 706 -lifeline 2660 -checkpoint 15 -np
05:39:10:WU02:FS00:0xa7:             22
05:39:10:WU02:FS00:0xa7:************************************ CBang *************************************
05:39:10:WU02:FS00:0xa7:       Date: Nov 5 2019
05:39:10:WU02:FS00:0xa7:       Time: 06:06:57
05:39:10:WU02:FS00:0xa7:   Revision: 46c96f1aa8419571d83f3e63f9c99a0d602f6da9
05:39:10:WU02:FS00:0xa7:     Branch: master
05:39:10:WU02:FS00:0xa7:   Compiler: GNU 8.3.0
05:39:10:WU02:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie -fPIC
05:39:10:WU02:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
05:39:10:WU02:FS00:0xa7:       Bits: 64
05:39:10:WU02:FS00:0xa7:       Mode: Release
05:39:10:WU02:FS00:0xa7:************************************ System ************************************
05:39:10:WU02:FS00:0xa7:        CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
05:39:10:WU02:FS00:0xa7:     CPU ID: GenuineIntel Family 6 Model 63 Stepping 2
05:39:10:WU02:FS00:0xa7:       CPUs: 24
05:39:10:WU02:FS00:0xa7:     Memory: 15.54GiB
05:39:10:WU02:FS00:0xa7:Free Memory: 11.78GiB
05:39:10:WU02:FS00:0xa7:    Threads: POSIX_THREADS
05:39:10:WU02:FS00:0xa7: OS Version: 5.4
05:39:10:WU02:FS00:0xa7:Has Battery: false
05:39:10:WU02:FS00:0xa7: On Battery: false
05:39:10:WU02:FS00:0xa7: UTC Offset: 2
05:39:10:WU02:FS00:0xa7:        PID: 2664
05:39:10:WU02:FS00:0xa7:        CWD: /var/lib/fahclient/work
05:39:10:WU02:FS00:0xa7:******************************** Build - libFAH ********************************
05:39:10:WU02:FS00:0xa7:    Version: 0.0.18
05:39:10:WU02:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
05:39:10:WU02:FS00:0xa7:  Copyright: 2019 foldingathome.org
05:39:10:WU02:FS00:0xa7:   Homepage: https://foldingathome.org/
05:39:10:WU02:FS00:0xa7:       Date: Nov 5 2019
05:39:10:WU02:FS00:0xa7:       Time: 06:13:26
05:39:10:WU02:FS00:0xa7:   Revision: 490c9aa2957b725af319379424d5c5cb36efb656
05:39:10:WU02:FS00:0xa7:     Branch: master
05:39:10:WU02:FS00:0xa7:   Compiler: GNU 8.3.0
05:39:10:WU02:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie
05:39:10:WU02:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
05:39:10:WU02:FS00:0xa7:       Bits: 64
05:39:10:WU02:FS00:0xa7:       Mode: Release
05:39:10:WU02:FS00:0xa7:************************************ Build *************************************
05:39:10:WU02:FS00:0xa7:       SIMD: avx_256
05:39:10:WU02:FS00:0xa7:********************************************************************************
05:39:10:WU02:FS00:0xa7:Project: 14524 (Run 706, Clone 5, Gen 17)
05:39:10:WU02:FS00:0xa7:Unit: 0x0000001f80fccb0a5e781bc15bdeaaff
05:39:10:WU02:FS00:0xa7:Reading tar file core.xml
05:39:10:WU02:FS00:0xa7:Reading tar file frame17.tpr
05:39:10:WU02:FS00:0xa7:Digital signatures verified
05:39:10:WU02:FS00:0xa7:Reducing thread count from 22 to 21 to avoid domain decomposition with large prime factor 11
05:39:10:WU02:FS00:0xa7:Calling: mdrun -s frame17.tpr -o frame17.trr -x frame17.xtc -cpt 15 -nt 21
05:39:10:WU02:FS00:0xa7:Steps: first=4250000 total=250000
05:39:10:WU02:FS00:0xa7:ERROR:
05:39:10:WU02:FS00:0xa7:ERROR:-------------------------------------------------------
05:39:10:WU02:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
05:39:10:WU02:FS00:0xa7:ERROR:Source code file: /host/debian-stable-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/domdec.c, line: 6902
05:39:10:WU02:FS00:0xa7:ERROR:
05:39:10:WU02:FS00:0xa7:ERROR:Fatal error:
05:39:10:WU02:FS00:0xa7:ERROR:There is no domain decomposition for 16 ranks that is compatible with the given box and a minimum cell size of 1.4227 nm
05:39:10:WU02:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
05:39:10:WU02:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
05:39:10:WU02:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
05:39:10:WU02:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
05:39:10:WU02:FS00:0xa7:ERROR:-------------------------------------------------------
05:39:15:WU02:FS00:0xa7:WARNING:Unexpected exit() call
05:39:15:WU02:FS00:0xa7:WARNING:Unexpected exit from science code
05:39:15:WU02:FS00:0xa7:Saving result file ../logfile_01.txt
05:39:15:WU02:FS00:0xa7:Saving result file md.log
05:39:15:WU02:FS00:0xa7:Saving result file science.log
05:39:16:WU02:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
05:39:27:WU01:FS02:0x22:Completed 100000 out of 1000000 steps (10%)
05:39:27:WU01:FS02:0x22:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
05:39:30:WU03:FS01:0x22:Completed 550000 out of 1000000 steps (55%)
05:39:30:WU03:FS01:0x22:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
05:39:48:Removing old file 'configs/config-20200606-182556.xml'
05:39:48:Saving configuration to /etc/fahclient/config.xml
05:39:48:<config>
05:39:48:  <!-- Client Control -->
05:39:48:  <fold-anon v='true'/>
05:39:48:
05:39:48:  <!-- Slot Control -->
05:39:48:  <power v='full'/>
05:39:48:
05:39:48:  <!-- User Information -->
05:39:48:  <passkey v='*****'/>
05:39:48:  <team v='264651'/>
05:39:48:  <user v='AWS3'/>
05:39:48:
05:39:48:  <!-- Folding Slots -->
05:39:48:  <slot id='0' type='CPU'/>
05:39:48:  <slot id='1' type='GPU'/>
05:39:48:  <slot id='2' type='GPU'/>
05:39:48:</config>
05:40:10:WU02:FS00:Starting
05:40:10:WU02:FS00:Removing old file 'work/02/logfile_01-20200611-044851.txt'
05:40:10:WU02:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/v7/lin/64bit/avx/Core_a7.fah/FahCore_a7 -dir 02 -suffix 01 -version 706 -lifeline 1411 -checkpoint 15 -np 22
05:40:10:WU02:FS00:Started FahCore on PID 2898
05:40:10:WU02:FS00:Core PID:2902
05:40:10:WU02:FS00:FahCore 0xa7 started
05:40:10:WU02:FS00:0xa7:*********************** Log Started 2020-06-11T05:40:10Z ***********************
05:40:10:WU02:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
05:40:10:WU02:FS00:0xa7:       Type: 0xa7
05:40:10:WU02:FS00:0xa7:       Core: Gromacs
05:40:10:WU02:FS00:0xa7:       Args: -dir 02 -suffix 01 -version 706 -lifeline 2898 -checkpoint 15 -np
05:40:10:WU02:FS00:0xa7:             22
05:40:10:WU02:FS00:0xa7:************************************ CBang *************************************
05:40:10:WU02:FS00:0xa7:       Date: Nov 5 2019
05:40:10:WU02:FS00:0xa7:       Time: 06:06:57
05:40:10:WU02:FS00:0xa7:   Revision: 46c96f1aa8419571d83f3e63f9c99a0d602f6da9
05:40:10:WU02:FS00:0xa7:     Branch: master
05:40:10:WU02:FS00:0xa7:   Compiler: GNU 8.3.0
05:40:10:WU02:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie -fPIC
05:40:10:WU02:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
05:40:10:WU02:FS00:0xa7:       Bits: 64
05:40:10:WU02:FS00:0xa7:       Mode: Release
05:40:10:WU02:FS00:0xa7:************************************ System ************************************
05:40:10:WU02:FS00:0xa7:        CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
05:40:10:WU02:FS00:0xa7:     CPU ID: GenuineIntel Family 6 Model 63 Stepping 2
05:40:10:WU02:FS00:0xa7:       CPUs: 24
05:40:10:WU02:FS00:0xa7:     Memory: 15.54GiB
05:40:10:WU02:FS00:0xa7:Free Memory: 10.72GiB
05:40:10:WU02:FS00:0xa7:    Threads: POSIX_THREADS
05:40:10:WU02:FS00:0xa7: OS Version: 5.4
05:40:10:WU02:FS00:0xa7:Has Battery: false
05:40:10:WU02:FS00:0xa7: On Battery: false
05:40:10:WU02:FS00:0xa7: UTC Offset: 2
05:40:10:WU02:FS00:0xa7:        PID: 2902
05:40:10:WU02:FS00:0xa7:        CWD: /var/lib/fahclient/work
05:40:10:WU02:FS00:0xa7:******************************** Build - libFAH ********************************
05:40:10:WU02:FS00:0xa7:    Version: 0.0.18
05:40:10:WU02:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
05:40:10:WU02:FS00:0xa7:  Copyright: 2019 foldingathome.org
05:40:10:WU02:FS00:0xa7:   Homepage: https://foldingathome.org/
05:40:10:WU02:FS00:0xa7:       Date: Nov 5 2019
05:40:10:WU02:FS00:0xa7:       Time: 06:13:26
05:40:10:WU02:FS00:0xa7:   Revision: 490c9aa2957b725af319379424d5c5cb36efb656
05:40:10:WU02:FS00:0xa7:     Branch: master
05:40:10:WU02:FS00:0xa7:   Compiler: GNU 8.3.0
05:40:10:WU02:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie
05:40:10:WU02:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
05:40:10:WU02:FS00:0xa7:       Bits: 64
05:40:10:WU02:FS00:0xa7:       Mode: Release
05:40:10:WU02:FS00:0xa7:************************************ Build *************************************
05:40:10:WU02:FS00:0xa7:       SIMD: avx_256
05:40:10:WU02:FS00:0xa7:********************************************************************************
05:40:10:WU02:FS00:0xa7:Project: 14524 (Run 706, Clone 5, Gen 17)
05:40:10:WU02:FS00:0xa7:Unit: 0x0000001f80fccb0a5e781bc15bdeaaff
05:40:10:WU02:FS00:0xa7:Reading tar file core.xml
05:40:10:WU02:FS00:0xa7:Reading tar file frame17.tpr
05:40:10:WU02:FS00:0xa7:Digital signatures verified
05:40:10:WU02:FS00:0xa7:Reducing thread count from 22 to 21 to avoid domain decomposition with large prime factor 11
05:40:10:WU02:FS00:0xa7:Calling: mdrun -s frame17.tpr -o frame17.trr -x frame17.xtc -cpt 15 -nt 21
05:40:10:WU02:FS00:0xa7:Steps: first=4250000 total=250000
05:40:10:WU02:FS00:0xa7:ERROR:
05:40:10:WU02:FS00:0xa7:ERROR:-------------------------------------------------------
05:40:10:WU02:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
05:40:10:WU02:FS00:0xa7:ERROR:Source code file: /host/debian-stable-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/domdec.c, line: 6902
05:40:10:WU02:FS00:0xa7:ERROR:
05:40:10:WU02:FS00:0xa7:ERROR:Fatal error:
05:40:10:WU02:FS00:0xa7:ERROR:There is no domain decomposition for 16 ranks that is compatible with the given box and a minimum cell size of 1.4227 nm
05:40:10:WU02:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
05:40:10:WU02:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
05:40:10:WU02:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
05:40:10:WU02:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
05:40:10:WU02:FS00:0xa7:ERROR:-------------------------------------------------------
05:40:15:WU02:FS00:0xa7:WARNING:Unexpected exit() call
05:40:15:WU02:FS00:0xa7:WARNING:Unexpected exit from science code
05:40:15:WU02:FS00:0xa7:Saving result file ../logfile_01.txt
05:40:15:WU02:FS00:0xa7:Saving result file md.log
05:40:15:WU02:FS00:0xa7:Saving result file science.log
05:40:15:WU02:FS00:0xa7:Caught signal SIGSEGV(11) on PID 2902
05:40:15:WU02:FS00:0xa7:Caught signal SIGSEGV(11) on PID 2902
05:40:15:WU02:FS00:0xa7:Caught signal SIGSEGV(11) on PID 2902
05:40:15:WU02:FS00:0xa7:Caught signal SIGSEGV(11) on PID 2902
05:40:15:WU02:FS00:0xa7:Caught signal SIGSEGV(11) on PID 2902
05:40:15:WU02:FS00:0xa7:Caught signal SIGSEGV(11) on PID 2902
05:40:15:WU02:FS00:0xa7:Caught signal SIGSEGV(11) on PID 2902
05:40:15:WU02:FS00:0xa7:Caught signal SIGSEGV(11) on PID 2902
Crunchtimer
 
Posts: 43
Joined: Tue May 05, 2020 6:34 am

Re: SOLVED: CPU stuck at Ready, waiting for FahCore Run

Postby Crunchtimer » Fri Jun 12, 2020 8:35 pm

Joe_H wrote:Also, right-clicking on a folding slot in FAHControl gives you a menu where you can Pause that slot without pausing other folding slots on your system.

Thanks!
Crunchtimer
 
Posts: 43
Joined: Tue May 05, 2020 6:34 am

Re: SOLVED: CPU stuck at Ready, waiting for FahCore Run

Postby JimboPalmer » Fri Jun 12, 2020 9:09 pm

Welcome to Folding@Home!
Reading the log:
At first, we are trying for 22 threads

05:39:01:WU02:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/v7/lin/64bit/avx/Core_a7.fah/FahCore_a7 -dir 02 -suffix 01 -version 706 -lifeline 1411 -checkpoint 15 -np 22

Realizing that 22 is 2 times 11, it downshifts to 21:

05:39:02:WU02:FS00:0xa7:Reducing thread count from 22 to 21 to avoid domain decomposition with large prime factor 11
05:39:02:WU02:FS00:0xa7:Calling: mdrun -s frame17.tpr -o frame17.trr -x frame17.xtc -cpt 15 -nt 21

That does not work either:

05:39:03:WU02:FS00:0xa7:ERROR:Fatal error:
05:39:03:WU02:FS00:0xa7:ERROR:There is no domain decomposition for 16 ranks that is compatible with the given box and a minimum cell size of 1.4227 nm
05:39:03:WU02:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
05:39:03:WU02:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
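
If you would rather not read the log by eye, here is a minimal Python sketch that pulls the same three numbers out of it. The regular expressions are based only on the message wording shown above, and the function name summarize is made up for illustration; it is not part of any F@H tool.

```python
import re

# Wording taken from the log lines quoted above; other core versions may differ.
REDUCE_RE = re.compile(r"Reducing thread count from (\d+) to (\d+)")
DD_FAIL_RE = re.compile(r"There is no domain decomposition for (\d+) ranks")


def summarize(log_lines):
    """Return (requested, reduced, failed_ranks); an entry is None if not found."""
    requested = reduced = failed_ranks = None
    for line in log_lines:
        m = REDUCE_RE.search(line)
        if m:
            requested, reduced = int(m.group(1)), int(m.group(2))
        m = DD_FAIL_RE.search(line)
        if m:
            failed_ranks = int(m.group(1))
    return requested, reduced, failed_ranks


sample = [
    "05:39:02:WU02:FS00:0xa7:Reducing thread count from 22 to 21 to avoid "
    "domain decomposition with large prime factor 11",
    "05:39:03:WU02:FS00:0xa7:ERROR:There is no domain decomposition for 16 "
    "ranks that is compatible with the given box and a minimum cell size of 1.4227 nm",
]
print(summarize(sample))
```

For the lines quoted here it reports 22 requested threads, 21 after the reduction, and a failed decomposition for 16 ranks.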

My recommendation would be to manually set the number of threads (F@H calls them CPUs) to 18 (3*3*2, so no large primes).
Then, if you want, you could make another CPU slot with 4 threads.

Find and start FAHControl.
On the left of this screen is a Configure button. Click it.
Now you get a screen with a Slots tab. Click it.
In the white field there should be a cpu item. Click it, then click Edit.

By default F@H sets the number of CPUs to -1, meaning let the software decide.
You can enter any number from 1 to the number of threads your CPU supports.
If you have GPUs, F@H reserves one CPU per GPU to feed it data across the PCIe bus.
F@H has difficulty when the number of CPUs is a large prime or a multiple of one.
7 is always large, 5 is sometimes large, and 3 is never large. Try to choose a number whose only prime factors are 2 and/or 3.
1, 2, 3, 4, 6, 8, 9, 12, 16, and 18 are good numbers of CPUs to choose.
5, 10, 15, and 20 may work most of the time. Other numbers will bite you.

Type the number you want, and click save.
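
The rule of thumb above can be checked with a few lines of Python. This is only a sketch of the rule as stated in this post (largest prime factor 2 or 3 is good, 5 is sometimes fine, 7 and up is risky); it is not how the FahCore itself decides, and the names largest_prime_factor and classify are made up for illustration.

```python
def largest_prime_factor(n):
    """Largest prime factor of n (returns 1 for n == 1)."""
    largest = 1
    p = 2
    while p * p <= n:
        while n % p == 0:
            largest = p
            n //= p
        p += 1
    if n > 1:
        # Whatever is left after trial division is itself prime.
        largest = n
    return largest


def classify(n):
    """Apply the rule of thumb: 3 is never large, 5 sometimes, 7 and up always."""
    lpf = largest_prime_factor(n)
    if lpf <= 3:
        return "good"
    if lpf == 5:
        return "usually ok"
    return "risky"


for n in range(1, 23):
    print(n, classify(n))
```

Both 22 (2*11) and 21 (3*7) come out as risky, which matches what the log shows.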
Tsar of all the Rushers
I tried to remain childlike, all I achieved was childish.
A friend to those who want no friends
JimboPalmer
 
Posts: 1910
Joined: Mon Feb 16, 2009 5:12 am
Location: Greenwood MS USA

Re: SOLVED: CPU stuck at Ready, waiting for FahCore Run

Postby HaloJones » Fri Jun 12, 2020 9:20 pm

Two GTX1070s should alone be hitting 1.6-1.7m. I have two 1070s running off a G4400 with no cpu slot and that is currently at 1755641.
1x Titan X, 5x 1070, 1x 970, 1 x Ryzen 3600

HaloJones
 
Posts: 783
Joined: Thu Jul 24, 2008 11:16 am

Re: SOLVED: CPU stuck at Ready, waiting for FahCore Run

Postby Crunchtimer » Sat Jun 13, 2020 4:20 am

Thanks JimboPalmer for a thorough explanation of the F@H's CPU logic and for explaining the log file!
I now have 4 slots in total:
- 1 CPU slot with 18 CPUs (threads) assigned
- 1 CPU slot with 4 CPUs (threads) assigned
- 2 GPU slots with 1 CPU (thread) each

Now my case is solved :)
Crunchtimer
 
Posts: 43
Joined: Tue May 05, 2020 6:34 am

Re: SOLVED: CPU stuck at Ready, waiting for FahCore Run

Postby Crunchtimer » Sat Jun 13, 2020 4:30 am

HaloJones wrote:Two GTX1070s should alone be hitting 1.6-1.7m. I have two 1070s running off a G4400 with no cpu slot and that is currently at 1755641.


Impressive rig you've got!
Thanks to the sorting of my CPU/threads/slots-issue my rig is now producing 1.8MPPD :)

Hmmm, however, I don't think I'm really at your numbers of 1.75 MPPD for the 1070s alone.
Maybe a bit off topic, but what drivers are you using for the GPUs?
I'm running NVIDIA 440.64 and CUDA Version: 10.2 on Ubuntu OS.
Crunchtimer
 
Posts: 43
Joined: Tue May 05, 2020 6:34 am

Re: SOLVED: CPU stuck at Ready, waiting for FahCore Run

Postby bruce » Sat Jun 13, 2020 5:01 am

Crunchtimer wrote:- 1 CPU slot with 18 CPUs (threads) assigned
- 1 CPU slot with 4 CPUs (threads) assigned
- 2 GPU slots with 1 CPU (thread) each

That's a very impressive way to allocate your threads. :!:

Nevertheless, I'm a perfectionist and I'd recommend a very slight improvement. Instead of the first two being 18+4, I'd recommend 16+6. There may be rare instances when the assignment server cannot supply a WU for 18 threads, so that slot is forced to operate on fewer. Moreover, adding two more threads to the former 4 gives a bigger improvement to the science than reducing the 18 to 16 costs.
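
For anyone who wants to find such splits for a different thread budget, here is a small Python sketch. It assumes 22 threads are available for CPU slots (24 hardware threads minus one per GPU, as in this thread) and keeps only splits where both slot sizes have no prime factor other than 2 and 3. The helper names is_smooth and two_slot_splits are made up for illustration.

```python
def is_smooth(n):
    """True if n has no prime factor other than 2 and 3."""
    for p in (2, 3):
        while n % p == 0:
            n //= p
    return n == 1


def two_slot_splits(total):
    """All (big, small) splits of total where both parts are 2/3-smooth."""
    return [
        (a, total - a)
        for a in range(total - 1, total // 2 - 1, -1)
        if is_smooth(a) and is_smooth(total - a)
    ]


print(two_slot_splits(22))
```

For 22 threads the only candidates are 18+4 and 16+6; the sketch does not rank them, so the choice between the two still rests on the reasoning above.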
bruce
 
Posts: 19429
Joined: Thu Nov 29, 2007 11:13 pm
Location: So. Cal.

Re: SOLVED: CPU stuck at Ready, waiting for FahCore Run

Postby Crunchtimer » Sat Jun 13, 2020 7:24 am

bruce wrote:
Crunchtimer wrote:- 1 CPU slot with 18 CPUs (threads) assigned
- 1 CPU slot with 4 CPUs (threads) assigned
- 2 GPU slots with 1 CPU (thread) each

That's a very impressive way to allocate your threads. :!:

Nevertheless, I'm a perfectionist and I'd recommend a very slight improvement. Instead of the first two being 18+4, I'd recommend 16+6. There may be rare instances when the assignment server cannot supply a WU for 18 threads, so that slot is forced to operate on fewer. Moreover, adding two more threads to the former 4 gives a bigger improvement to the science than reducing the 18 to 16 costs.


Thanks Bruce!
I was looking for the kind of solution you proposed, but was obviously too tired (or slow) to find that combination of multiples of 2s and 3s adding up to 22 (having one slot finishing in ~3h and the other in ~12h didn't feel right).
I'd like to think it was due to the fact that it was 4am here in Sweden at the time of writing, and I was only up due to my neighbor having a party. So I guess one should seek to maximize the number of CPUs/threads per slot whilst using only multiples of 2s and 3s:

- 1 CPU slot with 16 CPUs (threads) assigned
- 1 CPU slot with 6 CPUs (threads) assigned
- 2 GPU slots with 1 CPU (thread) each

Thanks guys!
Crunchtimer
 
Posts: 43
Joined: Tue May 05, 2020 6:34 am

Re: SOLVED: CPU stuck at Ready, waiting for FahCore Run

Postby JimboPalmer » Sat Jun 13, 2020 7:58 am

And thank you for telling us what worked and what didn't. We need feedback and you always gave us feedback.
JimboPalmer
 
Posts: 1910
Joined: Mon Feb 16, 2009 5:12 am
Location: Greenwood MS USA

