(Solved) Problem with WU 14524 (551, 5, 1) on Linux client

Posted: Mon Mar 23, 2020 9:58 am
by carlosmorales777
Hello,
I came back to the project a week ago. One of my clients has been folding correctly using CPU or GPU slots.
However, WU 14524 (551, 5, 1) keeps failing to fold. This is a CPU slot (AMD Ryzen 5 3600).
Until now, this client had been running at full power without issues. Given the current situation, I have already tried changing the intensity to see whether that makes any difference, without success.

I would very much appreciate any help you could provide to sort out this issue.

UPDATE: The issue was solved by manually specifying the number of cores/threads dedicated to folding. The config.xml now looks as follows:

Code:

<slot id='0' type='CPU'>
  <cpus v='10'/>
</slot>

The output recorded in logfile_01.txt for this slot follows.

Code:

*********************** Log Started 2020-03-23T09:48:36Z ***********************
************************** Gromacs Folding@home Core ***************************
       Type: 0xa7
       Core: Gromacs
       Args: -dir 00 -suffix 01 -version 705 -lifeline 7664 -checkpoint 15 -np
             11
************************************ CBang *************************************
       Date: Nov 5 2019
       Time: 06:06:57
   Revision: 46c96f1aa8419571d83f3e63f9c99a0d602f6da9
     Branch: master
   Compiler: GNU 8.3.0
    Options: -std=c++11 -O3 -funroll-loops -fno-pie -fPIC
   Platform: linux2 4.19.0-5-amd64
       Bits: 64
       Mode: Release
************************************ System ************************************
        CPU: AMD Ryzen 5 3600 6-Core Processor
     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
       CPUs: 12
     Memory: 15.57GiB
Free Memory: 8.39GiB
    Threads: POSIX_THREADS
 OS Version: 5.5
Has Battery: false
 On Battery: false
 UTC Offset: 0
        PID: 7668
        CWD: /opt/fah/work
******************************** Build - libFAH ********************************
    Version: 0.0.18
     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
  Copyright: 2019 foldingathome.org
   Homepage: https://foldingathome.org/
       Date: Nov 5 2019
       Time: 06:13:26
   Revision: 490c9aa2957b725af319379424d5c5cb36efb656
     Branch: master
   Compiler: GNU 8.3.0
    Options: -std=c++11 -O3 -funroll-loops -fno-pie
   Platform: linux2 4.19.0-5-amd64
       Bits: 64
       Mode: Release
************************************ Build *************************************
       SIMD: avx_256
********************************************************************************
Project: 14524 (Run 551, Clone 5, Gen 1)
Unit: 0x0000000280fccb0a5e781bdd1e069ca6
Reading tar file core.xml
Reading tar file frame1.tpr
Digital signatures verified
Reducing thread count from 11 to 10 to avoid domain decomposition by a prime number > 3
Calling: mdrun -s frame1.tpr -o frame1.trr -x frame1.xtc -cpt 15 -nt 10
Steps: first=250000 total=250000
ERROR:
ERROR:-------------------------------------------------------
ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
ERROR:Source code file: /host/debian-stable-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/domdec.c, line: 6902
ERROR:
ERROR:Fatal error:
ERROR:There is no domain decomposition for 10 ranks that is compatible with the given box and a minimum cell size of 1.4227 nm
ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
ERROR:Look in the log file for details on the domain decomposition
ERROR:For more information and tips for troubleshooting, please check the GROMACS
ERROR:website at http://www.gromacs.org/Documentation/Errors
ERROR:-------------------------------------------------------
WARNING:Unexpected exit() call
WARNING:Unexpected exit from science code
Saving result file ../logfile_01.txt
Saving result file md.log
Saving result file science.log
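
For anyone checking their own logs for the same failure, here is a minimal Python sketch that looks for the two relevant lines quoted above. The function name `find_domain_decomposition_failure` and the `sample` text are hypothetical and only for illustration; the patterns are copied from this log and may not match other core versions.

```python
import re

# Patterns copied from the log lines quoted in this post (assumption:
# other core versions may word these messages differently).
NO_DD = re.compile(r"There is no domain decomposition for (\d+) ranks")
REDUCED = re.compile(r"Reducing thread count from (\d+) to (\d+)")


def find_domain_decomposition_failure(log_text):
    """Return (requested, used, failed_ranks), or None if the error is absent.

    requested and used are None when the log has no "Reducing thread count" line.
    """
    failed = NO_DD.search(log_text)
    if failed is None:
        return None
    reduced = REDUCED.search(log_text)
    requested = int(reduced.group(1)) if reduced else None
    used = int(reduced.group(2)) if reduced else None
    return requested, used, int(failed.group(1))


# Hypothetical sample built from three lines of the log above.
sample = """\
Reducing thread count from 11 to 10 to avoid domain decomposition by a prime number > 3
Calling: mdrun -s frame1.tpr -o frame1.trr -x frame1.xtc -cpt 15 -nt 10
ERROR:There is no domain decomposition for 10 ranks that is compatible with the given box and a minimum cell size of 1.4227 nm
"""

print(find_domain_decomposition_failure(sample))
```

On the sample, the tuple shows that 11 threads were requested, the core reduced this to 10, and GROMACS then rejected 10 ranks.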

Re: Problem with WU 14524 (551, 5, 1) on Linux client

Posted: Mon Mar 23, 2020 1:48 pm
by Joe_H
Some projects will not run on a multiple of 5 CPU threads. Such projects are normally identified during pre-release testing and kept from being assigned to systems that would use that setting. But some get through, or don't have the settings in the WU that prevent the use of 10 threads.

There is a workaround: if you move the slider to Light, the WU should start using 6 threads. Alternatively, use FAHControl to configure the CPU folding slot to use 8 or 9 CPU threads. To do this, in Configure select Slots, click on the CPU slot, and then click Edit. Change the CPU thread setting from -1 to 8 or 9, then click OK and Save. The CPU slot should pause and restart with the new setting.
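
As a rough way to pick a thread count by hand, the Python sketch below returns the largest count at or below a given maximum whose prime factors are only 2 and 3. This is not the client's or the core's actual algorithm. The "no prime factor above 3" rule is an assumption based on the log message and the multiple-of-5 problem described above, and the names `largest_prime_factor` and `safe_thread_count` are hypothetical.

```python
def largest_prime_factor(n):
    """Return the largest prime factor of n (1 for n == 1)."""
    largest, p = 1, 2
    while p * p <= n:
        while n % p == 0:
            largest, n = p, n // p
        p += 1
    # Whatever remains above 1 is a prime larger than any factor found so far.
    return max(largest, n) if n > 1 else largest


def safe_thread_count(max_threads):
    """Largest count <= max_threads with no prime factor above 3.

    Assumption for illustration only: counts built from 2s and 3s
    (such as 6, 8, 9, 12) are treated as safe.
    """
    if max_threads < 1:
        raise ValueError("max_threads must be at least 1")
    for n in range(max_threads, 0, -1):
        if largest_prime_factor(n) <= 3:
            return n


for limit in (12, 11, 10, 7):
    print(limit, "->", safe_thread_count(limit))
```

Under this rule, a limit of 11 or 10 threads gives 9, which matches one of the values suggested above, while 12 stays at 12.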

Re: (Solved) Problem with WU 14524 (551, 5, 1) on Linux client

Posted: Mon Mar 23, 2020 2:02 pm
by carlosmorales777
Thanks, you are right, Joe_H. That is what I did, and it solved the problem.
I then manually set the number back to a higher value, and it seems to be working correctly. However, it is still a multiple of 5, so it may fail again, but I will know how to proceed in the future.
Thanks again!