no domain decomposition for 25 ranks

Moderators: Site Moderators, FAHC Science Team

arvidn
Posts: 3
Joined: Fri Apr 03, 2020 11:33 am

no domain decomposition for 25 ranks

Post by arvidn »

The CPU WU I was assigned keeps failing over and over again, with this error:

11:37:13:WU00:FS00:0xa7:ERROR:-------------------------------------------------------
11:37:13:WU00:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
11:37:13:WU00:FS00:0xa7:ERROR:Source code file: /host/debian-stable-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/domdec.c, line: 6902
11:37:13:WU00:FS00:0xa7:ERROR:
11:37:13:WU00:FS00:0xa7:ERROR:Fatal error:
11:37:13:WU00:FS00:0xa7:ERROR:There is no domain decomposition for 25 ranks that is compatible with the given box and a minimum cell size of 1.37225 nm
11:37:13:WU00:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
11:37:13:WU00:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
11:37:13:WU00:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
11:37:13:WU00:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
11:37:13:WU00:FS00:0xa7:ERROR:-------------------------------------------------------
11:37:17:WU00:FS00:0xa7:WARNING:Unexpected exit() call
11:37:17:WU00:FS00:0xa7:WARNING:Unexpected exit from science code
There is no indication that this error message was uploaded to any server, but I also can't imagine I'm the target audience for it (because I don't know what it means).

The complete startup log for my CPU worker is:

11:37:13:WU00:FS00:0xa7:*********************** Log Started 2020-04-03T11:37:12Z ***********************
11:37:13:WU00:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
11:37:13:WU00:FS00:0xa7:       Type: 0xa7
11:37:13:WU00:FS00:0xa7:       Core: Gromacs
11:37:13:WU00:FS00:0xa7:       Args: -dir 00 -suffix 01 -version 705 -lifeline 24225 -checkpoint 15 -np
11:37:13:WU00:FS00:0xa7:             31
11:37:13:WU00:FS00:0xa7:************************************ CBang *************************************
11:37:13:WU00:FS00:0xa7:       Date: Nov 5 2019
11:37:13:WU00:FS00:0xa7:       Time: 06:06:57
11:37:13:WU00:FS00:0xa7:   Revision: 46c96f1aa8419571d83f3e63f9c99a0d602f6da9
11:37:13:WU00:FS00:0xa7:     Branch: master
11:37:13:WU00:FS00:0xa7:   Compiler: GNU 8.3.0
11:37:13:WU00:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie -fPIC
11:37:13:WU00:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
11:37:13:WU00:FS00:0xa7:       Bits: 64
11:37:13:WU00:FS00:0xa7:       Mode: Release
11:37:13:WU00:FS00:0xa7:************************************ System ************************************
11:37:13:WU00:FS00:0xa7:        CPU: AMD Ryzen Threadripper 1950X 16-Core Processor
11:37:13:WU00:FS00:0xa7:     CPU ID: AuthenticAMD Family 23 Model 1 Stepping 1
11:37:13:WU00:FS00:0xa7:       CPUs: 32
11:37:13:WU00:FS00:0xa7:     Memory: 78.57GiB
11:37:13:WU00:FS00:0xa7:Free Memory: 54.19GiB
11:37:13:WU00:FS00:0xa7:    Threads: POSIX_THREADS
11:37:13:WU00:FS00:0xa7: OS Version: 5.3
11:37:13:WU00:FS00:0xa7:Has Battery: false
11:37:13:WU00:FS00:0xa7: On Battery: false
11:37:13:WU00:FS00:0xa7: UTC Offset: 2
11:37:13:WU00:FS00:0xa7:        PID: 24229
11:37:13:WU00:FS00:0xa7:        CWD: /home/arvid/work
11:37:13:WU00:FS00:0xa7:******************************** Build - libFAH ********************************
11:37:13:WU00:FS00:0xa7:    Version: 0.0.18
11:37:13:WU00:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
11:37:13:WU00:FS00:0xa7:  Copyright: 2019 foldingathome.org
11:37:13:WU00:FS00:0xa7:   Homepage: https://foldingathome.org/
11:37:13:WU00:FS00:0xa7:       Date: Nov 5 2019
11:37:13:WU00:FS00:0xa7:       Time: 06:13:26
11:37:13:WU00:FS00:0xa7:   Revision: 490c9aa2957b725af319379424d5c5cb36efb656
11:37:13:WU00:FS00:0xa7:     Branch: master
11:37:13:WU00:FS00:0xa7:   Compiler: GNU 8.3.0
11:37:13:WU00:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie
11:37:13:WU00:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
11:37:13:WU00:FS00:0xa7:       Bits: 64
11:37:13:WU00:FS00:0xa7:       Mode: Release
11:37:13:WU00:FS00:0xa7:************************************ Build *************************************
11:37:13:WU00:FS00:0xa7:       SIMD: avx_256
11:37:13:WU00:FS00:0xa7:********************************************************************************
11:37:13:WU00:FS00:0xa7:Project: 14576 (Run 0, Clone 1499, Gen 6)
11:37:13:WU00:FS00:0xa7:Unit: 0x0000000e287234c95e7922369413e36c
11:37:13:WU00:FS00:0xa7:Reading tar file core.xml
11:37:13:WU00:FS00:0xa7:Reading tar file frame6.tpr
11:37:13:WU00:FS00:0xa7:Digital signatures verified
11:37:13:WU00:FS00:0xa7:Reducing thread count from 31 to 30 to avoid domain decomposition by a prime number > 3
11:37:13:WU00:FS00:0xa7:Calling: mdrun -s frame6.tpr -o frame6.trr -x frame6.xtc -cpt 15 -nt 30
Neil-B
Posts: 2027
Joined: Sun Mar 22, 2020 5:52 pm
Hardware configuration: 1: 2x Xeon E5-2697v3@2.60GHz, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent 20H2, Quadro K420 1GB, FAH 7.6.21
2: Xeon E3-1505Mv5@2.80GHz, 32GB DDR4, NVME, Win10 Pro 20H2, Quadro M1000M 2GB, FAH 7.6.21 (actually have two of these)
3: i7-960@3.20GHz, 12GB DDR3, SSD, Win10 Pro 20H2, GTX 750Ti 2GB, GTX 1080Ti 11GB, FAH 7.6.21
Location: UK

Re: no domain decomposition for 25 ranks

Post by Neil-B »

The issue is the 31-core count on the slot (see the large-prime notes below). The client tried to compensate by reducing it to 30 cores, but that is still divisible by 5, which I believe caused the issue. Changing the core count to 27 would avoid this, as would raising it to 32 if you are not GPU folding. You can configure the slot using the advanced control: change what I guess will be -1 against the CPU count to the number you choose.

borrowed from another post:

F@H has difficulty when the number of CPUs is a large prime or a multiple of one.
7 is always large, 5 is sometimes large, and 3 is never large. Try to choose a number that is a multiple of 2 and/or 3.
2, 3, 4, 6, 8, 9, 12, 16, 18, 24, 27, etc. are good numbers of CPUs to choose.
5, 10, 15, 20, etc. may work most of the time. Other numbers will bite you.
Type the number you want, and click save.
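
One rough way to apply the rule of thumb above is to look at the largest prime factor of the thread count. The sketch below only illustrates the quoted guideline; it is not how the FAH client or GROMACS actually decides, and the function names and labels are made up for this example.

```python
def largest_prime_factor(n):
    """Return the largest prime factor of n (1 for n == 1)."""
    largest = 1
    factor = 2
    while factor * factor <= n:
        while n % factor == 0:
            largest = factor
            n //= factor
        factor += 1
    if n > 1:
        # Whatever is left after trial division is itself prime.
        largest = n
    return largest


def rate_thread_count(n):
    """Label a thread count using the '3 never, 5 sometimes, 7 always' rule."""
    p = largest_prime_factor(n)
    if p <= 3:
        return "good"
    if p == 5:
        return "may work"
    return "avoid"


for n in (24, 27, 30, 31, 32):
    print(n, rate_thread_count(n))
```

By this rule 30 only rates "may work", so the client's automatic reduction from 31 to 30 is not guaranteed to be enough.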
Joe_H
Site Admin
Posts: 7870
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: no domain decomposition for 25 ranks

Post by Joe_H »

First, is the particular WU, Project: 14576 (Run 0, Clone 1499, Gen 6), still on your system? The log sections are enough to see what the basic problem was, but not whether the WU exited with a failure and uploaded. Sometimes the error is not detected as severe enough to exit completely, and the client retries multiple times.

It looks like this project should have been set with a maximum number of CPU threads. The system is too small to work with a decomposition over 30 threads, or over the 25 ranks shown in the first section of the log you posted. I will notify the person running this project.

If this WU is still attempting to run, I can give you directions on how to remove it. Or if you pause the processing and set the number of threads down to a lower number, 12 probably, it should work.
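
To illustrate why a small system cannot be split over many ranks, here is a simplified sketch of the check behind the error message: every rank gets one cell of an nx × ny × nz grid, and no cell may be narrower than the minimum cell size. The 6.0 nm cubic box is a hypothetical value chosen for illustration (the posted log does not give the real box), and the actual GROMACS logic has more constraints than this. The rank count in the error (25) is lower than the thread count (30), most likely because GROMACS sets some ranks aside for PME; that is an assumption, the posted log does not say so.

```python
def feasible_grids(n_ranks, box_nm, min_cell_nm):
    """List (nx, ny, nz) grids with nx*ny*nz == n_ranks whose cells
    are at least min_cell_nm wide along every box dimension."""
    grids = []
    for nx in range(1, n_ranks + 1):
        if n_ranks % nx:
            continue
        rest = n_ranks // nx
        for ny in range(1, rest + 1):
            if rest % ny:
                continue
            nz = rest // ny
            cells = (nx, ny, nz)
            if all(b / n >= min_cell_nm for b, n in zip(box_nm, cells)):
                grids.append(cells)
    return grids


BOX = (6.0, 6.0, 6.0)   # hypothetical box size in nm, NOT taken from the log
MIN_CELL = 1.37225      # minimum cell size in nm, from the error message

for ranks in (25, 40, 12):
    print(ranks, feasible_grids(ranks, BOX, MIN_CELL))
```

With this assumed box at most 4 cells fit along each side, so any rank count with a factor of 5 (such as 25 or 40) has no valid grid, while 12 has several.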
Neil-B

Re: no domain decomposition for 25 ranks

Post by Neil-B »

Damn … was looking at the main header and forgot to check the error itself - sorry
arvidn

Re: no domain decomposition for 25 ranks

Post by arvidn »

There is no "advanced control" in the web UI (as far as I can tell).
I cannot find it documented anywhere how to change the number of CPUs. I suspect it involves editing `/etc/fahclient/config.xml`, but that's about how far I've come.
FAHControl does not work on my system (Ubuntu 19.10); it fails with "No module named fah".
arvidn

Re: no domain decomposition for 25 ranks

Post by arvidn »

the full log is:

14:13:15:WU00:FS00:Starting
14:13:15:WU00:FS00:Removing old file './work/00/logfile_01-20200403-134115.txt'
14:13:15:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /home/arvid/cores/cores.foldingathome.org/v7/lin/64bit/avx/Core_a7.fah/FahCore_a7 -dir 00 -suffix 01 -version 705 -lifeline 9819 -checkpoint 15 -np 31
14:13:15:WU00:FS00:Started FahCore on PID 45849
14:13:15:WU00:FS00:Core PID:45853
14:13:15:WU00:FS00:FahCore 0xa7 started
14:13:16:WU00:FS00:0xa7:*********************** Log Started 2020-04-03T14:13:15Z ***********************
14:13:16:WU00:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
14:13:16:WU00:FS00:0xa7:       Type: 0xa7
14:13:16:WU00:FS00:0xa7:       Core: Gromacs
14:13:16:WU00:FS00:0xa7:       Args: -dir 00 -suffix 01 -version 705 -lifeline 45849 -checkpoint 15 -np
14:13:16:WU00:FS00:0xa7:             31
14:13:16:WU00:FS00:0xa7:************************************ CBang *************************************
14:13:16:WU00:FS00:0xa7:       Date: Nov 5 2019
14:13:16:WU00:FS00:0xa7:       Time: 06:06:57
14:13:16:WU00:FS00:0xa7:   Revision: 46c96f1aa8419571d83f3e63f9c99a0d602f6da9
14:13:16:WU00:FS00:0xa7:     Branch: master
14:13:16:WU00:FS00:0xa7:   Compiler: GNU 8.3.0
14:13:16:WU00:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie -fPIC
14:13:16:WU00:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
14:13:16:WU00:FS00:0xa7:       Bits: 64
14:13:16:WU00:FS00:0xa7:       Mode: Release
14:13:16:WU00:FS00:0xa7:************************************ System ************************************
14:13:16:WU00:FS00:0xa7:        CPU: AMD Ryzen Threadripper 1950X 16-Core Processor
14:13:16:WU00:FS00:0xa7:     CPU ID: AuthenticAMD Family 23 Model 1 Stepping 1
14:13:16:WU00:FS00:0xa7:       CPUs: 32
14:13:16:WU00:FS00:0xa7:     Memory: 78.57GiB
14:13:16:WU00:FS00:0xa7:Free Memory: 54.30GiB
14:13:16:WU00:FS00:0xa7:    Threads: POSIX_THREADS
14:13:16:WU00:FS00:0xa7: OS Version: 5.3
14:13:16:WU00:FS00:0xa7:Has Battery: false
14:13:16:WU00:FS00:0xa7: On Battery: false
14:13:16:WU00:FS00:0xa7: UTC Offset: 2
14:13:16:WU00:FS00:0xa7:        PID: 45853
14:13:16:WU00:FS00:0xa7:        CWD: /home/arvid/work
14:13:16:WU00:FS00:0xa7:******************************** Build - libFAH ********************************
14:13:16:WU00:FS00:0xa7:    Version: 0.0.18
14:13:16:WU00:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
14:13:16:WU00:FS00:0xa7:  Copyright: 2019 foldingathome.org
14:13:16:WU00:FS00:0xa7:   Homepage: https://foldingathome.org/
14:13:16:WU00:FS00:0xa7:       Date: Nov 5 2019
14:13:16:WU00:FS00:0xa7:       Time: 06:13:26
14:13:16:WU00:FS00:0xa7:   Revision: 490c9aa2957b725af319379424d5c5cb36efb656
14:13:16:WU00:FS00:0xa7:     Branch: master
14:13:16:WU00:FS00:0xa7:   Compiler: GNU 8.3.0
14:13:16:WU00:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie
14:13:16:WU00:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
14:13:16:WU00:FS00:0xa7:       Bits: 64
14:13:16:WU00:FS00:0xa7:       Mode: Release
14:13:16:WU00:FS00:0xa7:************************************ Build *************************************
14:13:16:WU00:FS00:0xa7:       SIMD: avx_256
14:13:16:WU00:FS00:0xa7:********************************************************************************
14:13:16:WU00:FS00:0xa7:Project: 14576 (Run 0, Clone 1499, Gen 6)
14:13:16:WU00:FS00:0xa7:Unit: 0x0000000e287234c95e7922369413e36c
14:13:16:WU00:FS00:0xa7:Reading tar file core.xml
14:13:16:WU00:FS00:0xa7:Reading tar file frame6.tpr
14:13:16:WU00:FS00:0xa7:Digital signatures verified
14:13:16:WU00:FS00:0xa7:Reducing thread count from 31 to 30 to avoid domain decomposition by a prime number > 3
14:13:16:WU00:FS00:0xa7:Calling: mdrun -s frame6.tpr -o frame6.trr -x frame6.xtc -cpt 15 -nt 30
14:13:16:WU00:FS00:0xa7:Steps: first=3000000 total=500000
14:13:16:WU00:FS00:0xa7:ERROR:
14:13:16:WU00:FS00:0xa7:ERROR:-------------------------------------------------------
14:13:16:WU00:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
14:13:16:WU00:FS00:0xa7:ERROR:Source code file: /host/debian-stable-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/domdec.c, line: 6902
14:13:16:WU00:FS00:0xa7:ERROR:
14:13:16:WU00:FS00:0xa7:ERROR:Fatal error:
14:13:16:WU00:FS00:0xa7:ERROR:There is no domain decomposition for 25 ranks that is compatible with the given box and a minimum cell size of 1.37225 nm
14:13:16:WU00:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
14:13:16:WU00:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
14:13:16:WU00:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
14:13:16:WU00:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
14:13:16:WU00:FS00:0xa7:ERROR:-------------------------------------------------------
14:13:21:WU00:FS00:0xa7:WARNING:Unexpected exit() call
14:13:21:WU00:FS00:0xa7:WARNING:Unexpected exit from science code
14:13:21:WU00:FS00:0xa7:Saving result file ../logfile_01.txt
14:13:21:WU00:FS00:0xa7:Saving result file md.log
14:13:21:WU00:FS00:0xa7:Saving result file science.log
14:13:21:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
science.log seems to contain mostly the same error message.
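
For anyone comparing their own logs, the key numbers can be pulled out with a couple of regular expressions. This is a minimal sketch that assumes the log lines have the same wording as the ones posted above; the `summarize` helper is a name invented for this example.

```python
import re

# Two lines copied from the log above.
LOG = """\
14:13:16:WU00:FS00:0xa7:Calling: mdrun -s frame6.tpr -o frame6.trr -x frame6.xtc -cpt 15 -nt 30
14:13:16:WU00:FS00:0xa7:ERROR:There is no domain decomposition for 25 ranks that is compatible with the given box and a minimum cell size of 1.37225 nm
"""


def summarize(log_text):
    """Extract the mdrun thread count and the rank count / cell size from the error."""
    threads = re.search(r"Calling: mdrun .*-nt (\d+)", log_text)
    error = re.search(
        r"no domain decomposition for (\d+) ranks .* minimum cell size of ([\d.]+) nm",
        log_text,
    )
    return {
        "threads": int(threads.group(1)) if threads else None,
        "ranks": int(error.group(1)) if error else None,
        "min_cell_nm": float(error.group(2)) if error else None,
    }


print(summarize(LOG))
```

This makes the mismatch easy to spot: mdrun was started with 30 threads, but the error talks about 25 ranks.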
Joe_H

Re: no domain decomposition for 25 ranks

Post by Joe_H »

Advanced Control is another name for FAHControl, separate from the Web Control. If you are on one of the Linux versions where you have not loaded FAHControl because of dependencies, try moving the slider to Light; that will halve the number of CPU threads the WU runs on.

Otherwise it would take editing the config.xml file and restarting.
TinyTRexArmz
Posts: 3
Joined: Tue Mar 24, 2020 10:43 pm

Re: no domain decomposition for 25 ranks

Post by TinyTRexArmz »

My number cruncher is doing the same thing for the same project number.
Here's one loop of my full log:

01:40:40:WU00:FS00:0xa7:*********************** Log Started 2020-04-07T01:40:40Z ***********************
01:40:40:WU00:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
01:40:40:WU00:FS00:0xa7:       Type: 0xa7
01:40:40:WU00:FS00:0xa7:       Core: Gromacs
01:40:40:WU00:FS00:0xa7:       Args: -dir 00 -suffix 01 -version 705 -lifeline 28772 -checkpoint 15 -np
01:40:40:WU00:FS00:0xa7:             48
01:40:40:WU00:FS00:0xa7:************************************ CBang *************************************
01:40:40:WU00:FS00:0xa7:       Date: Nov 5 2019
01:40:40:WU00:FS00:0xa7:       Time: 06:06:57
01:40:40:WU00:FS00:0xa7:   Revision: 46c96f1aa8419571d83f3e63f9c99a0d602f6da9
01:40:40:WU00:FS00:0xa7:     Branch: master
01:40:40:WU00:FS00:0xa7:   Compiler: GNU 8.3.0
01:40:40:WU00:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie -fPIC
01:40:40:WU00:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
01:40:40:WU00:FS00:0xa7:       Bits: 64
01:40:40:WU00:FS00:0xa7:       Mode: Release
01:40:40:WU00:FS00:0xa7:************************************ System ************************************
01:40:40:WU00:FS00:0xa7:        CPU: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
01:40:40:WU00:FS00:0xa7:     CPU ID: GenuineIntel Family 6 Model 63 Stepping 0
01:40:40:WU00:FS00:0xa7:       CPUs: 48
01:40:40:WU00:FS00:0xa7:     Memory: 3.84GiB
01:40:40:WU00:FS00:0xa7:Free Memory: 1.75GiB
01:40:40:WU00:FS00:0xa7:    Threads: POSIX_THREADS
01:40:40:WU00:FS00:0xa7: OS Version: 5.3
01:40:40:WU00:FS00:0xa7:Has Battery: false
01:40:40:WU00:FS00:0xa7: On Battery: false
01:40:40:WU00:FS00:0xa7: UTC Offset: -5
01:40:40:WU00:FS00:0xa7:        PID: 28776
01:40:40:WU00:FS00:0xa7:        CWD: /var/lib/fahclient/work
01:40:40:WU00:FS00:0xa7:******************************** Build - libFAH ********************************
01:40:40:WU00:FS00:0xa7:    Version: 0.0.18
01:40:40:WU00:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
01:40:40:WU00:FS00:0xa7:  Copyright: 2019 foldingathome.org
01:40:40:WU00:FS00:0xa7:   Homepage: https://foldingathome.org/
01:40:40:WU00:FS00:0xa7:       Date: Nov 5 2019
01:40:40:WU00:FS00:0xa7:       Time: 06:13:26
01:40:40:WU00:FS00:0xa7:   Revision: 490c9aa2957b725af319379424d5c5cb36efb656
01:40:40:WU00:FS00:0xa7:     Branch: master
01:40:40:WU00:FS00:0xa7:   Compiler: GNU 8.3.0
01:40:40:WU00:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie
01:40:40:WU00:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
01:40:40:WU00:FS00:0xa7:       Bits: 64
01:40:40:WU00:FS00:0xa7:       Mode: Release
01:40:40:WU00:FS00:0xa7:************************************ Build *************************************
01:40:40:WU00:FS00:0xa7:       SIMD: avx_256
01:40:40:WU00:FS00:0xa7:********************************************************************************
01:40:40:WU00:FS00:0xa7:Project: 14576 (Run 0, Clone 4639, Gen 22)
01:40:40:WU00:FS00:0xa7:Unit: 0x0000001f287234c95e7b86777fe91793
01:40:40:WU00:FS00:0xa7:Reading tar file core.xml
01:40:40:WU00:FS00:0xa7:Reading tar file frame22.tpr
01:40:40:WU00:FS00:0xa7:Digital signatures verified
01:40:40:WU00:FS00:0xa7:Calling: mdrun -s frame22.tpr -o frame22.trr -x frame22.xtc -cpt 15 -nt 48
01:40:40:WU00:FS00:0xa7:Steps: first=11000000 total=500000
01:40:40:WU00:FS00:0xa7:ERROR:
01:40:40:WU00:FS00:0xa7:ERROR:-------------------------------------------------------
01:40:40:WU00:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
01:40:40:WU00:FS00:0xa7:ERROR:Source code file: /host/debian-stable-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/domdec.c, line: 6902
01:40:40:WU00:FS00:0xa7:ERROR:
01:40:40:WU00:FS00:0xa7:ERROR:Fatal error:
01:40:40:WU00:FS00:0xa7:ERROR:There is no domain decomposition for 40 ranks that is compatible with the given box and a minimum cell size of 1.37225 nm
01:40:40:WU00:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
01:40:40:WU00:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
01:40:40:WU00:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
01:40:40:WU00:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
01:40:40:WU00:FS00:0xa7:ERROR:-------------------------------------------------------
01:40:45:WU00:FS00:0xa7:WARNING:Unexpected exit() call
01:40:45:WU00:FS00:0xa7:WARNING:Unexpected exit from science code
01:40:45:WU00:FS00:0xa7:Saving result file ../logfile_01.txt
01:40:45:WU00:FS00:0xa7:Saving result file md.log
01:40:45:WU00:FS00:0xa7:Saving result file science.log
01:40:45:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: no domain decomposition for 25 ranks

Post by bruce »

arvidn wrote: There is no "advanced control" in the web UI (as far as I can tell).
I cannot find it documented anywhere how to change the number of CPUs. I suspect it involves editing `/etc/fahclient/config.xml`, but that's about how far I've come.
FAHControl does not work on my system (Ubuntu 19.10), it fails with "No module named fah"
FAHControl (aka Advanced Control) is installed separately. Alternatively, if you have a Windows machine on your LAN, you can establish a link between the two machines.

And yes, otherwise you can edit /etc/fahclient/config.xml
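
As a rough illustration of that edit, the sketch below sets the thread count on a CPU slot. It assumes the v7 client stores the setting as a `<cpus v='N'/>` element inside the `<slot>` element; check that against your own config.xml before relying on it. The sample XML is a made-up minimal file, the `set_cpu_threads` helper is hypothetical, and 12 is just the value suggested earlier in the thread. The sketch works on a string, not on the real file, and the client needs a restart after any change.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal config; a real config.xml contains more options.
SAMPLE = """<config>
  <slot id='0' type='CPU'/>
</config>"""


def set_cpu_threads(config_xml, slot_id, threads):
    """Return config_xml with <cpus v='threads'/> set on the given CPU slot."""
    root = ET.fromstring(config_xml)
    for slot in root.iter("slot"):
        if slot.get("id") == str(slot_id) and slot.get("type") == "CPU":
            cpus = slot.find("cpus")
            if cpus is None:
                # No explicit thread count yet: add the element (assumed format).
                cpus = ET.SubElement(slot, "cpus")
            cpus.set("v", str(threads))
    return ET.tostring(root, encoding="unicode")


updated = set_cpu_threads(SAMPLE, 0, 12)
print(updated)
```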
foldy
Posts: 2061
Joined: Sat Dec 01, 2012 3:43 pm
Hardware configuration: Folding@Home Client 7.6.13 (1 GPU slots)
Windows 7 64bit
Intel Core i5 2500k@4Ghz
Nvidia gtx 1080ti driver 441

Re: no domain decomposition for 25 ranks

Post by foldy »

As a permanent workaround, I divided the CPU threads into several CPU slots. For a machine with 32 CPU threads I created 2 CPU slots with 14 threads each, and for a machine with 96 CPU threads I created 3 CPU slots with 28 threads each.
TinyTRexArmz

Re: no domain decomposition for 25 ranks

Post by TinyTRexArmz »

Lowering the thread count and increasing the slots did the trick for me. Thanks