There is no domain decomposition for n ranks error

Moderators: Site Moderators, FAHC Science Team

Postby mariamtriki » Tue Mar 31, 2020 12:48 pm

I am running spare large machines with the config below, but they all hit the same error:
Fatal error:
There is no domain decomposition for 72 ranks that is compatible with the given box and a minimum cell size of 1.37225 nm



Code:
*********************** Log Started 2020-03-27T23:46:56Z ***********************
23:46:56:************************* Folding@home Client *************************
23:46:56:    Website: http://folding.stanford.edu/
23:46:56:  Copyright: (c) 2009-2014 Stanford University
23:46:56:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
23:46:56:       Args: --child --lifeline 2659 /etc/fahclient/config.xml --run-as
23:46:56:             fahclient --pid-file=/var/run/fahclient.pid --daemon
23:46:56:     Config: /etc/fahclient/config.xml
23:46:56:******************************** Build ********************************
23:46:56:    Version: 7.4.4
23:46:56:       Date: Mar 4 2014
23:46:56:       Time: 12:02:38
23:46:56:    SVN Rev: 4130
23:46:56:     Branch: fah/trunk/client
23:46:56:   Compiler: GNU 4.4.7
23:46:56:    Options: -std=gnu++98 -O3 -funroll-loops -mfpmath=sse -ffast-math
23:46:56:             -fno-unsafe-math-optimizations -msse2
23:46:56:   Platform: linux2 3.2.0-1-amd64
23:46:56:       Bits: 64
23:46:56:       Mode: Release
23:46:56:******************************* System ********************************
23:46:56:        CPU: Intel(R) Xeon(R) CPU @ 2.00GHz
23:46:56:     CPU ID: GenuineIntel Family 6 Model 85 Stepping 3
23:46:56:       CPUs: 96
23:46:56:     Memory: 1.38TiB
23:46:56:Free Memory: 1.38TiB
23:46:56:    Threads: POSIX_THREADS
23:46:56: OS Version: 4.15
23:46:56:Has Battery: false
23:46:56: On Battery: false
23:46:56: UTC Offset: 0
23:46:56:        PID: 2664
23:46:56:        CWD: /var/lib/fahclient
23:46:56:         OS: Linux 4.15.0-1058-gcp x86_64
23:46:56:    OS Arch: AMD64
23:46:56:       GPUs: 0
23:46:56:       CUDA: Not detected
23:46:56:***********************************************************************


Code:
23:49:46:WU00:FS00:0xa7:ERROR:Source code file: /host/debian-stable-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/domdec.c, line: 6902
23:49:46:WU00:FS00:0xa7:ERROR:
23:49:46:WU00:FS00:0xa7:ERROR:Fatal error:
23:49:46:WU00:FS00:0xa7:ERROR:There is no domain decomposition for 72 ranks that is compatible with the given box and a minimum cell size of 1.37225 nm
23:49:46:WU00:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
23:49:46:WU00:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
23:49:46:WU00:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
23:49:46:WU00:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
23:49:46:WU00:FS00:0xa7:ERROR:------------------------------------------------------
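
To illustrate what this error is saying, here is a rough Python sketch. It is not the actual GROMACS algorithm. It assumes a rectangular box that is cut into a grid of cells, one per rank, with every cell at least the minimum size along each axis; the real code applies more constraints and may set some ranks aside for other work. Only the 72 ranks and the 1.37225 nm minimum cell size come from the log above. The box dimensions are hypothetical, since the log does not show them.

```python
from itertools import product


def feasible_grids(ranks, box_nm, min_cell_nm):
    """Return all (nx, ny, nz) grids with nx*ny*nz == ranks whose cells are
    at least min_cell_nm wide along every axis (simplified model)."""
    # Largest number of cells that fits along each box edge.
    max_cells = [int(edge // min_cell_nm) for edge in box_nm]
    grids = []
    for nx, ny in product(range(1, max_cells[0] + 1),
                          range(1, max_cells[1] + 1)):
        if ranks % (nx * ny) != 0:
            continue
        nz = ranks // (nx * ny)
        if nz <= max_cells[2]:
            grids.append((nx, ny, nz))
    return grids


MIN_CELL_NM = 1.37225                   # taken from the error message
HYPOTHETICAL_BOX_NM = (5.0, 5.0, 5.0)   # assumption: not shown in the log

for ranks in (72, 32, 18, 12):
    grids = feasible_grids(ranks, HYPOTHETICAL_BOX_NM, MIN_CELL_NM)
    print(ranks, "ranks:", grids if grids else "no valid grid")
```

In this toy model a small box simply has no room for 72 cells of the required size, which is why lowering the thread count for the slot is the usual workaround.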
mariamtriki
 
Posts: 3
Joined: Tue Mar 31, 2020 12:25 pm

Re: There is no domain decomposition for n ranks error

Postby JimboPalmer » Tue Mar 31, 2020 4:56 pm

Welcome to Folding@Home!

This is a case where your hardware is much better than the researchers expected.

The software does not expect more than 32 cores, and/or the researcher has not enabled more than 32 cores.

96 cores could be a cpu slot of 32, another cpu slot of 32, and still another cpu slot of 32.

I am just a mere Windows user, so I do not know how to start the Advanced Control on Linux, but once it is running, here is the boilerplate advice I offer:

On this screen to the left is a Configure button, click it

Now you get a screen with a Slots tab, click it

On this white field should be a cpu item, click it and then click edit

By default F@H sets the number of CPUs to -1, meaning let the software decide.
You can enter any number from 1 to the number of threads your CPU supports. (32 for you)

If you have GPUs, F@H reserves one CPU per GPU to feed it data across the PCIE bus. (you don't)

F@H has difficulty with CPU counts that are large primes or multiples of large primes.
7 is always large, 5 is sometimes large, and 3 is never large. Try to choose a number whose only prime factors are 2 and/or 3.
2, 3, 4, 6, 8, 9, 12, 16, 18, 24, 27, etc. are good numbers of CPUs to choose.
5, 10, 15, 20, etc. may work most of the time. Other numbers will bite you.
Type the number you want, and click save.
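
The rule of thumb above can be sketched in Python. This is only an illustration of the heuristic as stated in this post, not anything the client itself runs; the function names and the three labels are made up for the example.

```python
def prime_factors(n):
    """Return the prime factors of n in ascending order, with repeats."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def rate_cpu_count(n):
    """Classify a CPU count by its largest prime factor (rule of thumb)."""
    largest = max(prime_factors(n), default=1)
    if largest <= 3:
        return "good"
    if largest == 5:
        return "may work"
    return "likely trouble"


for n in (24, 20, 28, 32, 72):
    print(n, prime_factors(n), rate_cpu_count(n))
```

Note that 72 has only 2 and 3 as prime factors, so it passes this check even though it failed in the log above. The heuristic is a starting point, not a guarantee.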

F@H may have issues with CPU counts over 32. If I had more than 32 CPUs, I would make multiple cpu slots. There is an Add button on the Slots screen. (Add 2 more cpu slots and set them to 32 each.)

With the twin caveats that I only have 4 cpus and run Windows, that should be right.
Tsar of all the Rushers
I tried to remain childlike, all I achieved was childish.
A friend to those who want no friends
JimboPalmer
 
Posts: 2041
Joined: Mon Feb 16, 2009 5:12 am
Location: Greenwood MS USA

Re: There is no domain decomposition for n ranks error

Postby Joe_H » Tue Mar 31, 2020 9:14 pm

Directions given are all basically correct. Some projects will use more than 32 cores, but most of the larger simulations have gone to GPU folding.

As you are using version 7.4.4 of the client, the automatic assignment of CPU WUs that fold on fewer threads than requested will not happen. If no WUs can be assigned for that thread count, the request fails. The current 7.5.1 client will negotiate with the servers and take WUs for fewer threads than requested.

If you have the Project number for the WU that failed with that message, you can post that information here and the servers will be updated to limit assignment to not include that number of threads.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Joe_H
Site Admin
 
Posts: 6608
Joined: Tue Apr 21, 2009 5:41 pm
Location: W. MA

Re: There is no domain decomposition for n ranks error

Postby mariamtriki » Wed Apr 01, 2020 5:21 pm

I updated the machines to use v7.5.1 and I am still seeing the same error. Also, is there a way I can update these parameters in the config file instead of using the graphical interface?

Where can I find the project #?
mariamtriki
 
Posts: 3
Joined: Tue Mar 31, 2020 12:25 pm

Re: There is no domain decomposition for n ranks error

Postby mariamtriki » Wed Apr 01, 2020 7:14 pm

I updated the config file to the format below, but the client fails every time I try to start it.

Code:
<config>
  <!-- Client Control -->
  <fold-anon v='true'/>

  <!-- Folding Slot Configuration -->
  <gpu v='false'/>

  <!-- Slot Control -->
  <power v='full'/>

  <!-- User Information -->
  <passkey v='5f717418ea7f6bcc5f717418ea7f6bcc'/>
  <team v='229957'/>
  <user v='Multi_Cloud_Galaxy'/>

  <!-- Folding Slots -->
  <slot id='0' type='CPU'>
   <cpus v='32'/>
  </slot>
  <slot id='1' type='CPU'>
   <cpus v='32'/>
  </slot>
  <slot id='2' type='CPU'>
   <cpus v='32'/>
  </slot>
</config>




mariamtriki
 
Posts: 3
Joined: Tue Mar 31, 2020 12:25 pm

Re: There is no domain decomposition for n ranks error

Postby bruce » Wed Apr 01, 2020 7:34 pm

The projects that have WUs available will vary, and you would need a crystal ball to get it right with a high probability. I would probably split a couple of the 32-CPU slots into 16-CPU slots, but I can't give you a good reason for doing that.

Yes, you can update the parameters with an editor or with the telnet interface. Be careful with the editor, though, because even though XML is called a human-readable format, it's being used in a machine-readable environment. Unexpected variations from the strict format supported by FAHClient can send you off on a debugging mission to work out what you changed and why FAHClient balked.
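
One cheap sanity check after hand-editing is to confirm the file is at least well-formed XML and to list the slots it defines. The Python sketch below does only that. It cannot tell you whether FAHClient accepts the options, and the function name and the sample text are made up for the example.

```python
import xml.etree.ElementTree as ET


def summarize_config(xml_text):
    """Parse FAHClient-style config XML and return [(slot id, type, cpus)].

    Raises xml.etree.ElementTree.ParseError if the text is not well-formed.
    """
    root = ET.fromstring(xml_text)
    slots = []
    for slot in root.iter("slot"):
        cpus = slot.find("cpus")
        slots.append((slot.get("id"), slot.get("type"),
                      cpus.get("v") if cpus is not None else None))
    return slots


# Illustrative sample, trimmed from the config posted earlier in the thread.
SAMPLE = """<config>
  <slot id='0' type='CPU'>
    <cpus v='32'/>
  </slot>
  <slot id='1' type='CPU'>
    <cpus v='32'/>
  </slot>
</config>"""

print(summarize_config(SAMPLE))
```

To check a real file, read its contents first (the startup log earlier in the thread shows /etc/fahclient/config.xml) and pass the text to summarize_config.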

You can telnet <hostname> 36330 into the client and make changes using the interpreter, which is more or less self-documenting through its help files.
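
If telnet is not installed, the same port can be reached from a short script. The following is a minimal Python sketch, assuming the client listens on port 36330 as described above and answers a plain-text help command. The function name is made up, the reply will include whatever greeting or prompt the client sends, and the client may restrict which addresses are allowed to connect.

```python
import socket


def fah_command(command, host="localhost", port=36330, timeout=2.0):
    """Send one command to the client's command port and return the reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii") + b"\n")
        chunks = []
        try:
            while True:
                data = sock.recv(4096)
                if not data:
                    break  # the other side closed the connection
                chunks.append(data)
        except socket.timeout:
            pass  # nothing more arrived within the timeout; stop reading
    return b"".join(chunks).decode("utf-8", errors="replace")


if __name__ == "__main__":
    try:
        print(fah_command("help"))
    except OSError as exc:
        print("could not reach the client:", exc)
```

The read loop stops on a timeout because the interpreter keeps the connection open after answering, so there is no end-of-reply signal to wait for in this simple version.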
bruce
 
Posts: 20009
Joined: Thu Nov 29, 2007 11:13 pm
Location: So. Cal.

Re: There is no domain decomposition for n ranks error

Postby Joe_H » Wed Apr 01, 2020 10:05 pm

mariamtriki wrote:Where can I find the project #?

The project number for the one that failed should be in one of theology files. The current one, log.txt, is in the directory used by F@h for work and data files. By default the client also keeps the 16 most recent logs in a directory named logs in the same location.
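
A quick way to pull project numbers out of a log is a small script. The Python sketch below assumes the client writes lines shaped like "Project: N (Run R, Clone C, Gen G)" after the WU and slot tags. That line shape is an assumption, and the sample lines and numbers are invented for illustration, so adjust the pattern to match what log.txt actually contains.

```python
import re

# Assumed shape: "<time>:WU00:FS00:0xa7:Project: N (Run R, Clone C, Gen G)"
PROJECT_RE = re.compile(
    r"(WU\d+):(FS\d+):.*?Project:\s*(\d+)\s*"
    r"\(Run (\d+), Clone (\d+), Gen (\d+)\)"
)


def find_projects(log_text):
    """Return (wu, slot, project, run, clone, gen) for each matching line."""
    results = []
    for line in log_text.splitlines():
        match = PROJECT_RE.search(line)
        if match:
            wu, slot, *numbers = match.groups()
            results.append((wu, slot, *map(int, numbers)))
    return results


# Invented sample lines; the project number is a placeholder.
SAMPLE_LOG = """12:00:00:WU00:FS00:0xa7:Project: 99999 (Run 1, Clone 2, Gen 3)
12:00:05:WU00:FS00:0xa7:ERROR:Fatal error:
"""

for entry in find_projects(SAMPLE_LOG):
    print(entry)
```

To run it against the real log, read log.txt from the client's working directory (the startup log earlier in the thread shows /var/lib/fahclient) and pass its contents to find_projects.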
Joe_H
Site Admin
 
Posts: 6608
Joined: Tue Apr 21, 2009 5:41 pm
Location: W. MA

Re: There is no domain decomposition for n ranks error

Postby JimboPalmer » Wed Apr 01, 2020 10:29 pm

Joe_H wrote:The project number for the one that failed should be in one of theology files.


Auto correct gone berserk? "One of the log files"? It happens to all of us!
JimboPalmer
 
Posts: 2041
Joined: Mon Feb 16, 2009 5:12 am
Location: Greenwood MS USA

Re: There is no domain decomposition for n ranks error

Postby Joe_H » Wed Apr 01, 2020 11:49 pm

JimboPalmer wrote:Auto correct gone berserk? "One of the log files"? It happens to all of us!

Yes, sometimes I consider turning it off, but most of the time it at least tags my thumb fingered typing so I see the error and fix it.
Joe_H
Site Admin
 
Posts: 6608
Joined: Tue Apr 21, 2009 5:41 pm
Location: W. MA

