Checking on my machine, I noticed it was stuck in a loop, repeatedly trying and failing to start the following work unit. I pasted the logs below. The CPU is a Ryzen 5 1600 that is also feeding a GPU. I did not set the thread count manually; it is set to -1 so the client decides automatically, and the client configures the slot to 11 threads (12 minus 1 for the GPU), which it then always reduces to 10. However, this log seems to show that 10 threads is incompatible with this project.
Manually changing the thread count to 8 makes this WU run fine. I just wanted to report that the WU was assigned to a system configured for a thread count it is apparently incompatible with.
At the time the WU was downloaded, your slot was configured for 11 threads. That's a number FAHCore_a7 has trouble with because it's prime. The client recognizes this problem, issues the message "Reducing thread count from 11 to 10 to avoid domain decomposition by a prime number > 3", and reduces the count to 10 threads. Unfortunately, 10 is also a "bad" number of threads (it contains the prime factor 5), so I would expect to see a second message saying "Reducing thread count from 10 to 9 to avoid domain decomposition by a prime number > 3". That didn't happen, and I'm not sure why not. Apparently that's a bug.
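In case it helps whoever picks up the ticket, here's a minimal sketch of the reduction logic as I understand it from the log messages, assuming any thread count with a prime factor greater than 3 is treated as unsafe. The function names are illustrative, not actual FAHClient internals:

```python
def largest_prime_factor(n: int) -> int:
    """Return the largest prime factor of n (n >= 2)."""
    factor, largest = 2, 1
    while factor * factor <= n:
        while n % factor == 0:
            largest, n = factor, n // factor
        factor += 1
    return max(largest, n) if n > 1 else largest

def safe_thread_count(requested: int) -> int:
    """Reduce the count until no prime factor exceeds 3."""
    count = requested
    while count > 3 and largest_prime_factor(count) > 3:
        count -= 1  # 11 -> 10 -> 9 when run to completion
    return count

print(safe_thread_count(11))  # 9; 10 = 2 * 5 still contains the prime 5
```

Run as a loop like this, 11 would drop through 10 to 9. The single message in the log suggests the client performs only one reduction step, which would explain the bug.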
Nevertheless, you can avoid this problem by manually reducing the number of threads allocated to that slot using FAHControl. If you really want to use all of your CPU threads, you can add a second CPU slot with 2 or more threads (keeping the total at 11).
I'm also concerned about all the other people who are encountering the same problem so I'll open a ticket and see if somebody will fix it for them.
22:39:20:WU00:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
22:39:20:WU00:FS00:0xa7:ERROR:Source code file: C:\build\fah\core-a7-avx-release\windows-10-64bit-core-a7-avx-release\gromacs-core\build\gromacs\src\gromacs\mdlib\domdec.c, line: 6902
22:39:20:WU00:FS00:0xa7:ERROR:
22:39:20:WU00:FS00:0xa7:ERROR:Fatal error:
22:39:20:WU00:FS00:0xa7:ERROR:There is no domain decomposition for 16 ranks that is compatible with the given box and a minimum cell size of 1.4227 nm
I run 21 cores; it normally gets decomposed into 16 PP and 5 PME.
I've not seen any project fail with 2^4 yet.
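To make the fatal error above concrete: with 21 threads split as 5 PME + 16 PP, GROMACS has to arrange the 16 PP ranks into an nx x ny x nz grid while keeping every cell at least 1.4227 nm wide in every dimension. Here's a rough sketch of that feasibility check; the box dimensions are made up for illustration, since the real ones live in the WU rather than the log excerpt:

```python
from itertools import product

def feasible_grids(nranks, box, min_cell):
    """Yield (nx, ny, nz) grids whose cells all meet the minimum size."""
    for grid in product(range(1, nranks + 1), repeat=3):
        nx, ny, nz = grid
        if nx * ny * nz != nranks:
            continue
        if all(dim / n >= min_cell for dim, n in zip(box, grid)):
            yield grid

box = (5.2, 5.2, 5.2)  # hypothetical box edge lengths in nm
print(list(feasible_grids(16, box, 1.4227)))  # [] -> the fatal error above
```

With a box like that, no dimension can be cut into more than 3 cells, and 16 cannot be written as a product of three factors that are all 3 or less, so no grid is feasible.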
It gets tricky when GROMACS decides to break up 21 into 5 PME + 16 PP. A different project might break it up into 6 PME + 15 PP and the results might be different.
bruce wrote: It gets tricky when GROMACS decides to break up 21 into 5 PME + 16 PP. A different project might break it up into 6 PME + 15 PP and the results might be different.
I thought this was the normal allocation though?
This is the first project that didn't want to work with 21 cores.
The allocation of PP versus PME threads appears to depend on the dimensions of the bounding box, judging from the analysis that was posted here a few weeks ago. As I understand it, it is related to the minimum thickness of a "slice".
For example, WUs from two different projects might both have a volume of 60 cubic units (one unit being a length no shorter than the minimum). One WU could be in a bounding box that is 3x4x5 and the other 2x5x6. The decompositions would be different, and that might lead to a different number of PME threads, which handle the long-range part of the interactions.
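Here's that example as a quick sketch, using the same feasibility check as above, treating one unit as the minimum slice thickness; the rank count of 27 is just picked to show the contrast between the two boxes:

```python
from itertools import product

def feasible_grids(nranks, box, min_cell=1.0):
    """Yield (nx, ny, nz) grids whose cells all meet the minimum size."""
    for grid in product(range(1, nranks + 1), repeat=3):
        nx, ny, nz = grid
        if nx * ny * nz == nranks and all(
                dim / n >= min_cell for dim, n in zip(box, grid)):
            yield grid

for box in [(3, 4, 5), (2, 5, 6)]:
    print(box, list(feasible_grids(27, box)))
# (3, 4, 5) admits (3, 3, 3); (2, 5, 6) admits no 27-rank grid at all,
# so the same rank count can work for one WU and fail for another.
```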
bruce wrote: It gets tricky when GROMACS decides to break up 21 into 5 PME + 16 PP. A different project might break it up into 6 PME + 15 PP and the results might be different.
I thought this was the normal allocation though?
This is the first project that didn't want to work with 21 cores.
Only the smallest class of projects won't work with 21. Project 14524 is a 3x3x3. The 16 threads used for PP need at least one of those numbers to be a 4 so that they can decompose as 4x4x1 or 4x2x2.
21 can also be split as 3 PME + 18 PP or 9 PME + 12 PP. The split chosen is based on the estimated amount of time that will be spent doing PME work for that specific work unit.
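As a sanity check on those three splits, here's a sketch that enumerates the ways 21 can be divided, assuming the PP count must factor into primes no larger than 3. The actual choice among the candidates, made from the estimated PME workload, is not modeled here:

```python
def largest_prime_factor(n: int) -> int:
    """Return the largest prime factor of n (n >= 2)."""
    factor, largest = 2, 1
    while factor * factor <= n:
        while n % factor == 0:
            largest, n = factor, n // factor
        factor += 1
    return max(largest, n) if n > 1 else largest

def candidate_splits(total: int):
    """Yield (n_pme, n_pp) pairs whose PP count decomposes cleanly."""
    for n_pme in range(1, total // 2 + 1):
        n_pp = total - n_pme
        if largest_prime_factor(n_pp) <= 3:
            yield n_pme, n_pp

print(list(candidate_splits(21)))  # [(3, 18), (5, 16), (9, 12)]
```

Under that assumption the only candidates are exactly the three splits mentioned in this thread: 3 + 18, 5 + 16, and 9 + 12.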