Bad WU, Project: 14524 (Run 168, Clone 0, Gen 8)

Moderators: Site Moderators, FAHC Science Team

HannuN
Posts: 6
Joined: Sun Mar 29, 2020 5:41 pm

Post by HannuN »

I hope all required info is here.
This is the only WU that has caused trouble so far.
Using an i7-4770K (and an Asus-made GTX 2060). No overclocking, a very stable system.

17:19:13:WU00:FS00:0xa7:Project: 14524 (Run 168, Clone 0, Gen 8)
17:19:13:WU00:FS00:0xa7:Unit: 0x0000001380fccb0a5e459bb50d6a242c
17:19:13:WU00:FS00:0xa7:Reading tar file core.xml
17:19:13:WU00:FS00:0xa7:Reading tar file frame8.tpr
17:19:13:WU00:FS00:0xa7:Digital signatures verified
17:19:13:WU00:FS00:0xa7:Reducing thread count from 11 to 10 to avoid domain decomposition by a prime number > 3
17:19:13:WU00:FS00:0xa7:Calling: mdrun -s frame8.tpr -o frame8.trr -x frame8.xtc -cpt 15 -nt 10
17:19:13:WU00:FS00:0xa7:Steps: first=2000000 total=250000
17:19:13:WU00:FS00:0xa7:ERROR:
17:19:13:WU00:FS00:0xa7:ERROR:-------------------------------------------------------
17:19:13:WU00:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
17:19:13:WU00:FS00:0xa7:ERROR:Source code file: /host/debian-stable-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/domdec.c, line: 6902
17:19:13:WU00:FS00:0xa7:ERROR:
17:19:13:WU00:FS00:0xa7:ERROR:Fatal error:
17:19:13:WU00:FS00:0xa7:ERROR:There is no domain decomposition for 10 ranks that is compatible with the given box and a minimum cell size of 1.4227 nm
17:19:13:WU00:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
17:19:13:WU00:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
17:19:13:WU00:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
17:19:13:WU00:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
17:19:13:WU00:FS00:0xa7:ERROR:-------------------------------------------------------
17:19:18:WU00:FS00:0xa7:WARNING:Unexpected exit() call
17:19:18:WU00:FS00:0xa7:WARNING:Unexpected exit from science code
17:19:18:WU00:FS00:0xa7:Saving result file ../logfile_01.txt
17:19:18:WU00:FS00:0xa7:Saving result file md.log
17:19:18:WU00:FS00:0xa7:Saving result file science.log
17:19:18:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
/Hannu
davidcoton
Posts: 1102
Joined: Wed Nov 05, 2008 3:19 pm
Location: Cambridge, UK

Re: Bad WU, Project: 14524 (Run 168, Clone 0, Gen 8)

Post by davidcoton »

Thank you for your report, info passed to the project team.
HannuN
Posts: 6
Joined: Sun Mar 29, 2020 5:41 pm

Re: Bad WU, Project: 14524 (Run 168, Clone 0, Gen 8)

Post by HannuN »

One more similar failure, from the same project:

20:01:18:WU02:FS00:0xa7:Project: 14524 (Run 298, Clone 3, Gen 3)
20:01:18:WU02:FS00:0xa7:Unit: 0x0000000680fccb0a5e781c150f2a5daf
20:01:18:WU02:FS00:0xa7:Reading tar file core.xml
20:01:18:WU02:FS00:0xa7:Reading tar file frame3.tpr
20:01:18:WU02:FS00:0xa7:Digital signatures verified
20:01:18:WU02:FS00:0xa7:Reducing thread count from 11 to 10 to avoid domain decomposition by a prime number > 3
20:01:18:WU02:FS00:0xa7:Calling: mdrun -s frame3.tpr -o frame3.trr -x frame3.xtc -cpt 15 -nt 10
20:01:18:WU02:FS00:0xa7:Steps: first=750000 total=250000
20:01:18:WU02:FS00:0xa7:ERROR:
20:01:18:WU02:FS00:0xa7:ERROR:-------------------------------------------------------
20:01:18:WU02:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
20:01:18:WU02:FS00:0xa7:ERROR:Source code file: /host/debian-stable-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/domdec.c, line: 6902
20:01:18:WU02:FS00:0xa7:ERROR:
20:01:18:WU02:FS00:0xa7:ERROR:Fatal error:
20:01:18:WU02:FS00:0xa7:ERROR:There is no domain decomposition for 10 ranks that is compatible with the given box and a minimum cell size of 1.4227 nm
20:01:18:WU02:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
20:01:18:WU02:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
20:01:18:WU02:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
20:01:18:WU02:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
20:01:18:WU02:FS00:0xa7:ERROR:-------------------------------------------------------
20:01:23:WU02:FS00:0xa7:WARNING:Unexpected exit() call
20:01:23:WU02:FS00:0xa7:WARNING:Unexpected exit from science code
20:01:23:WU02:FS00:0xa7:Saving result file ../logfile_01.txt
20:01:23:WU02:FS00:0xa7:Saving result file md.log
20:01:23:WU02:FS00:0xa7:Saving result file science.log
20:01:23:WU02:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
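
In both logs the core drops from 11 to 10 threads "to avoid domain decomposition by a prime number > 3", yet 10 = 2 × 5 still contains the prime factor 5, and GROMACS then finds no decomposition for 10 ranks. A rough sketch of how one might list thread counts that have no prime factor above 3 is below. The function names are hypothetical, and this is not the FahCore's or GROMACS's actual selection logic. Whether a given count works also depends on the box size and the minimum cell size (1.4227 nm here), so a count from this list is only a candidate to try, not a guaranteed fix.

```python
def largest_prime_factor(n: int) -> int:
    """Return the largest prime factor of n; returns 1 for n == 1."""
    largest = 1
    factor = 2
    while factor * factor <= n:
        while n % factor == 0:
            largest = factor
            n //= factor
        factor += 1
    if n > 1:
        # Whatever remains after trial division is itself prime.
        largest = n
    return largest


def smooth_thread_counts(max_threads: int, max_prime: int = 3) -> list[int]:
    """List thread counts <= max_threads with no prime factor above max_prime.

    Assumption: counts built only from small primes are easier to split
    into a grid of domains. This is a heuristic, not the real FahCore rule.
    """
    return [
        n
        for n in range(max_threads, 0, -1)
        if largest_prime_factor(n) <= max_prime
    ]


if __name__ == "__main__":
    # 11 is the thread count the logs start from.
    print(largest_prime_factor(10))   # 10 still has a prime factor above 3
    print(smooth_thread_counts(11))
```

For the 11 threads seen in the logs, the candidates come out as 9, 8, 6, 4, 3, 2 and 1.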