Core a7 failing overnight with gromacs error

Moderators: Site Moderators, FAHC Science Team

Forcinghavok
Posts: 130
Joined: Wed Feb 06, 2013 4:46 pm

0xa7 fah error

Post by Forcinghavok »

Code:

*********************** Log Started 2016-11-14T02:31:26Z ***********************
02:31:26:************************* Folding@home Client *************************
02:31:26:      Website: http://folding.stanford.edu/
02:31:26:    Copyright: (c) 2009-2014 Stanford University
02:31:26:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
02:31:26:         Args: 
02:31:26:       Config: C:/Users/erik/AppData/Roaming/FAHClient/config.xml
02:31:26:******************************** Build ********************************
02:31:26:      Version: 7.4.4
02:31:26:         Date: Mar 4 2014
02:31:26:         Time: 20:26:54
02:31:26:      SVN Rev: 4130
02:31:26:       Branch: fah/trunk/client
02:31:26:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
02:31:26:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
02:31:26:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
02:31:26:     Platform: win32 XP
02:31:26:         Bits: 32
02:31:26:         Mode: Release
02:31:26:******************************* System ********************************
02:31:26:          CPU: AMD FX(tm)-8350 Eight-Core Processor
02:31:26:       CPU ID: AuthenticAMD Family 21 Model 2 Stepping 0
02:31:26:         CPUs: 8
02:31:26:       Memory: 15.90GiB
02:31:26:  Free Memory: 14.14GiB
02:31:26:      Threads: WINDOWS_THREADS
02:31:26:   OS Version: 6.1
02:31:26:  Has Battery: false
02:31:26:   On Battery: false
02:31:26:   UTC Offset: -8
02:31:26:          PID: 2700
02:31:26:          CWD: C:/Users/erik/AppData/Roaming/FAHClient
02:31:26:           OS: Windows 7 Ultimate
02:31:26:      OS Arch: AMD64
02:31:26:         GPUs: 1
02:31:26:        GPU 0: ATI:5 Ellesmere XT [Radeon RX 480]
02:31:26:         CUDA: Not detected
02:31:26:Win32 Service: false
02:31:26:***********************************************************************
02:31:26:<config>
02:31:26:  <!-- Folding Core -->
02:31:26:  <checkpoint v='3'/>
02:31:26:
02:31:26:  <!-- Network -->
02:31:26:  <proxy v=':8080'/>
02:31:26:
02:31:26:  <!-- User Information -->
02:31:26:  <team v='2092'/>
02:31:26:  <user v='forcinghavok'/>
02:31:26:
02:31:26:  <!-- Folding Slots -->
02:31:26:  <slot id='0' type='CPU'/>
02:31:26:  <slot id='1' type='GPU'/>
02:31:26:</config>
02:31:26:Trying to access database...
02:31:26:Successfully acquired database lock
02:31:26:Enabled folding slot 00: READY cpu:6
02:31:26:Enabled folding slot 01: READY gpu:0:Ellesmere XT [Radeon RX 480]
<snip>
03:20:21:WU01:FS00:Requesting new work unit for slot 00: RUNNING cpu:6 from 171.64.65.41
03:20:21:WU01:FS00:Connecting to 171.64.65.41:8080
03:20:21:WU01:FS00:Downloading 21.03MiB
03:20:27:WU01:FS00:Download 45.48%
03:20:33:WU01:FS00:Download 98.99%
03:20:33:WU01:FS00:Download complete
03:20:33:WU01:FS00:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:11920 run:872 clone:5 gen:24 core:0xa7 unit:0x0000001cab4041295809c4d73c704df8
03:22:17:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/erik/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/AVX/Core_a7.fah/FahCore_a7.exe -dir 01 -suffix 01 -version 704 -lifeline 2700 -checkpoint 3 -np 6
03:22:17:WU01:FS00:Started FahCore on PID 2236
03:22:17:WU01:FS00:Core PID:4996
03:22:17:WU01:FS00:FahCore 0xa7 started
03:22:18:WU01:FS00:0xa7:*********************** Log Started 2016-11-14T03:22:17Z ***********************
03:22:18:WU01:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
03:22:18:WU01:FS00:0xa7:       Type: 0xa7
03:22:18:WU01:FS00:0xa7:       Core: Gromacs
03:22:18:WU01:FS00:0xa7:    Website: http://folding.stanford.edu/
03:22:18:WU01:FS00:0xa7:  Copyright: (c) 2009-2016 Stanford University
03:22:18:WU01:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
03:22:18:WU01:FS00:0xa7:       Args: -dir 01 -suffix 01 -version 704 -lifeline 2236 -checkpoint 3 -np 6
03:22:18:WU01:FS00:0xa7:     Config: <none>
03:22:18:WU01:FS00:0xa7:************************************ Build *************************************
03:22:18:WU01:FS00:0xa7:    Version: 0.0.11
03:22:18:WU01:FS00:0xa7:       Date: Sep 21 2016
03:22:18:WU01:FS00:0xa7:       Time: 01:43:48
03:22:18:WU01:FS00:0xa7: Repository: Git
03:22:18:WU01:FS00:0xa7:   Revision: 957bd90e68d95ddcf1594dc15ff6c64cc4555146
03:22:18:WU01:FS00:0xa7:     Branch: master
03:22:18:WU01:FS00:0xa7:   Compiler: GNU 4.2.1 Compatible Clang 3.9.0 (trunk 274080)
03:22:18:WU01:FS00:0xa7:    Options: -std=gnu++98 -O3 -funroll-loops -ffast-math -mfpmath=sse
03:22:18:WU01:FS00:0xa7:             -fno-unsafe-math-optimizations -msse2 -I/mingw64/include
03:22:18:WU01:FS00:0xa7:             -Wno-inconsistent-dllimport -Wno-parentheses-equality
03:22:18:WU01:FS00:0xa7:             -Wno-deprecated-register -Wno-unused-local-typedef
03:22:18:WU01:FS00:0xa7:   Platform: linux2 4.6.0-1-amd64
03:22:18:WU01:FS00:0xa7:       Bits: 64
03:22:18:WU01:FS00:0xa7:       Mode: Release
03:22:18:WU01:FS00:0xa7:       SIMD: avx_256
03:22:18:WU01:FS00:0xa7:************************************ System ************************************
03:22:18:WU01:FS00:0xa7:        CPU: AMD FX(tm)-8350 Eight-Core Processor
03:22:18:WU01:FS00:0xa7:     CPU ID: AuthenticAMD Family 21 Model 2 Stepping 0
03:22:18:WU01:FS00:0xa7:       CPUs: 8
03:22:18:WU01:FS00:0xa7:     Memory: 15.90GiB
03:22:18:WU01:FS00:0xa7:Free Memory: 13.45GiB
03:22:18:WU01:FS00:0xa7:    Threads: WINDOWS_THREADS
03:22:18:WU01:FS00:0xa7: OS Version: 6.1
03:22:18:WU01:FS00:0xa7:Has Battery: false
03:22:18:WU01:FS00:0xa7: On Battery: false
03:22:18:WU01:FS00:0xa7: UTC Offset: -8
03:22:18:WU01:FS00:0xa7:        PID: 4996
03:22:18:WU01:FS00:0xa7:        CWD: C:\Users\erik\AppData\Roaming\FAHClient\work
03:22:18:WU01:FS00:0xa7:         OS: Windows 7 Ultimate Service Pack 1
03:22:18:WU01:FS00:0xa7:    OS Arch: AMD64
03:22:18:WU01:FS00:0xa7:********************************************************************************
03:22:18:WU01:FS00:0xa7:Project: 11920 (Run 872, Clone 5, Gen 24)
03:22:18:WU01:FS00:0xa7:Unit: 0x0000001cab4041295809c4d73c704df8
03:22:18:WU01:FS00:0xa7:Reading tar file core.xml
03:22:18:WU01:FS00:0xa7:Reading tar file frame24.tpr
03:22:18:WU01:FS00:0xa7:Digital signatures verified
03:22:18:WU01:FS00:0xa7:Calling: mdrun -s frame24.tpr -o frame24.trr -cpt 3 -nt 6
03:22:19:WU01:FS00:0xa7:Steps: first=1920000 total=80000
03:22:22:WU01:FS00:0xa7:Completed 1 out of 80000 steps (0%)
03:24:06:WU01:FS00:0xa7:Completed 800 out of 80000 steps (1%)
03:25:50:WU01:FS00:0xa7:Completed 1600 out of 80000 steps (2%)
03:27:33:WU01:FS00:0xa7:Completed 2400 out of 80000 steps (3%)
03:29:18:WU01:FS00:0xa7:Completed 3200 out of 80000 steps (4%)
03:31:01:WU01:FS00:0xa7:Completed 4000 out of 80000 steps (5%)
03:32:45:WU01:FS00:0xa7:Completed 4800 out of 80000 steps (6%)
03:34:30:WU01:FS00:0xa7:Completed 5600 out of 80000 steps (7%)
03:36:14:WU01:FS00:0xa7:Completed 6400 out of 80000 steps (8%)
03:37:58:WU01:FS00:0xa7:Completed 7200 out of 80000 steps (9%)
03:39:43:WU01:FS00:0xa7:Completed 8000 out of 80000 steps (10%)
03:41:27:WU01:FS00:0xa7:Completed 8800 out of 80000 steps (11%)
03:43:10:WU01:FS00:0xa7:Completed 9600 out of 80000 steps (12%)
03:44:55:WU01:FS00:0xa7:Completed 10400 out of 80000 steps (13%)
03:46:39:WU01:FS00:0xa7:Completed 11200 out of 80000 steps (14%)
03:48:23:WU01:FS00:0xa7:Completed 12000 out of 80000 steps (15%)
03:50:08:WU01:FS00:0xa7:Completed 12800 out of 80000 steps (16%)
03:51:51:WU01:FS00:0xa7:Completed 13600 out of 80000 steps (17%)
03:53:36:WU01:FS00:0xa7:Completed 14400 out of 80000 steps (18%)
03:55:19:WU01:FS00:0xa7:Completed 15200 out of 80000 steps (19%)
03:57:04:WU01:FS00:0xa7:Completed 16000 out of 80000 steps (20%)
03:58:48:WU01:FS00:0xa7:Completed 16800 out of 80000 steps (21%)
04:00:32:WU01:FS00:0xa7:Completed 17600 out of 80000 steps (22%)
04:02:16:WU01:FS00:0xa7:Completed 18400 out of 80000 steps (23%)
04:04:00:WU01:FS00:0xa7:Completed 19200 out of 80000 steps (24%)
04:05:44:WU01:FS00:0xa7:Completed 20000 out of 80000 steps (25%)
04:07:29:WU01:FS00:0xa7:Completed 20800 out of 80000 steps (26%)
04:09:13:WU01:FS00:0xa7:Completed 21600 out of 80000 steps (27%)
04:10:57:WU01:FS00:0xa7:Completed 22400 out of 80000 steps (28%)
04:12:42:WU01:FS00:0xa7:Completed 23200 out of 80000 steps (29%)
04:14:27:WU01:FS00:0xa7:Completed 24000 out of 80000 steps (30%)
04:16:10:WU01:FS00:0xa7:Completed 24800 out of 80000 steps (31%)
04:17:55:WU01:FS00:0xa7:Completed 25600 out of 80000 steps (32%)
04:19:39:WU01:FS00:0xa7:Completed 26400 out of 80000 steps (33%)
04:21:23:WU01:FS00:0xa7:Completed 27200 out of 80000 steps (34%)
04:23:07:WU01:FS00:0xa7:Completed 28000 out of 80000 steps (35%)
04:24:51:WU01:FS00:0xa7:Completed 28800 out of 80000 steps (36%)
04:26:36:WU01:FS00:0xa7:Completed 29600 out of 80000 steps (37%)
04:28:19:WU01:FS00:0xa7:Completed 30400 out of 80000 steps (38%)
04:30:04:WU01:FS00:0xa7:Completed 31200 out of 80000 steps (39%)
04:31:48:WU01:FS00:0xa7:Completed 32000 out of 80000 steps (40%)
04:33:32:WU01:FS00:0xa7:Completed 32800 out of 80000 steps (41%)
04:34:57:WU01:FS00:0xa7:ERROR:
04:34:57:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:34:57:WU01:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20160919-669094a-unknown
04:34:57:WU01:FS00:0xa7:ERROR:Source code file: /host/windows-cross-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/pme.c, line: 754
04:34:57:WU01:FS00:0xa7:ERROR:
04:34:57:WU01:FS00:0xa7:ERROR:Fatal error:
04:34:57:WU01:FS00:0xa7:ERROR:28807 particles communicated to PME rank 3 are more than 2/3 times the cut-off out of the domain decomposition cell of their charge group in dimension x.
04:34:57:WU01:FS00:0xa7:ERROR:This usually means that your system is not well equilibrated.
04:34:57:WU01:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
04:34:57:WU01:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
04:34:57:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:34:57:WU01:FS00:0xa7:ERROR:
04:34:57:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:34:57:WU01:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20160919-669094a-unknown
04:34:57:WU01:FS00:0xa7:ERROR:Source code file: /host/windows-cross-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/pme.c, line: 754
04:34:57:WU01:FS00:0xa7:ERROR:
04:34:57:WU01:FS00:0xa7:ERROR:Fatal error:
04:34:57:WU01:FS00:0xa7:ERROR:28602 particles communicated to PME rank 2 are more than 2/3 times the cut-off out of the domain decomposition cell of their charge group in dimension x.
04:34:57:WU01:FS00:0xa7:ERROR:This usually means that your system is not well equilibrated.
04:34:57:WU01:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
04:34:57:WU01:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
04:34:57:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:34:57:WU01:FS00:0xa7:ERROR:
04:34:57:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:34:57:WU01:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20160919-669094a-unknown
04:34:57:WU01:FS00:0xa7:ERROR:Source code file: /host/windows-cross-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/pme.c, line: 754
04:34:57:WU01:FS00:0xa7:ERROR:
04:34:57:WU01:FS00:0xa7:ERROR:Fatal error:
04:34:57:WU01:FS00:0xa7:ERROR:28629 particles communicated to PME rank 5 are more than 2/3 times the cut-off out of the domain decomposition cell of their charge group in dimension x.
04:34:57:WU01:FS00:0xa7:ERROR:This usually means that your system is not well equilibrated.
04:34:57:WU01:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
04:34:57:WU01:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
04:34:57:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:34:57:WU01:FS00:0xa7:ERROR:
04:34:57:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:34:57:WU01:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20160919-669094a-unknown
04:34:57:WU01:FS00:0xa7:ERROR:Source code file: /host/windows-cross-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/pme.c, line: 754
04:34:57:WU01:FS00:0xa7:ERROR:
04:34:57:WU01:FS00:0xa7:ERROR:Fatal error:
04:34:57:WU01:FS00:0xa7:ERROR:28547 particles communicated to PME rank 0 are more than 2/3 times the cut-off out of the domain decomposition cell of their charge group in dimension x.
04:34:57:WU01:FS00:0xa7:ERROR:This usually means that your system is not well equilibrated.
04:34:57:WU01:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
04:34:57:WU01:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
04:34:57:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:34:57:WU01:FS00:0xa7:ERROR:
04:34:57:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:34:57:WU01:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20160919-669094a-unknown
04:34:57:WU01:FS00:0xa7:ERROR:Source code file: /host/windows-cross-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/pme.c, line: 754
04:34:57:WU01:FS00:0xa7:ERROR:
04:34:57:WU01:FS00:0xa7:ERROR:Fatal error:
04:34:57:WU01:FS00:0xa7:ERROR:28600 particles communicated to PME rank 1 are more than 2/3 times the cut-off out of the domain decomposition cell of their charge group in dimension x.
04:34:57:WU01:FS00:0xa7:ERROR:This usually means that your system is not well equilibrated.
04:34:57:WU01:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
04:34:57:WU01:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
04:34:57:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
04:34:57:WU01:FS00:0xa7:WARNING:Unexpected exit() call
05:45:55:WARNING:WU01:FS00:FahCore returned: FAILED_3 (255 = 0xff)
05:45:55:WU01:FS00:Starting
05:45:55:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/erik/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/AVX/Core_a7.fah/FahCore_a7.exe -dir 01 -suffix 01 -version 704 -lifeline 2700 -checkpoint 3 -np 6
05:45:55:WU01:FS00:Started FahCore on PID 2044
05:45:55:WU01:FS00:Core PID:1332
05:45:55:WU01:FS00:FahCore 0xa7 started
05:45:55:WU01:FS00:0xa7:*********************** Log Started 2016-11-14T05:45:55Z ***********************
05:45:55:WU01:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
05:45:55:WU01:FS00:0xa7:       Type: 0xa7
05:45:55:WU01:FS00:0xa7:       Core: Gromacs
05:45:55:WU01:FS00:0xa7:    Website: http://folding.stanford.edu/
05:45:55:WU01:FS00:0xa7:  Copyright: (c) 2009-2016 Stanford University
05:45:55:WU01:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
05:45:55:WU01:FS00:0xa7:       Args: -dir 01 -suffix 01 -version 704 -lifeline 2044 -checkpoint 3 -np 6
05:45:55:WU01:FS00:0xa7:     Config: <none>
05:45:55:WU01:FS00:0xa7:************************************ Build *************************************
05:45:55:WU01:FS00:0xa7:    Version: 0.0.11
05:45:55:WU01:FS00:0xa7:       Date: Sep 21 2016
05:45:55:WU01:FS00:0xa7:       Time: 01:43:48
05:45:55:WU01:FS00:0xa7: Repository: Git
05:45:55:WU01:FS00:0xa7:   Revision: 957bd90e68d95ddcf1594dc15ff6c64cc4555146
05:45:55:WU01:FS00:0xa7:     Branch: master
05:45:55:WU01:FS00:0xa7:   Compiler: GNU 4.2.1 Compatible Clang 3.9.0 (trunk 274080)
05:45:55:WU01:FS00:0xa7:    Options: -std=gnu++98 -O3 -funroll-loops -ffast-math -mfpmath=sse
05:45:55:WU01:FS00:0xa7:             -fno-unsafe-math-optimizations -msse2 -I/mingw64/include
05:45:55:WU01:FS00:0xa7:             -Wno-inconsistent-dllimport -Wno-parentheses-equality
05:45:55:WU01:FS00:0xa7:             -Wno-deprecated-register -Wno-unused-local-typedef
05:45:55:WU01:FS00:0xa7:   Platform: linux2 4.6.0-1-amd64
05:45:55:WU01:FS00:0xa7:       Bits: 64
05:45:55:WU01:FS00:0xa7:       Mode: Release
05:45:55:WU01:FS00:0xa7:       SIMD: avx_256
05:45:55:WU01:FS00:0xa7:************************************ System ************************************
05:45:55:WU01:FS00:0xa7:        CPU: AMD FX(tm)-8350 Eight-Core Processor
05:45:55:WU01:FS00:0xa7:     CPU ID: AuthenticAMD Family 21 Model 2 Stepping 0
05:45:55:WU01:FS00:0xa7:       CPUs: 8
05:45:55:WU01:FS00:0xa7:     Memory: 15.90GiB
05:45:55:WU01:FS00:0xa7:Free Memory: 13.39GiB
05:45:55:WU01:FS00:0xa7:    Threads: WINDOWS_THREADS
05:45:55:WU01:FS00:0xa7: OS Version: 6.1
05:45:55:WU01:FS00:0xa7:Has Battery: false
05:45:55:WU01:FS00:0xa7: On Battery: false
05:45:55:WU01:FS00:0xa7: UTC Offset: -8
05:45:55:WU01:FS00:0xa7:        PID: 1332
05:45:55:WU01:FS00:0xa7:        CWD: C:\Users\erik\AppData\Roaming\FAHClient\work
05:45:55:WU01:FS00:0xa7:         OS: Windows 7 Ultimate Service Pack 1
05:45:55:WU01:FS00:0xa7:    OS Arch: AMD64
05:45:55:WU01:FS00:0xa7:********************************************************************************
05:45:55:WU01:FS00:0xa7:Project: 11920 (Run 872, Clone 5, Gen 24)
05:45:55:WU01:FS00:0xa7:Unit: 0x0000001cab4041295809c4d73c704df8
05:45:55:WU01:FS00:0xa7:Digital signatures verified
05:45:55:WU01:FS00:0xa7:Calling: mdrun -s frame24.tpr -o frame24.trr -cpi state.cpt -cpt 3 -nt 6
05:45:57:WU01:FS00:0xa7:Steps: first=1920000 total=80000
05:45:59:WU01:FS00:0xa7:Completed 33201 out of 80000 steps (41%)
05:46:51:WU01:FS00:0xa7:Completed 33600 out of 80000 steps (42%)
Last edited by bruce on Mon Nov 14, 2016 9:30 pm, edited 1 time in total.
Reason: Added CODE tags and filtered log for WU01:FS00
jesanfafon
Posts: 9
Joined: Wed Sep 14, 2016 3:01 pm

Core a7 failing overnight with gromacs error

Post by jesanfafon »

When I leave my computer folding overnight, most mornings I return to find a Windows message box saying "core a7 has stopped working". If I allow Folding to resume, the work unit completes without error.


My log file says this:

Code:

04:55:59:WU00:FS00:0xa7:Completed 63200 out of 80000 steps (79%)
04:56:45:WU00:FS00:0xa7:Completed 64000 out of 80000 steps (80%)
04:57:30:WU00:FS00:0xa7:Completed 64800 out of 80000 steps (81%)
04:58:15:WU00:FS00:0xa7:Completed 65600 out of 80000 steps (82%)
04:58:30:WU00:FS00:0xa7:ERROR:
04:58:30:WU00:FS00:0xa7:ERROR:-------------------------------------------------------
04:58:30:WU00:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20160919-669094a-unknown
04:58:30:WU00:FS00:0xa7:ERROR:Source code file: /host/windows-cross-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/pme.c, line: 754
04:58:30:WU00:FS00:0xa7:ERROR:
04:58:30:WU00:FS00:0xa7:ERROR:Fatal error:
04:58:30:WU00:FS00:0xa7:ERROR:1 particles communicated to PME rank 0 are more than 2/3 times the cut-off out of the domain decomposition cell of their charge group in dimension x.
04:58:30:WU00:FS00:0xa7:ERROR:This usually means that your system is not well equilibrated.
04:58:30:WU00:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
04:58:30:WU00:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
04:58:30:WU00:FS00:0xa7:ERROR:-------------------------------------------------------
04:58:30:WU00:FS00:0xa7:ERROR:
04:58:30:WU00:FS00:0xa7:ERROR:-------------------------------------------------------
04:58:30:WU00:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20160919-669094a-unknown
04:58:30:WU00:FS00:0xa7:ERROR:Source code file: /host/windows-cross-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/pme.c, line: 754
04:58:30:WU00:FS00:0xa7:ERROR:
04:58:30:WU00:FS00:0xa7:ERROR:Fatal error:
04:58:30:WU00:FS00:0xa7:ERROR:1 particles communicated to PME rank 1 are more than 2/3 times the cut-off out of the domain decomposition cell of their charge group in dimension x.
04:58:30:WU00:FS00:0xa7:ERROR:This usually means that your system is not well equilibrated.
04:58:30:WU00:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
04:58:30:WU00:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
04:58:30:WU00:FS00:0xa7:ERROR:-------------------------------------------------------
******************************* Date: 2016-11-17 *******************************
******************************* Date: 2016-11-17 *******************************
13:39:26:WARNING:WU00:FS00:FahCore returned: FAILED_3 (255 = 0xff)
I believe the FAILED_3 does not get logged until I press the "close" button on the Windows dialog.

I'm worried this is an issue with the stability of my processor. Could it be? It has passed 24-hour stress tests, but I've never had this issue with Folding@home before.
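
One way to check the guess about FAILED_3 is to compare timestamps: if the core's last ERROR line and the client's "FahCore returned" line are hours apart, the client was probably waiting on the crash dialog. The sketch below is only an illustration. The helper names (`stamp_seconds`, `error_to_exit_delay`) are made up for this example, and it assumes every relevant line starts with an `HH:MM:SS:` stamp and that the exit is logged less than 24 hours after the error.

```python
import re
from datetime import timedelta

# Matches the leading HH:MM:SS stamp the client puts on each log line.
STAMP = re.compile(r"^(\d{2}):(\d{2}):(\d{2}):")


def stamp_seconds(line):
    """Return seconds since midnight for a log line, or None if it has no stamp."""
    m = STAMP.match(line)
    if not m:
        return None
    hours, minutes, seconds = (int(x) for x in m.groups())
    return hours * 3600 + minutes * 60 + seconds


def error_to_exit_delay(lines):
    """Time between the last core ERROR line and the next 'FahCore returned' line."""
    last_error = None
    for line in lines:
        t = stamp_seconds(line)
        if t is None:
            continue  # e.g. the "***** Date: ... *****" separator lines
        if ":ERROR:" in line:
            last_error = t
        elif "FahCore returned" in line and last_error is not None:
            # Stamps carry no date, so wrap around midnight (assumes a gap under 24 h).
            return timedelta(seconds=(t - last_error) % 86400)
    return None


# Lines copied from the log excerpt above.
sample = [
    "04:58:15:WU00:FS00:0xa7:Completed 65600 out of 80000 steps (82%)",
    "04:58:30:WU00:FS00:0xa7:ERROR:Fatal error:",
    "04:58:30:WU00:FS00:0xa7:ERROR:-------------------------------------------------------",
    "******************************* Date: 2016-11-17 *******************************",
    "13:39:26:WARNING:WU00:FS00:FahCore returned: FAILED_3 (255 = 0xff)",
]
print(error_to_exit_delay(sample))  # prints 8:40:56
```

For this excerpt the gap is 8 hours 40 minutes, which fits the idea that the return code is only written once the dialog is dismissed.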
Joe_H
Site Admin
Posts: 7856
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Core a7 failing overnight with gromacs error

Post by Joe_H »

I merged two reports of what appears to be the same error occurring with Core_A7 WUs.

@jesanfafon - Could you post the first 100 or so lines of your log to show the system and folding configuration information? Please also post the beginning of the log section for the WU that generated this error, so we know which WU is involved.

You also mentioned passing stress tests. Are you overclocking your CPU? If so, note that this is a new folding core that uses different sections of the CPU, so settings that were stable with the previous Core_A4 may not be stable with Core_A7.
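
For anyone who wants to pull the relevant details out of a long log before posting, here is a rough sketch that pairs each work unit (project, run, clone, gen) with the PME ranks that reported the fatal error. The function name `summarize` and the regular expressions are assumptions based only on the line formats visible in the logs in this thread, not on any official log specification.

```python
import re

# Patterns taken from the line formats seen in the logs in this thread.
PRCG = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
PME = re.compile(r"ERROR:(\d+) particles communicated to PME rank (\d+)")


def summarize(lines):
    """Map each (project, run, clone, gen) to {PME rank: particle count} it reported."""
    report = {}
    current = None
    for line in lines:
        m = PRCG.search(line)
        if m:
            current = tuple(int(g) for g in m.groups())
            report.setdefault(current, {})
            continue
        m = PME.search(line)
        if m and current is not None:
            count, rank = int(m.group(1)), int(m.group(2))
            report[current][rank] = count
    return report


# Lines copied from the first log in this thread (error text shortened).
sample = [
    "03:22:18:WU01:FS00:0xa7:Project: 11920 (Run 872, Clone 5, Gen 24)",
    "03:22:18:WU01:FS00:0xa7:Unit: 0x0000001cab4041295809c4d73c704df8",
    "04:34:57:WU01:FS00:0xa7:ERROR:28807 particles communicated to PME rank 3 are more than 2/3 times the cut-off",
    "04:34:57:WU01:FS00:0xa7:ERROR:28602 particles communicated to PME rank 2 are more than 2/3 times the cut-off",
]
print(summarize(sample))
```

A work unit that started but logged no PME error still appears in the result with an empty dictionary, so the output also shows which units ran cleanly.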

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
jesanfafon
Posts: 9
Joined: Wed Sep 14, 2016 3:01 pm

Re: Core a7 failing overnight with gromacs error

Post by jesanfafon »

Joe_H wrote: I merged two reports of what appears to be the same error occurring with Core_A7 WUs.

@jesanfafon - Could you post the first 100 or so lines of your log to show the system and folding configuration information? Please also post the beginning of the log section for the WU that generated this error, so we know which WU is involved.

You also mentioned passing stress tests. Are you overclocking your CPU? If so, note that this is a new folding core that uses different sections of the CPU, so settings that were stable with the previous Core_A4 may not be stable with Core_A7.

I'm not running any overclock, but I did recently get a new processor, so I ran the stress tests to try to verify that my new CPU is okay.


Here's an expanded section of the log file:

Code:

*********************** Log Started 2016-11-17T01:25:14Z ***********************
03:55:43:FS00:Unpaused
03:55:43:WU00:FS00:Connecting to 171.67.108.45:8080
03:55:43:WU00:FS00:Assigned to work server 171.64.65.41
03:55:43:WU00:FS00:Requesting new work unit for slot 00: READY cpu:6 from 171.64.65.41
03:55:43:WU00:FS00:Connecting to 171.64.65.41:8080
03:55:44:WU00:FS00:Downloading 20.96MiB
03:55:50:WU00:FS00:Download 17.89%
03:55:56:WU00:FS00:Download 39.05%
03:56:02:WU00:FS00:Download 61.12%
03:56:08:WU00:FS00:Download 95.10%
03:56:08:WU00:FS00:Download complete
03:56:08:WU00:FS00:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:11920 run:650 clone:2 gen:28 core:0xa7 unit:0x00000029ab4041295809c87508ff8da5
03:56:08:WU00:FS00:Starting
03:56:08:WU00:FS00:Running FahCore: "D:\Program Files\FAHClient/FAHCoreWrapper.exe" C:\Users\Jesse\AppData\Roaming\FAHClient\cores/web.stanford.edu/~pande/Win32/AMD64/AVX/Core_a7.fah/FahCore_a7.exe -dir 00 -suffix 01 -version 704 -lifeline 12952 -checkpoint 20 -np 6
03:56:08:WU00:FS00:Started FahCore on PID 14416
03:56:08:WU00:FS00:Core PID:15312
03:56:08:WU00:FS00:FahCore 0xa7 started
03:56:09:WU00:FS00:0xa7:*********************** Log Started 2016-11-17T03:56:09Z ***********************
03:56:09:WU00:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
03:56:09:WU00:FS00:0xa7:       Type: 0xa7
03:56:09:WU00:FS00:0xa7:       Core: Gromacs
03:56:09:WU00:FS00:0xa7:    Website: http://folding.stanford.edu/
03:56:09:WU00:FS00:0xa7:  Copyright: (c) 2009-2016 Stanford University
03:56:09:WU00:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
03:56:09:WU00:FS00:0xa7:       Args: -dir 00 -suffix 01 -version 704 -lifeline 14416 -checkpoint 20 -np
03:56:09:WU00:FS00:0xa7:             6
03:56:09:WU00:FS00:0xa7:     Config: <none>
03:56:09:WU00:FS00:0xa7:************************************ Build *************************************
03:56:09:WU00:FS00:0xa7:    Version: 0.0.11
03:56:09:WU00:FS00:0xa7:       Date: Sep 21 2016
03:56:09:WU00:FS00:0xa7:       Time: 01:43:48
03:56:09:WU00:FS00:0xa7: Repository: Git
03:56:09:WU00:FS00:0xa7:   Revision: 957bd90e68d95ddcf1594dc15ff6c64cc4555146
03:56:09:WU00:FS00:0xa7:     Branch: master
03:56:09:WU00:FS00:0xa7:   Compiler: GNU 4.2.1 Compatible Clang 3.9.0 (trunk 274080)
03:56:09:WU00:FS00:0xa7:    Options: -std=gnu++98 -O3 -funroll-loops -ffast-math -mfpmath=sse
03:56:09:WU00:FS00:0xa7:             -fno-unsafe-math-optimizations -msse2 -I/mingw64/include
03:56:09:WU00:FS00:0xa7:             -Wno-inconsistent-dllimport -Wno-parentheses-equality
03:56:09:WU00:FS00:0xa7:             -Wno-deprecated-register -Wno-unused-local-typedef
03:56:09:WU00:FS00:0xa7:   Platform: linux2 4.6.0-1-amd64
03:56:09:WU00:FS00:0xa7:       Bits: 64
03:56:09:WU00:FS00:0xa7:       Mode: Release
03:56:09:WU00:FS00:0xa7:       SIMD: avx_256
03:56:09:WU00:FS00:0xa7:************************************ System ************************************
03:56:09:WU00:FS00:0xa7:        CPU: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
03:56:09:WU00:FS00:0xa7:     CPU ID: GenuineIntel Family 6 Model 94 Stepping 3
03:56:09:WU00:FS00:0xa7:       CPUs: 8
03:56:09:WU00:FS00:0xa7:     Memory: 31.96GiB
03:56:09:WU00:FS00:0xa7:Free Memory: 27.95GiB
03:56:09:WU00:FS00:0xa7:    Threads: WINDOWS_THREADS
03:56:09:WU00:FS00:0xa7: OS Version: 6.2
03:56:09:WU00:FS00:0xa7:Has Battery: false
03:56:09:WU00:FS00:0xa7: On Battery: false
03:56:09:WU00:FS00:0xa7: UTC Offset: -8
03:56:09:WU00:FS00:0xa7:        PID: 15312
03:56:09:WU00:FS00:0xa7:        CWD: C:\Users\Jesse\AppData\Roaming\FAHClient\work
03:56:09:WU00:FS00:0xa7:         OS: Windows 10 Pro Insider Preview
03:56:09:WU00:FS00:0xa7:    OS Arch: AMD64
03:56:09:WU00:FS00:0xa7:********************************************************************************
03:56:09:WU00:FS00:0xa7:Project: 11920 (Run 650, Clone 2, Gen 28)
03:56:09:WU00:FS00:0xa7:Unit: 0x00000029ab4041295809c87508ff8da5
03:56:09:WU00:FS00:0xa7:Reading tar file core.xml
03:56:09:WU00:FS00:0xa7:Reading tar file frame28.tpr
03:56:09:WU00:FS00:0xa7:Digital signatures verified
03:56:09:WU00:FS00:0xa7:Calling: mdrun -s frame28.tpr -o frame28.trr -cpt 20 -nt 6
03:56:10:WU00:FS00:0xa7:Steps: first=2240000 total=80000
03:56:12:WU00:FS00:0xa7:Completed 1 out of 80000 steps (0%)
03:56:58:WU00:FS00:0xa7:Completed 800 out of 80000 steps (1%)
03:57:43:WU00:FS00:0xa7:Completed 1600 out of 80000 steps (2%)
03:58:29:WU00:FS00:0xa7:Completed 2400 out of 80000 steps (3%)
03:59:14:WU00:FS00:0xa7:Completed 3200 out of 80000 steps (4%)
03:59:59:WU00:FS00:0xa7:Completed 4000 out of 80000 steps (5%)
04:00:45:WU00:FS00:0xa7:Completed 4800 out of 80000 steps (6%)
04:01:30:WU00:FS00:0xa7:Completed 5600 out of 80000 steps (7%)
04:02:15:WU00:FS00:0xa7:Completed 6400 out of 80000 steps (8%)
04:03:01:WU00:FS00:0xa7:Completed 7200 out of 80000 steps (9%)
04:03:46:WU00:FS00:0xa7:Completed 8000 out of 80000 steps (10%)
04:04:31:WU00:FS00:0xa7:Completed 8800 out of 80000 steps (11%)
04:05:17:WU00:FS00:0xa7:Completed 9600 out of 80000 steps (12%)
04:06:02:WU00:FS00:0xa7:Completed 10400 out of 80000 steps (13%)
04:06:47:WU00:FS00:0xa7:Completed 11200 out of 80000 steps (14%)
04:07:33:WU00:FS00:0xa7:Completed 12000 out of 80000 steps (15%)
04:08:18:WU00:FS00:0xa7:Completed 12800 out of 80000 steps (16%)
04:09:04:WU00:FS00:0xa7:Completed 13600 out of 80000 steps (17%)
04:09:50:WU00:FS00:0xa7:Completed 14400 out of 80000 steps (18%)
04:10:35:WU00:FS00:0xa7:Completed 15200 out of 80000 steps (19%)
04:11:20:WU00:FS00:0xa7:Completed 16000 out of 80000 steps (20%)
04:12:06:WU00:FS00:0xa7:Completed 16800 out of 80000 steps (21%)
04:12:51:WU00:FS00:0xa7:Completed 17600 out of 80000 steps (22%)
04:13:37:WU00:FS00:0xa7:Completed 18400 out of 80000 steps (23%)
04:14:23:WU00:FS00:0xa7:Completed 19200 out of 80000 steps (24%)
04:15:08:WU00:FS00:0xa7:Completed 20000 out of 80000 steps (25%)
04:15:53:WU00:FS00:0xa7:Completed 20800 out of 80000 steps (26%)
04:16:39:WU00:FS00:0xa7:Completed 21600 out of 80000 steps (27%)
04:17:25:WU00:FS00:0xa7:Completed 22400 out of 80000 steps (28%)
04:18:10:WU00:FS00:0xa7:Completed 23200 out of 80000 steps (29%)
04:18:55:WU00:FS00:0xa7:Completed 24000 out of 80000 steps (30%)
04:19:41:WU00:FS00:0xa7:Completed 24800 out of 80000 steps (31%)
04:20:26:WU00:FS00:0xa7:Completed 25600 out of 80000 steps (32%)
04:21:11:WU00:FS00:0xa7:Completed 26400 out of 80000 steps (33%)
04:21:57:WU00:FS00:0xa7:Completed 27200 out of 80000 steps (34%)
04:22:42:WU00:FS00:0xa7:Completed 28000 out of 80000 steps (35%)
04:23:28:WU00:FS00:0xa7:Completed 28800 out of 80000 steps (36%)
04:24:13:WU00:FS00:0xa7:Completed 29600 out of 80000 steps (37%)
04:24:59:WU00:FS00:0xa7:Completed 30400 out of 80000 steps (38%)
04:25:44:WU00:FS00:0xa7:Completed 31200 out of 80000 steps (39%)
04:26:30:WU00:FS00:0xa7:Completed 32000 out of 80000 steps (40%)
04:27:15:WU00:FS00:0xa7:Completed 32800 out of 80000 steps (41%)
04:28:00:WU00:FS00:0xa7:Completed 33600 out of 80000 steps (42%)
04:28:46:WU00:FS00:0xa7:Completed 34400 out of 80000 steps (43%)
04:29:31:WU00:FS00:0xa7:Completed 35200 out of 80000 steps (44%)
04:30:17:WU00:FS00:0xa7:Completed 36000 out of 80000 steps (45%)
04:31:02:WU00:FS00:0xa7:Completed 36800 out of 80000 steps (46%)
04:31:47:WU00:FS00:0xa7:Completed 37600 out of 80000 steps (47%)
04:32:32:WU00:FS00:0xa7:Completed 38400 out of 80000 steps (48%)
04:33:18:WU00:FS00:0xa7:Completed 39200 out of 80000 steps (49%)
04:34:03:WU00:FS00:0xa7:Completed 40000 out of 80000 steps (50%)
04:34:49:WU00:FS00:0xa7:Completed 40800 out of 80000 steps (51%)
04:35:34:WU00:FS00:0xa7:Completed 41600 out of 80000 steps (52%)
04:36:20:WU00:FS00:0xa7:Completed 42400 out of 80000 steps (53%)
04:37:05:WU00:FS00:0xa7:Completed 43200 out of 80000 steps (54%)
04:37:50:WU00:FS00:0xa7:Completed 44000 out of 80000 steps (55%)
04:38:36:WU00:FS00:0xa7:Completed 44800 out of 80000 steps (56%)
04:39:22:WU00:FS00:0xa7:Completed 45600 out of 80000 steps (57%)
04:40:07:WU00:FS00:0xa7:Completed 46400 out of 80000 steps (58%)
04:40:52:WU00:FS00:0xa7:Completed 47200 out of 80000 steps (59%)
04:41:38:WU00:FS00:0xa7:Completed 48000 out of 80000 steps (60%)
04:42:23:WU00:FS00:0xa7:Completed 48800 out of 80000 steps (61%)
04:43:09:WU00:FS00:0xa7:Completed 49600 out of 80000 steps (62%)
04:43:54:WU00:FS00:0xa7:Completed 50400 out of 80000 steps (63%)
04:44:40:WU00:FS00:0xa7:Completed 51200 out of 80000 steps (64%)
04:45:25:WU00:FS00:0xa7:Completed 52000 out of 80000 steps (65%)
04:46:10:WU00:FS00:0xa7:Completed 52800 out of 80000 steps (66%)
04:46:56:WU00:FS00:0xa7:Completed 53600 out of 80000 steps (67%)
04:47:41:WU00:FS00:0xa7:Completed 54400 out of 80000 steps (68%)
04:48:26:WU00:FS00:0xa7:Completed 55200 out of 80000 steps (69%)
04:49:11:WU00:FS00:0xa7:Completed 56000 out of 80000 steps (70%)
04:49:56:WU00:FS00:0xa7:Completed 56800 out of 80000 steps (71%)
04:50:42:WU00:FS00:0xa7:Completed 57600 out of 80000 steps (72%)
04:51:27:WU00:FS00:0xa7:Completed 58400 out of 80000 steps (73%)
04:52:12:WU00:FS00:0xa7:Completed 59200 out of 80000 steps (74%)
04:52:57:WU00:FS00:0xa7:Completed 60000 out of 80000 steps (75%)
04:53:43:WU00:FS00:0xa7:Completed 60800 out of 80000 steps (76%)
04:54:28:WU00:FS00:0xa7:Completed 61600 out of 80000 steps (77%)
04:55:14:WU00:FS00:0xa7:Completed 62400 out of 80000 steps (78%)
04:55:59:WU00:FS00:0xa7:Completed 63200 out of 80000 steps (79%)
04:56:45:WU00:FS00:0xa7:Completed 64000 out of 80000 steps (80%)
04:57:30:WU00:FS00:0xa7:Completed 64800 out of 80000 steps (81%)
04:58:15:WU00:FS00:0xa7:Completed 65600 out of 80000 steps (82%)
04:58:30:WU00:FS00:0xa7:ERROR:
04:58:30:WU00:FS00:0xa7:ERROR:-------------------------------------------------------
04:58:30:WU00:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20160919-669094a-unknown
04:58:30:WU00:FS00:0xa7:ERROR:Source code file: /host/windows-cross-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/pme.c, line: 754
04:58:30:WU00:FS00:0xa7:ERROR:
04:58:30:WU00:FS00:0xa7:ERROR:Fatal error:
04:58:30:WU00:FS00:0xa7:ERROR:1 particles communicated to PME rank 0 are more than 2/3 times the cut-off out of the domain decomposition cell of their charge group in dimension x.
04:58:30:WU00:FS00:0xa7:ERROR:This usually means that your system is not well equilibrated.
04:58:30:WU00:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
04:58:30:WU00:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
04:58:30:WU00:FS00:0xa7:ERROR:-------------------------------------------------------
04:58:30:WU00:FS00:0xa7:ERROR:
04:58:30:WU00:FS00:0xa7:ERROR:-------------------------------------------------------
04:58:30:WU00:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20160919-669094a-unknown
04:58:30:WU00:FS00:0xa7:ERROR:Source code file: /host/windows-cross-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/pme.c, line: 754
04:58:30:WU00:FS00:0xa7:ERROR:
04:58:30:WU00:FS00:0xa7:ERROR:Fatal error:
04:58:30:WU00:FS00:0xa7:ERROR:1 particles communicated to PME rank 1 are more than 2/3 times the cut-off out of the domain decomposition cell of their charge group in dimension x.
04:58:30:WU00:FS00:0xa7:ERROR:This usually means that your system is not well equilibrated.
04:58:30:WU00:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
04:58:30:WU00:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
04:58:30:WU00:FS00:0xa7:ERROR:-------------------------------------------------------
******************************* Date: 2016-11-17 *******************************
******************************* Date: 2016-11-17 *******************************
13:39:26:WARNING:WU00:FS00:FahCore returned: FAILED_3 (255 = 0xff)
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Core a7 failing overnight with gromacs error

Post by bruce »

jesanfafon wrote:When I've been leaving my computer folding overnight, most mornings I return to my machine and Windows has a message box saying "core a7 has stopped working"
I believe that's a message indicating that Windows detected an error. Is there any additional information available, either after the first pop-up or in the Windows Event Log at the time the pop-up first appeared?

The gromacs error about particles being "more than 2/3 times the cut-off ..." can be caused by several things.

1) Rarely, it's an indication that folding is progressing unexpectedly fast. This has traditionally been handled by restarting from the previous checkpoint, and in many cases the WU will continue to fold.

2) Calculation errors (including, but not limited to, those caused by overclocking or overheating) can relocate atoms to an incorrect position, forcing the calculation to terminate.

3) Some of the initial atomic locations may be "strange," but that can only be corrected by the Project Owner, so reporting the PRCG (Project, Run, Clone, Gen) of the WU is an important part of any report.

4) Other things.
--------------
If it's item 1 or 3, the initial conditions can be re-equilibrated, solving the problem.

If it's item 2, burn-in testing of your CPU may detect a hardware problem, but that's generally unnecessary unless you have a series of similar failures on various WUs.

Does anybody know of a reliable benchmark that puts maximum stress on the AVX hardware?
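
For the PRCG report mentioned in item 3, here is a minimal sketch of pulling the PRCG and the PME error out of a pasted log. It assumes the log lines look like the ones quoted in this thread; the function name `find_pme_failures` and the regular expressions are illustrative only and are not part of FAHClient.

```python
import re

# Core header line as it appears in this thread, e.g.
# "22:19:53:WU02:FS00:0xa7:Project: 11920 (Run 863, Clone 4, Gen 37)"
PRCG_RE = re.compile(
    r"WU(?P<wu>\d+):FS\d+:0x[0-9a-f]+:Project: (?P<project>\d+) "
    r"\(Run (?P<run>\d+), Clone (?P<clone>\d+), Gen (?P<gen>\d+)\)"
)

# The GROMACS fatal error quoted in this thread.
PME_RE = re.compile(
    r"WU(?P<wu>\d+):FS\d+:0x[0-9a-f]+:ERROR:(?P<count>\d+) particles "
    r"communicated to PME rank (?P<rank>\d+)"
)


def find_pme_failures(log_text):
    """Return one dict per PME error line, tagged with that WU's PRCG if it was seen."""
    prcg_by_wu = {}  # WU slot id (e.g. "02") -> (project, run, clone, gen)
    failures = []
    for line in log_text.splitlines():
        m = PRCG_RE.search(line)
        if m:
            prcg_by_wu[m["wu"]] = tuple(
                int(m[key]) for key in ("project", "run", "clone", "gen")
            )
            continue
        m = PME_RE.search(line)
        if m:
            failures.append(
                {
                    "wu": m["wu"],
                    "prcg": prcg_by_wu.get(m["wu"]),  # None if the header was not in the paste
                    "particles": int(m["count"]),
                    "rank": int(m["rank"]),
                }
            )
    return failures


if __name__ == "__main__":
    # Two lines copied from a log posted in this thread.
    sample_log = "\n".join(
        [
            "22:19:53:WU02:FS00:0xa7:Project: 11920 (Run 863, Clone 4, Gen 37)",
            "22:49:16:WU02:FS00:0xa7:ERROR:4 particles communicated to PME rank 2 "
            "are more than 2/3 times the cut-off out of the domain decomposition "
            "cell of their charge group in dimension x.",
        ]
    )
    for failure in find_pme_failures(sample_log):
        print(failure)
```

Each thread of the core can print its own copy of the error, so one failed WU may produce more than one entry; the PRCG is the part worth quoting in a report.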
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Core a7 failing overnight with gromacs error

Post by bruce »

jesanfafon wrote:Here's an expanded section of the log file:...
Ah, but you didn't include the configuration information that's on the first couple pages of the log. Linux? Windows 7, 8, or 10, Mac OSX?
jesanfafon
Posts: 9
Joined: Wed Sep 14, 2016 3:01 pm

Re: Core a7 failing overnight with gromacs error

Post by jesanfafon »

bruce wrote:
jesanfafon wrote:Here's an expanded section of the log file:...
Ah, but you didn't include the configuration information that's on the first couple pages of the log. Linux? Windows 7, 8, or 10, Mac OSX?

I think that's in the snippet I posted.
03:56:09:WU00:FS00:0xa7: OS: Windows 10 Pro Insider Preview
jesanfafon
Posts: 9
Joined: Wed Sep 14, 2016 3:01 pm

Re: Core a7 failing overnight with gromacs error

Post by jesanfafon »

I had the same error on Project 11920 (Run 863, Clone 4, Gen 37), which is the same project as before and the same one @forcinghavok reported. Maybe it's the project?

Code: Select all

22:19:52:WU02:FS00:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:11920 run:863 clone:4 gen:37 core:0xa7 unit:0x0000002fab4041295809c4fd67fd6a84
22:19:52:WU02:FS00:Starting
22:19:52:WU02:FS00:Running FahCore: "D:\Program Files\FAHClient/FAHCoreWrapper.exe" C:\Users\Jesse\AppData\Roaming\FAHClient\cores/web.stanford.edu/~pande/Win32/AMD64/AVX/Core_a7.fah/FahCore_a7.exe -dir 02 -suffix 01 -version 704 -lifeline 12952 -checkpoint 20 -np 6
22:19:52:WU02:FS00:Started FahCore on PID 1888
22:19:52:WU02:FS00:Core PID:6192
22:19:52:WU02:FS00:FahCore 0xa7 started
22:19:53:WU02:FS00:0xa7:*********************** Log Started 2016-11-18T22:19:52Z ***********************
22:19:53:WU02:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
22:19:53:WU02:FS00:0xa7:       Type: 0xa7
22:19:53:WU02:FS00:0xa7:       Core: Gromacs
22:19:53:WU02:FS00:0xa7:    Website: http://folding.stanford.edu/
22:19:53:WU02:FS00:0xa7:  Copyright: (c) 2009-2016 Stanford University
22:19:53:WU02:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
22:19:53:WU02:FS00:0xa7:       Args: -dir 02 -suffix 01 -version 704 -lifeline 1888 -checkpoint 20 -np 6
22:19:53:WU02:FS00:0xa7:     Config: <none>
22:19:53:WU02:FS00:0xa7:************************************ Build *************************************
22:19:53:WU02:FS00:0xa7:    Version: 0.0.11
22:19:53:WU02:FS00:0xa7:       Date: Sep 21 2016
22:19:53:WU02:FS00:0xa7:       Time: 01:43:48
22:19:53:WU02:FS00:0xa7: Repository: Git
22:19:53:WU02:FS00:0xa7:   Revision: 957bd90e68d95ddcf1594dc15ff6c64cc4555146
22:19:53:WU02:FS00:0xa7:     Branch: master
22:19:53:WU02:FS00:0xa7:   Compiler: GNU 4.2.1 Compatible Clang 3.9.0 (trunk 274080)
22:19:53:WU02:FS00:0xa7:    Options: -std=gnu++98 -O3 -funroll-loops -ffast-math -mfpmath=sse
22:19:53:WU02:FS00:0xa7:             -fno-unsafe-math-optimizations -msse2 -I/mingw64/include
22:19:53:WU02:FS00:0xa7:             -Wno-inconsistent-dllimport -Wno-parentheses-equality
22:19:53:WU02:FS00:0xa7:             -Wno-deprecated-register -Wno-unused-local-typedef
22:19:53:WU02:FS00:0xa7:   Platform: linux2 4.6.0-1-amd64
22:19:53:WU02:FS00:0xa7:       Bits: 64
22:19:53:WU02:FS00:0xa7:       Mode: Release
22:19:53:WU02:FS00:0xa7:       SIMD: avx_256
22:19:53:WU02:FS00:0xa7:************************************ System ************************************
22:19:53:WU02:FS00:0xa7:        CPU: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
22:19:53:WU02:FS00:0xa7:     CPU ID: GenuineIntel Family 6 Model 94 Stepping 3
22:19:53:WU02:FS00:0xa7:       CPUs: 8
22:19:53:WU02:FS00:0xa7:     Memory: 31.96GiB
22:19:53:WU02:FS00:0xa7:Free Memory: 27.09GiB
22:19:53:WU02:FS00:0xa7:    Threads: WINDOWS_THREADS
22:19:53:WU02:FS00:0xa7: OS Version: 6.2
22:19:53:WU02:FS00:0xa7:Has Battery: false
22:19:53:WU02:FS00:0xa7: On Battery: false
22:19:53:WU02:FS00:0xa7: UTC Offset: -8
22:19:53:WU02:FS00:0xa7:        PID: 6192
22:19:53:WU02:FS00:0xa7:        CWD: C:\Users\Jesse\AppData\Roaming\FAHClient\work
22:19:53:WU02:FS00:0xa7:         OS: Windows 10 Pro Insider Preview
22:19:53:WU02:FS00:0xa7:    OS Arch: AMD64
22:19:53:WU02:FS00:0xa7:********************************************************************************
22:19:53:WU02:FS00:0xa7:Project: 11920 (Run 863, Clone 4, Gen 37)
22:19:53:WU02:FS00:0xa7:Unit: 0x0000002fab4041295809c4fd67fd6a84
22:19:53:WU02:FS00:0xa7:Reading tar file core.xml
22:19:53:WU02:FS00:0xa7:Reading tar file frame37.tpr
22:19:53:WU02:FS00:0xa7:Digital signatures verified
22:19:53:WU02:FS00:0xa7:Calling: mdrun -s frame37.tpr -o frame37.trr -cpt 20 -nt 6
22:19:53:WU02:FS00:0xa7:Steps: first=2960000 total=80000
22:19:55:WU02:FS00:0xa7:Completed 1 out of 80000 steps (0%)
22:19:58:WU00:FS00:Upload 7.12%
22:20:04:WU00:FS00:Upload 15.73%
22:20:10:WU00:FS00:Upload 24.33%
22:20:16:WU00:FS00:Upload 33.23%
22:20:22:WU00:FS00:Upload 41.84%
22:20:28:WU00:FS00:Upload 50.44%
22:20:34:WU00:FS00:Upload 59.05%
22:20:40:WU00:FS00:Upload 67.36%
22:20:42:WU02:FS00:0xa7:Completed 800 out of 80000 steps (1%)
22:20:46:WU00:FS00:Upload 75.96%
22:20:52:WU00:FS00:Upload 84.27%
22:20:58:WU00:FS00:Upload 92.58%
22:21:07:WU00:FS00:Upload complete
22:21:07:WU00:FS00:Server responded WORK_ACK (400)
22:21:07:WU00:FS00:Final credit estimate, 11663.00 points
22:21:07:WU00:FS00:Cleaning up
22:21:30:WU02:FS00:0xa7:Completed 1600 out of 80000 steps (2%)
22:22:15:WU02:FS00:0xa7:Completed 2400 out of 80000 steps (3%)
22:23:06:WU02:FS00:0xa7:Completed 3200 out of 80000 steps (4%)
22:23:54:WU02:FS00:0xa7:Completed 4000 out of 80000 steps (5%)
22:24:41:WU02:FS00:0xa7:Completed 4800 out of 80000 steps (6%)
22:25:32:WU02:FS00:0xa7:Completed 5600 out of 80000 steps (7%)
22:26:18:WU02:FS00:0xa7:Completed 6400 out of 80000 steps (8%)
22:27:03:WU02:FS00:0xa7:Completed 7200 out of 80000 steps (9%)
22:27:49:WU02:FS00:0xa7:Completed 8000 out of 80000 steps (10%)
22:28:34:WU02:FS00:0xa7:Completed 8800 out of 80000 steps (11%)
22:29:19:WU02:FS00:0xa7:Completed 9600 out of 80000 steps (12%)
22:30:05:WU02:FS00:0xa7:Completed 10400 out of 80000 steps (13%)
22:30:50:WU02:FS00:0xa7:Completed 11200 out of 80000 steps (14%)
22:31:36:WU02:FS00:0xa7:Completed 12000 out of 80000 steps (15%)
22:32:22:WU02:FS00:0xa7:Completed 12800 out of 80000 steps (16%)
22:33:10:WU02:FS00:0xa7:Completed 13600 out of 80000 steps (17%)
22:33:57:WU02:FS00:0xa7:Completed 14400 out of 80000 steps (18%)
22:34:46:WU02:FS00:0xa7:Completed 15200 out of 80000 steps (19%)
22:35:33:WU02:FS00:0xa7:Completed 16000 out of 80000 steps (20%)
22:36:21:WU02:FS00:0xa7:Completed 16800 out of 80000 steps (21%)
22:37:06:WU02:FS00:0xa7:Completed 17600 out of 80000 steps (22%)
22:37:56:WU02:FS00:0xa7:Completed 18400 out of 80000 steps (23%)
22:38:41:WU02:FS00:0xa7:Completed 19200 out of 80000 steps (24%)
22:39:27:WU02:FS00:0xa7:Completed 20000 out of 80000 steps (25%)
22:40:16:WU02:FS00:0xa7:Completed 20800 out of 80000 steps (26%)
22:41:02:WU02:FS00:0xa7:Completed 21600 out of 80000 steps (27%)
22:41:51:WU02:FS00:0xa7:Completed 22400 out of 80000 steps (28%)
22:42:40:WU02:FS00:0xa7:Completed 23200 out of 80000 steps (29%)
22:43:25:WU02:FS00:0xa7:Completed 24000 out of 80000 steps (30%)
22:44:12:WU02:FS00:0xa7:Completed 24800 out of 80000 steps (31%)
22:44:58:WU02:FS00:0xa7:Completed 25600 out of 80000 steps (32%)
22:45:43:WU02:FS00:0xa7:Completed 26400 out of 80000 steps (33%)
22:46:33:WU02:FS00:0xa7:Completed 27200 out of 80000 steps (34%)
22:47:18:WU02:FS00:0xa7:Completed 28000 out of 80000 steps (35%)
22:48:04:WU02:FS00:0xa7:Completed 28800 out of 80000 steps (36%)
22:48:49:WU02:FS00:0xa7:Completed 29600 out of 80000 steps (37%)
22:49:16:WU02:FS00:0xa7:ERROR:
22:49:16:WU02:FS00:0xa7:ERROR:-------------------------------------------------------
22:49:16:WU02:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20160919-669094a-unknown
22:49:16:WU02:FS00:0xa7:ERROR:Source code file: /host/windows-cross-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/pme.c, line: 754
22:49:16:WU02:FS00:0xa7:ERROR:
22:49:16:WU02:FS00:0xa7:ERROR:Fatal error:
22:49:16:WU02:FS00:0xa7:ERROR:4 particles communicated to PME rank 2 are more than 2/3 times the cut-off out of the domain decomposition cell of their charge group in dimension x.
22:49:16:WU02:FS00:0xa7:ERROR:This usually means that your system is not well equilibrated.
22:49:16:WU02:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
22:49:16:WU02:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
22:49:16:WU02:FS00:0xa7:ERROR:-------------------------------------------------------
22:49:21:WU02:FS00:0xa7:ERROR:
22:49:21:WU02:FS00:0xa7:ERROR:-------------------------------------------------------
22:49:21:WU02:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20160919-669094a-unknown
22:49:21:WU02:FS00:0xa7:ERROR:Source code file: /host/windows-cross-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/pme.c, line: 754
22:49:21:WU02:FS00:0xa7:ERROR:
22:49:21:WU02:FS00:0xa7:ERROR:Fatal error:
22:49:21:WU02:FS00:0xa7:ERROR:5 particles communicated to PME rank 2 are more than 2/3 times the cut-off out of the domain decomposition cell of their charge group in dimension x.
22:49:21:WU02:FS00:0xa7:ERROR:This usually means that your system is not well equilibrated.
22:49:21:WU02:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
22:49:21:WU02:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
22:49:21:WU02:FS00:0xa7:ERROR:-------------------------------------------------------
jesanfafon
Posts: 9
Joined: Wed Sep 14, 2016 3:01 pm

Re: Core a7 failing overnight with gromacs error

Post by jesanfafon »

Don't know how useful this is, but here's another failure for me on Project 11920:

Code: Select all

22:19:51:WU02:FS00:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:11920 run:386 clone:6 gen:78 core:0xa7 unit:0x00000061ab4041295809ccbe6ea8cb29
22:19:51:WU02:FS00:Starting
22:19:51:WU02:FS00:Running FahCore: "D:\Program Files\FAHClient/FAHCoreWrapper.exe" C:\Users\Jesse\AppData\Roaming\FAHClient\cores/web.stanford.edu/~pande/Win32/AMD64/AVX/beta/Core_a7.fah/FahCore_a7.exe -dir 02 -suffix 01 -version 704 -lifeline 10436 -checkpoint 20 -np 6
22:19:51:WU02:FS00:Started FahCore on PID 5632
22:19:51:WU02:FS00:Core PID:10296
22:19:51:WU02:FS00:FahCore 0xa7 started
22:19:52:WU02:FS00:0xa7:*********************** Log Started 2016-11-25T22:19:52Z ***********************
22:19:52:WU02:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
22:19:52:WU02:FS00:0xa7:       Type: 0xa7
22:19:52:WU02:FS00:0xa7:       Core: Gromacs
22:19:52:WU02:FS00:0xa7:    Website: http://folding.stanford.edu/
22:19:52:WU02:FS00:0xa7:  Copyright: (c) 2009-2016 Stanford University
22:19:52:WU02:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
22:19:52:WU02:FS00:0xa7:       Args: -dir 02 -suffix 01 -version 704 -lifeline 5632 -checkpoint 20 -np 6
22:19:52:WU02:FS00:0xa7:     Config: <none>
22:19:52:WU02:FS00:0xa7:************************************ Build *************************************
22:19:52:WU02:FS00:0xa7:    Version: 0.0.11
22:19:52:WU02:FS00:0xa7:       Date: Sep 21 2016
22:19:52:WU02:FS00:0xa7:       Time: 01:43:48
22:19:52:WU02:FS00:0xa7: Repository: Git
22:19:52:WU02:FS00:0xa7:   Revision: 957bd90e68d95ddcf1594dc15ff6c64cc4555146
22:19:52:WU02:FS00:0xa7:     Branch: master
22:19:52:WU02:FS00:0xa7:   Compiler: GNU 4.2.1 Compatible Clang 3.9.0 (trunk 274080)
22:19:52:WU02:FS00:0xa7:    Options: -std=gnu++98 -O3 -funroll-loops -ffast-math -mfpmath=sse
22:19:52:WU02:FS00:0xa7:             -fno-unsafe-math-optimizations -msse2 -I/mingw64/include
22:19:52:WU02:FS00:0xa7:             -Wno-inconsistent-dllimport -Wno-parentheses-equality
22:19:52:WU02:FS00:0xa7:             -Wno-deprecated-register -Wno-unused-local-typedef
22:19:52:WU02:FS00:0xa7:   Platform: linux2 4.6.0-1-amd64
22:19:52:WU02:FS00:0xa7:       Bits: 64
22:19:52:WU02:FS00:0xa7:       Mode: Release
22:19:52:WU02:FS00:0xa7:       SIMD: avx_256
22:19:52:WU02:FS00:0xa7:************************************ System ************************************
22:19:52:WU02:FS00:0xa7:        CPU: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
22:19:52:WU02:FS00:0xa7:     CPU ID: GenuineIntel Family 6 Model 94 Stepping 3
22:19:52:WU02:FS00:0xa7:       CPUs: 8
22:19:52:WU02:FS00:0xa7:     Memory: 31.96GiB
22:19:52:WU02:FS00:0xa7:Free Memory: 27.00GiB
22:19:52:WU02:FS00:0xa7:    Threads: WINDOWS_THREADS
22:19:52:WU02:FS00:0xa7: OS Version: 6.2
22:19:52:WU02:FS00:0xa7:Has Battery: false
22:19:52:WU02:FS00:0xa7: On Battery: false
22:19:52:WU02:FS00:0xa7: UTC Offset: -8
22:19:52:WU02:FS00:0xa7:        PID: 10296
22:19:52:WU02:FS00:0xa7:        CWD: C:\Users\Jesse\AppData\Roaming\FAHClient\work
22:19:52:WU02:FS00:0xa7:         OS: Windows 10 Pro Insider Preview
22:19:52:WU02:FS00:0xa7:    OS Arch: AMD64
22:19:52:WU02:FS00:0xa7:********************************************************************************
22:19:52:WU02:FS00:0xa7:Project: 11920 (Run 386, Clone 6, Gen 78)
22:19:52:WU02:FS00:0xa7:Unit: 0x00000061ab4041295809ccbe6ea8cb29
22:19:52:WU02:FS00:0xa7:Reading tar file core.xml
22:19:52:WU02:FS00:0xa7:Reading tar file frame78.tpr
22:19:52:WU02:FS00:0xa7:Digital signatures verified
22:19:52:WU02:FS00:0xa7:Calling: mdrun -s frame78.tpr -o frame78.trr -cpt 20 -nt 6
22:19:53:WU02:FS00:0xa7:Steps: first=6240000 total=80000
22:19:55:WU02:FS00:0xa7:Completed 1 out of 80000 steps (0%)
22:20:39:WU02:FS00:0xa7:Completed 800 out of 80000 steps (1%)
22:21:24:WU02:FS00:0xa7:Completed 1600 out of 80000 steps (2%)
22:22:09:WU02:FS00:0xa7:Completed 2400 out of 80000 steps (3%)
22:22:54:WU02:FS00:0xa7:Completed 3200 out of 80000 steps (4%)
22:23:40:WU02:FS00:0xa7:Completed 4000 out of 80000 steps (5%)
22:24:25:WU02:FS00:0xa7:Completed 4800 out of 80000 steps (6%)
22:25:10:WU02:FS00:0xa7:Completed 5600 out of 80000 steps (7%)
22:25:55:WU02:FS00:0xa7:Completed 6400 out of 80000 steps (8%)
22:26:39:WU02:FS00:0xa7:Completed 7200 out of 80000 steps (9%)
22:27:24:WU02:FS00:0xa7:Completed 8000 out of 80000 steps (10%)
22:28:09:WU02:FS00:0xa7:Completed 8800 out of 80000 steps (11%)
22:28:54:WU02:FS00:0xa7:Completed 9600 out of 80000 steps (12%)
22:29:39:WU02:FS00:0xa7:Completed 10400 out of 80000 steps (13%)
22:30:24:WU02:FS00:0xa7:Completed 11200 out of 80000 steps (14%)
22:31:09:WU02:FS00:0xa7:Completed 12000 out of 80000 steps (15%)
22:31:54:WU02:FS00:0xa7:Completed 12800 out of 80000 steps (16%)
22:32:38:WU02:FS00:0xa7:Completed 13600 out of 80000 steps (17%)
22:33:23:WU02:FS00:0xa7:Completed 14400 out of 80000 steps (18%)
22:34:08:WU02:FS00:0xa7:Completed 15200 out of 80000 steps (19%)
22:34:53:WU02:FS00:0xa7:Completed 16000 out of 80000 steps (20%)
22:35:37:WU02:FS00:0xa7:Completed 16800 out of 80000 steps (21%)
22:36:22:WU02:FS00:0xa7:Completed 17600 out of 80000 steps (22%)
22:37:07:WU02:FS00:0xa7:Completed 18400 out of 80000 steps (23%)
22:37:51:WU02:FS00:0xa7:Completed 19200 out of 80000 steps (24%)
22:38:36:WU02:FS00:0xa7:Completed 20000 out of 80000 steps (25%)
22:39:21:WU02:FS00:0xa7:Completed 20800 out of 80000 steps (26%)
22:40:07:WU02:FS00:0xa7:Completed 21600 out of 80000 steps (27%)
22:40:52:WU02:FS00:0xa7:Completed 22400 out of 80000 steps (28%)
22:41:37:WU02:FS00:0xa7:Completed 23200 out of 80000 steps (29%)
22:42:21:WU02:FS00:0xa7:Completed 24000 out of 80000 steps (30%)
22:43:06:WU02:FS00:0xa7:Completed 24800 out of 80000 steps (31%)
22:43:51:WU02:FS00:0xa7:Completed 25600 out of 80000 steps (32%)
22:44:36:WU02:FS00:0xa7:Completed 26400 out of 80000 steps (33%)
22:45:21:WU02:FS00:0xa7:Completed 27200 out of 80000 steps (34%)
22:46:06:WU02:FS00:0xa7:Completed 28000 out of 80000 steps (35%)
22:46:51:WU02:FS00:0xa7:Completed 28800 out of 80000 steps (36%)
22:47:36:WU02:FS00:0xa7:Completed 29600 out of 80000 steps (37%)
22:48:20:WU02:FS00:0xa7:Completed 30400 out of 80000 steps (38%)
22:49:05:WU02:FS00:0xa7:Completed 31200 out of 80000 steps (39%)
22:49:50:WU02:FS00:0xa7:Completed 32000 out of 80000 steps (40%)
22:50:35:WU02:FS00:0xa7:Completed 32800 out of 80000 steps (41%)
22:51:19:WU02:FS00:0xa7:Completed 33600 out of 80000 steps (42%)
22:52:04:WU02:FS00:0xa7:Completed 34400 out of 80000 steps (43%)
22:52:49:WU02:FS00:0xa7:Completed 35200 out of 80000 steps (44%)
22:53:34:WU02:FS00:0xa7:Completed 36000 out of 80000 steps (45%)
22:54:19:WU02:FS00:0xa7:Completed 36800 out of 80000 steps (46%)
22:55:03:WU02:FS00:0xa7:Completed 37600 out of 80000 steps (47%)
22:55:48:WU02:FS00:0xa7:Completed 38400 out of 80000 steps (48%)
22:56:35:WU02:FS00:0xa7:Completed 39200 out of 80000 steps (49%)
22:57:20:WU02:FS00:0xa7:Completed 40000 out of 80000 steps (50%)
22:58:05:WU02:FS00:0xa7:Completed 40800 out of 80000 steps (51%)
22:58:51:WU02:FS00:0xa7:Completed 41600 out of 80000 steps (52%)
22:59:36:WU02:FS00:0xa7:Completed 42400 out of 80000 steps (53%)
23:00:22:WU02:FS00:0xa7:Completed 43200 out of 80000 steps (54%)
23:01:08:WU02:FS00:0xa7:Completed 44000 out of 80000 steps (55%)
23:01:53:WU02:FS00:0xa7:Completed 44800 out of 80000 steps (56%)
23:02:38:WU02:FS00:0xa7:Completed 45600 out of 80000 steps (57%)
23:03:24:WU02:FS00:0xa7:Completed 46400 out of 80000 steps (58%)
23:04:09:WU02:FS00:0xa7:Completed 47200 out of 80000 steps (59%)
23:04:55:WU02:FS00:0xa7:Completed 48000 out of 80000 steps (60%)
23:05:41:WU02:FS00:0xa7:Completed 48800 out of 80000 steps (61%)
23:06:23:WU02:FS00:0xa7:ERROR:
23:06:23:WU02:FS00:0xa7:ERROR:-------------------------------------------------------
23:06:23:WU02:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20160919-669094a-unknown
23:06:23:WU02:FS00:0xa7:ERROR:Source code file: /host/windows-cross-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/pme.c, line: 754
23:06:23:WU02:FS00:0xa7:ERROR:
23:06:23:WU02:FS00:0xa7:ERROR:Fatal error:
23:06:23:WU02:FS00:0xa7:ERROR:2 particles communicated to PME rank 4 are more than 2/3 times the cut-off out of the domain decomposition cell of their charge group in dimension x.
23:06:23:WU02:FS00:0xa7:ERROR:This usually means that your system is not well equilibrated.
23:06:23:WU02:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
23:06:23:WU02:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
23:06:23:WU02:FS00:0xa7:ERROR:-------------------------------------------------------
23:06:23:WU02:FS00:0xa7:ERROR:
23:06:23:WU02:FS00:0xa7:ERROR:-------------------------------------------------------
23:06:23:WU02:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20160919-669094a-unknown
23:06:23:WU02:FS00:0xa7:ERROR:Source code file: /host/windows-cross-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/pme.c, line: 754
23:06:23:WU02:FS00:0xa7:ERROR:
23:06:23:WU02:FS00:0xa7:ERROR:Fatal error:
23:06:23:WU02:FS00:0xa7:ERROR:2 particles communicated to PME rank 3 are more than 2/3 times the cut-off out of the domain decomposition cell of their charge group in dimension x.
23:06:23:WU02:FS00:0xa7:ERROR:This usually means that your system is not well equilibrated.
23:06:23:WU02:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
23:06:23:WU02:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
23:06:23:WU02:FS00:0xa7:ERROR:-------------------------------------------------------
01:05:22:WARNING:WU02:FS00:FahCore returned an unknown error code which probably indicates that it crashed
01:05:22:WARNING:WU02:FS00:FahCore returned: WU_STALLED (127 = 0x7f)
Forcinghavok
Posts: 130
Joined: Wed Feb 06, 2013 4:46 pm

Re: Core a7 failing overnight with gromacs error

Post by Forcinghavok »

Hey, thanks for posting about this, jesanfafon. I was thinking I was the only one with FAH problems. I am going to run a CPU and GPU stress test. The memory test came back fine. It's a new system, so I don't foresee there being problems with the computer.
jesanfafon
Posts: 9
Joined: Wed Sep 14, 2016 3:01 pm

Re: Core a7 failing overnight with gromacs error

Post by jesanfafon »

Forcinghavok wrote:Hey, thanks for posting about this, jesanfafon. I was thinking I was the only one with FAH problems. I am going to run a CPU and GPU stress test. The memory test came back fine. It's a new system, so I don't foresee there being problems with the computer.
Since all of the errors seem to be on the one project, I'm guessing it is sometimes poorly equilibrated during simulation and needs the restart as a fix-up step. As in, our PCs are fine :D
GPU timpster
Posts: 65
Joined: Mon Nov 02, 2015 2:57 am

Re: Core a7 failing overnight with gromacs error

Post by GPU timpster »

I'm not signed up for the beta stuff, but I did just switch the flag over to beta, from advanced, and I'll also report any issues!
Joe_H
Site Admin
Posts: 7856
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Core a7 failing overnight with gromacs error

Post by Joe_H »

The beta flag is not required for this project.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Core a7 failing overnight with gromacs error

Post by bruce »

jesanfafon wrote:
Forcinghavok wrote:It's a new system, so I don't foresee there being problems with the computer.
Since all of the errors seem to be on the one project, I'm guessing it is sometimes poorly equilibrated during simulation and needs the restart as a fix-up step. As in, our PCs are fine :D
There are three possibilities: Computer, Project, FAHCore_a7.

In fact, Project 11920 is a test project that duplicates Project 9752, which was previously run using FahCore_a4. The two FahCores use different versions of GROMACS, but realistically you'd expect the results to be identical. Since the errors are showing up only on Core_a7, the scientists need to dig into the data and decide which set of results is more correct.
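
As a rough way to weigh the three possibilities, the failures reported in this thread can be tallied by project and core. This is only a sketch: the record layout and the `summarize` function are made up for illustration, and Forcinghavok's project number is taken from jesanfafon's post rather than from a log in this part of the thread.

```python
from collections import defaultdict

# Failure reports from this thread. Forcinghavok's project is per jesanfafon's
# post ("the same project ... for @forcinghavok"), not read from a log here.
REPORTS = [
    {"user": "Forcinghavok", "cpu": "AMD FX-8350", "project": 11920, "core": "0xa7"},
    {"user": "jesanfafon", "cpu": "Intel i7-6700K", "project": 11920, "core": "0xa7"},
    {"user": "jesanfafon", "cpu": "Intel i7-6700K", "project": 11920, "core": "0xa7"},
]


def summarize(reports):
    """Group failures by (project, core) and collect the distinct machines affected."""
    groups = defaultdict(lambda: {"failures": 0, "machines": set()})
    for report in reports:
        group = groups[(report["project"], report["core"])]
        group["failures"] += 1
        group["machines"].add((report["user"], report["cpu"]))
    return dict(groups)


if __name__ == "__main__":
    for (project, core), info in summarize(REPORTS).items():
        print(
            f"project {project} on core {core}: "
            f"{info['failures']} failures across {len(info['machines'])} machines"
        )
```

Failures on two unrelated machines make a single bad computer less likely, but because every report here is both Project 11920 and Core_a7, a tally like this cannot separate the project from the core. That is where the comparison against the FahCore_a4 results comes in.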