Project: 3064 (Run 3, Clone 165, Gen 8) dies @ 32%

Moderators: Site Moderators, FAHC Science Team

dschief
Posts: 146
Joined: Tue Dec 04, 2007 5:56 am
Hardware configuration: ASUS P5K-E, Q6600, 8 GB RAM, Win-7

2x ASUS Z97-K, 16 GB RAM, Win-7 64-bit


Post by dschief »

Died twice at the same point (32%); the client is now on a different WU.


starting mdrun 'p3064_lambda5_2003'
5000000 steps,  10000.0 ps.
[02:12:55] files
[02:12:55] Completed 0 out of 5000000 steps  (0 percent)
[02:12:55] a SSE boost OK.
[02:27:55] nt triggered.
[02:28:56] Writing local files
[02:28:56] Completed 50000 out of 5000000 steps  (1 percent)
[02:43:55] Timered checkpoint triggered.
[02:44:54] Writing local files
[02:44:54] Completed 100000 out of 5000000 steps  (2 percent)

snip

[10:28:44] Completed 1550000 out of 5000000 steps  (31 percent)
[10:43:44] Timered checkpoint triggered.
[10:44:43] Writing local files
[10:44:43] Completed 1600000 out of 5000000 steps  (32 percent)
[cli_2]: aborting job:
Fatal error in [10:53:10] Warning:  long 1-4 interactions
[0]0:Return code = 0, signaled with Segmentation fault
[0]1:Return code = 0, signaled with Quit
[0]2:Return code = 1
[0]3:Return code = 0, signaled with Segmentation fault
[10:53:14] CoreStatus = 1 (1)
[10:53:14] Client-core communications error: ERROR 0x1
[10:53:14] Deleting current work unit & continuing...
[0]0:Return code = 0, signaled with Quit
[0]1:Return code = 0, signaled with Quit
[0]2:Return code = 18
[0]3:Return code = 0, signaled with Quit
[10:57:35] - Warning: Could not delete all work unit files (1): Core returned invalid code
[10:57:35] Trying to send all finished work units
[10:57:35] + No unsent completed units remaining.
[10:57:35] - Preparing to get new work unit...
[10:57:35] + Attempting to get work packet
[10:57:35] - Will indicate memory of 2013 MB
[10:57:35] - Connecting to assignment server
[10:57:35] Connecting to http://assign.stanford.edu:8080/
[10:57:35] Posted data.
[10:57:35] Initial: 40AB; - Successful: assigned to (171.64.65.63).
[10:57:35] + News From Folding@Home: Welcome to Folding@Home
[10:57:35] Loaded queue successfully.
[10:57:35] Connecting to http://171.64.65.63:8080/
[10:57:36] Posted data.
[10:57:36] Initial: 0000; - Receiving payload (expected size: 610205)
[10:57:40] - Downloaded at ~148 kB/s
[10:57:40] - Averaged speed for that direction ~154 kB/s
[10:57:40] + Received work.
[10:57:40] + Closed connections
[10:57:45] 
[10:57:45] + Processing work unit
[10:57:45] Core required: FahCore_a1.exe
[10:57:45] Core found.
[10:57:45] Working on Unit 02 [May 3 10:57:45]
[10:57:45] + Working ...
[10:57:45] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a1.exe -dir work/ -suffix 02 -priority 96 -checkpoint 15 -verbose -lifeline 4313 -version 601'

[10:57:45] 
[10:57:45] *------------------------------*
[10:57:45] Folding@Home Gromacs SMP Core
[10:57:45] Version 1.74 (November 27, 2006)
[10:57:45] 
[10:57:45] Preparing to commence simulation
[10:57:45] - Ensuring status. Please wait.
[10:58:02] - Looking at optimizations...
[10:58:02] - Working with standard loops on this execution.
[10:58:02] - Previous termination of core was improper.
[10:58:02] - Going to use standard loops.
[10:58:02] - Files status OK
[10:58:03] arting from initial work packet
[10:58:03] 
[10:58:03] Project: 30nitial work packet
[10:58:03] 
[10:58:03] Project: 3Entering M.D.
[10:58:03] one 165, Gen 8)
[10:58:03] 
[10:58:03] Entering M.D.
[10:58:03] one 165, Gen 8)
[10:58:03] 
[10:58:03] Entering M.D.
NNODES=4, MYRANK=0, HOSTNAME=localhost.localdomain
NNODES=4, MYRANK=2, HOSTNAME=localhost.localdomain
NNODES=4, MYRANK=3, HOSTNAME=localhost.localdomain
NNODES=4, MYRANK=1, HOSTNAME=localhost.localdomain
NODEID=2 argc=15
NODEID=0 argc=15
NODEID=1 argc=15
NODEID=3 argc=15
      Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
             Copyright (c) 2001-2004, The GROMACS development team,
            check out http://www.gromacs.org for more information.

        This inclusion of Gromacs code in the Folding@Home Core is under
        a special license (see http://folding.stanford.edu/gromacs.html)
         specially granted to Stanford by the copyright holders. If you
          are interested in using Gromacs, visit http://www.gromacs.org where
                you can download a free version of Gromacs under
         the terms of the GNU General Public License (GPL) as published
       by the Free Software Foundation; either version 2 of the License,
                     or (at your option) any later version.

starting mdrun 'p3064_lambda5_2003'
5000000 steps,  10000.0 ps.

[10:58:09]  files
[10:58:09] Extra SSE boost OK.
[10:58:09] Writing local files
[10:58:09] Completed 0 out of 5000000 steps  (0 percent)
[11:13:10] Timered checkpoint triggered.
[11:14:11] Writing local files
[11:14:11] Completed 50000 out of 5000000 steps  (1 percent)
[11:29:12] Timered checkpoint triggered.
[11:30:10] Writing local files
[11:30:10] Completed 100000 out of 5000000 steps  (2 percent)

snip

[19:13:59] Completed 1550000 out of 5000000 steps  (31 percent)
[19:29:00] Timered checkpoint triggered.
[19:30:00] Writing local files
[19:30:00] Completed 1600000 out of 5000000 steps  (32 percent)
[19:38:27] Warning:  long 1-4 interactions
[0]0:Return code = 0, signaled with Segmentation fault
[0]1:Return code = 0, signaled with Quit
[0]2:Return code = 0, signaled with Quit
[0]3:Return code = 0, signaled with Segmentation fault
[19:38:31] CoreStatus = 0 (0)
[19:38:31] Client-core communications error: ERROR 0x0
[19:38:31] Deleting current work unit & continuing...
[0]0:Return code = 0, signaled with Quit
[0]1:Return code = 0, signaled with Quit
[0]2:Return code = 0, signaled with Quit
[0]3:Return code = 18
[19:42:52] - Warning: Could not delete all work unit files (2): Core returned invalid code
[19:42:52] Trying to send all finished work units
[19:42:52] + No unsent completed units remaining.
[19:42:52] - Preparing to get new work unit...
[19:42:52] + Attempting to get work packet
[19:42:52] - Will indicate memory of 2013 MB
[19:42:52] - Connecting to assignment server
[19:42:52] Connecting to http://assign.stanford.edu:8080/
[19:42:53] Posted data.
[19:42:53] Initial: 40AB; - Successful: assigned to (171.64.65.63).
[19:42:53] + News From Folding@Home: Welcome to Folding@Home
[19:42:53] Loaded queue successfully.
[19:42:53] Connecting to http://171.64.65.63:8080/
[19:42:54] Posted data.
[19:42:54] Initial: 0000; - Receiving payload (expected size: 610205)
[19:42:58] - Downloaded at ~148 kB/s
[19:42:58] - Averaged speed for that direction ~153 kB/s
[19:42:58] + Received work.
[19:42:58] + Closed connections
[19:43:03] 
[19:43:03] + Processing work unit
[19:43:03] Core required: FahCore_a1.exe
[19:43:03] Core found.
[19:43:03] Working on Unit 03 [May 3 19:43:03]
[19:43:03] + Working ...
[19:43:03] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a1.exe -dir work/ -suffix 03 -priority 96 -checkpoint 15 -verbose -lifeline 4313 -version 601'

[19:43:03] 
[19:43:03] *------------------------------*
[19:43:03] Folding@Home Gromacs SMP Core
[19:43:03] Version 1.74 (November 27, 2006)
[19:43:03] 
[19:43:03] Preparing to commence simulation
[19:43:03] - Ensuring status. Please wait.
[19:43:20] - Looking at optimizations...
[19:43:20] - Working with standard loops on this execution.
[19:43:20] - Previous termination of core was improper.
[19:43:20] - Going to use standard loops.
[19:43:20] - Files status OK
[19:43:20] arting from initial work packet
[19:43:20] 
[19:43:20] Project: 3064 (Run 3, Clone 165, Gen 8)
[19:43:20] 
[19:43:20] Entering M.D.
[19:43:20] one 165,roject: 3Entering M.D.
[19:43:20] one 165, Gen 8)
Added Code Tags and snipped. -7im
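
For anyone wanting to check whether their own failures land on the same step, here is a rough Python sketch that pulls the last reported progress before each abandoned work unit out of a client log. It only assumes the line formats visible in the log above ("Completed N out of M steps  (P percent)" and "Deleting current work unit"); the function name `failure_points` and the `sample` list are made up for illustration and are not part of the Folding@Home client.

```python
import re

# Matches progress lines such as:
# [10:44:43] Completed 1600000 out of 5000000 steps  (32 percent)
PROGRESS = re.compile(r"Completed (\d+) out of (\d+) steps\s+\((\d+) percent\)")

# Text the client prints when it abandons a work unit (taken from the log above).
FAILURE = "Deleting current work unit"


def failure_points(lines):
    """Return the last reported (steps, percent) before each abandoned work unit."""
    points = []
    last = None
    for line in lines:
        match = PROGRESS.search(line)
        if match:
            # Remember the most recent progress report.
            last = (int(match.group(1)), int(match.group(3)))
        elif FAILURE in line and last is not None:
            # The work unit was dropped; record where it had got to.
            points.append(last)
            last = None
    return points


# Hypothetical sample built from lines in the log above.
sample = [
    "[10:44:43] Completed 1600000 out of 5000000 steps  (32 percent)",
    "[10:53:10] Warning:  long 1-4 interactions",
    "[10:53:14] Deleting current work unit & continuing...",
    "[10:58:09] Completed 0 out of 5000000 steps  (0 percent)",
    "[19:30:00] Completed 1600000 out of 5000000 steps  (32 percent)",
    "[19:38:31] Deleting current work unit & continuing...",
]

print(failure_points(sample))
```

To run it against a real log, pass the open file instead of `sample`, for example `failure_points(open("FAHlog.txt"))`; the file name is an assumption, so use whatever your client writes. Identical entries in the result suggest the work unit fails at a reproducible point rather than at random.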