Truly Wonky WU - Project: 2609 (Run 0, Clone 385, Gen 0)


klasseng
Posts: 125
Joined: Thu Dec 27, 2007 6:08 am
Hardware configuration: System #1, Quad GPU:
Motherboard: Asus Rampage IV Extreme
CPU: 6 Core Intel i7 (3930K)
GPU: 4 x NVIDIA GeForce GTS 450
OS: Windows 7 Home Premium, 64-bit
RAM: 16GB

System #2:
MacPro 2,1 (Early 2007)
Dual Quad-Core Intel Xeon 3GHz (X5365)
9GB Memory
OS: Mac OS X 10.7.5
GPU: N/A
Location: Canada

Post by klasseng »

I just got another weird WU. I've completed about 300 SMP WUs on my Dual Quad-Core MacPro.

Now Project: 2609 (Run 0, Clone 385, Gen 0) stops before it even starts.

Here's the log:

[05:00:04] + Attempting to get work packet
[05:00:04] - Will indicate memory of 4000 MB
[05:00:04] - Connecting to assignment server
[05:00:04] Connecting to http://assign.stanford.edu:8080/
[05:00:04] Posted data.
[05:00:04] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[05:00:04] + News From Folding@Home: Welcome to Folding@Home
[05:00:04] Loaded queue successfully.
[05:00:04] Connecting to http://171.64.65.56:8080/
[05:00:09] Posted data.
[05:00:09] Initial: 0000; - Receiving payload (expected size: 2756994)
[05:00:22] - Downloaded at ~207 kB/s
[05:00:22] - Averaged speed for that direction ~186 kB/s
[05:00:22] + Received work.
[05:00:22] Trying to send all finished work units
[05:00:22] + No unsent completed units remaining.
[05:00:22] + Closed connections
[05:00:22] 
[05:00:22] + Processing work unit
[05:00:22] Core required: FahCore_a1.exe
[05:00:22] Core found.
[05:00:22] Working on Unit 00 [January 5 05:00:22]
[05:00:22] + Working ...
[05:00:22] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a1.exe -dir work/ -suffix 00 -checkpoint 15 -verbose -lifeline 18305 -version 600'

[05:00:22] 
[05:00:22] *------------------------------*
[05:00:22] Folding@Home Gromacs SMP Core
[05:00:22] Version 1.74 (September 24, 2007)
[05:00:22] 
[05:00:22] Preparing to commence simulation
[05:00:22] - Ensuring status. Please wait.
[05:00:23] - Starting from initial work packet
[05:00:23] 
[05:00:23] Project: 2609 (Run 0, Clone 385, Gen 0)
[05:00:23] 
[05:00:23] Assembly optimizations on if available.
[05:00:23] Entering M.D.
[05:00:40] ial work pa- Sta
[05:00:40] Project: 2609 (Run 0, Clone 38
[05:00:40] Project: 2609 (Run 0, Clone 385, Gen 0)
[05:00:40] 
[05:00:41] Entering M.D.
NNODES=4, MYRANK=0, HOSTNAME=klasseng.local
NNODES=4, MYRANK=1, HOSTNAME=klasseng.local
NNODES=4, MYRANK=2, HOSTNAME=klasseng.local
NNODES=4, MYRANK=3, HOSTNAME=klasseng.local
NODEID=3 argc=15
NODEID=0 argc=15
NODEID=2 argc=15
NODEID=1 argc=15
      Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
             Copyright (c) 2001-2004, The GROMACS development team,
            check out http://www.gromacs.org for more information.

        This inclusion of Gromacs code in the Folding@Home Core is under
        a special license (see http://folding.stanford.edu/gromacs.html)
         specially granted to Stanford by the copyright holders. If you
          are interested in using Gromacs, visit http://www.gromacs.org where
                you can download a free version of Gromacs under
         the terms of the GNU General Public License (GPL) as published
       by the Free Software Foundation; either version 2 of the License,
                     or (at your option) any later version.

[05:00:47] mdrunner cpfilename: 
run input file work/wudata_00.xyz was made for 1 nodes,
             while Core_A1.exe expected it to be for 4 nodes.: ENOLINK (Reserved)
Error on node 0, will try to stop all the nodes
[05:00:48] Finalizing output
[0]0:Return code = 97
[0]1:Return code = 0, signaled with Quit
[0]2:Return code = 0, signaled with Quit
[0]3:Return code = 0, signaled with Quit
[05:00:53] CoreStatus = 61 (97)
[05:00:53] + Client running with incorrect SMP settings for work unit.  Please check settings and restart
I had to kill it with Control-C.
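
For anyone triaging a similar log, here is a minimal Python sketch that pulls the two node counts out of the mismatch message shown in the log above. The function name `find_node_mismatch` and the regular expression are my own illustration, not part of the FAH client or core, and the pattern only assumes the wording seen in this one log.

```python
import re

# Pattern for the mismatch message as it appears in the log above. The message
# spans two lines, so whitespace (including the newline) is matched loosely.
# Assumption: other core versions may word this message differently.
NODE_MISMATCH = re.compile(
    r"was made for (\d+) nodes,\s*while \S+ expected it to be for (\d+) nodes"
)

def find_node_mismatch(log_text):
    """Return (made_for, expected) as ints, or None if no mismatch line is found."""
    match = NODE_MISMATCH.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

# Lines copied from the log in this post.
sample = """run input file work/wudata_00.xyz was made for 1 nodes,
             while Core_A1.exe expected it to be for 4 nodes.: ENOLINK (Reserved)
Error on node 0, will try to stop all the nodes"""

result = find_node_mismatch(sample)
if result is not None:
    made_for, expected = result
    print(f"WU built for {made_for} node(s); core started with {expected}")
```

In this log the work unit was built for 1 node while the core was launched with 4 (`-np 4`), which is what the "incorrect SMP settings" message is reporting.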

I then ran the client with -configonly and reduced the memory setting from 4000 MB to 2000 MB.

I started up with:
fah6 -local -smp -advmethods -verbosity 9
I don't usually use the -advmethods flag.
The WU crashed out with identical results.

I've moved the "work" folder to an archive and will restart FAH6 from a clean state.

I can make the work folder available if someone wants to play with it.

Grant
peace,
Grant