: No such process

Postby hrsetrdr » Mon Apr 02, 2012 8:30 pm

This is not a new problem, and it happens to enough users that it deserves a solution. Take a look at the log entries generated when the client is restarted:

--- Opening Log file [April 2 19:04:58 UTC]


# Linux SMP Console Edition ###################################################
###############################################################################

                       Folding@Home Client Version 6.34

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /home/hrsetrdr/fah
Executable: ./fah6
Arguments: -smp 8 -bigbeta

[19:04:58] - Ask before connecting: No
[19:04:58] - User name: hrsetrdr (Team 32)
[19:04:58] - User ID: 30FE0ECF0BC078B0
[19:04:58] - Machine ID: 1
[19:04:58]
[19:04:58] Loaded queue successfully.
[19:04:58]
[19:04:58] + Processing work unit
[19:04:58] Core required: FahCore_a5.exe
[19:04:58] Core found.
[19:04:58] Working on queue slot 05 [April 2 19:04:58 UTC]
[19:04:58] + Working ...
[19:04:58]
[19:04:58] *------------------------------*
[19:04:58] Folding@Home Gromacs SMP Core
[19:04:58] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[19:04:58]
[19:04:58] Preparing to commence simulation
[19:04:58] - Looking at optimizations...
[19:04:58] - Files status OK
[19:05:00] - Expanded 24865387 -> 30796292 (decompressed 123.8 percent)
[19:05:00] Called DecompressByteArray: compressed_data_size=24865387 data_size=30796292, decompressed_data_size=30796292 diff=0
[19:05:00] - Digital signature verified
[19:05:00]
[19:05:00] Project: 6900 (Run 3, Clone 6, Gen 140)
[19:05:00]
[19:05:00] Assembly optimizations on if available.
[19:05:00] Entering M.D.
[19:05:06] Using Gromacs checkpoints
                         :-)  G  R  O  M  A  C  S  (-:

                   Groningen Machine for Chemical Simulation

                            :-)  VERSION 4.5.3  (-:

        Written by Emile Apol, Rossen Apostolov, Herman J.C. Berendsen,
      Aldert van Buuren, Pär Bjelkmar, Rudi van Drunen, Anton Feenstra,
        Gerrit Groenhof, Peter Kasson, Per Larsson, Pieter Meulenhoff,
           Teemu Murtola, Szilard Pall, Sander Pronk, Roland Schulz,
                Michael Shirts, Alfons Sijbers, Peter Tieleman,

               Berk Hess, David van der Spoel, and Erik Lindahl.

       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
            Copyright (c) 2001-2010, The GROMACS development team at
        Uppsala University & The Royal Institute of Technology, Sweden.
            check out http://www.gromacs.org for more information.


                               :-)  Gromacs  (-:

[19:05:08] Mapping NT from 8 to 8
Reading file work/wudata_05.tpr, VERSION 4.0.99_development_20090605 (single precision)
Note: tpx file_version 70, software version 73
Starting 8 threads

Reading checkpoint file work/wudata_05.cpt generated: Mon Apr  2 11:05:01 2012


Making 1D domain decomposition 8 x 1 x 1
starting mdrun 'SINGLE VESICLE in water'

-------------------------------------------------------
Program Gromacs, VERSION 4.5.3
Source code file: /vspm58/VM/fah-converted/mnt/fah_windows_build/LinuxBuilds/gromacs-4.5.3/src/kernel/md.c, line: 1539

Fatal error:
Checkpoint error on step 0

For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------

Thanx for Using GROMACS - Have a Nice Day
: No such process
[19:05:11] fcSaveRestoreState: I/O failed dir=0, var=00007F10760338E0, varsize=20
[19:05:11] fcCheckPointResume: failure in call to fcSaveRestoreState() to restore cpt hash.
[19:05:11] fcSaveRestoreState: I/O failed dir=0, var=00007F10770358E0, varsize=20
[19:05:11] fcCheckPointResume: failure in call to fcSaveRestoreState() to restore cpt hash.
[19:05:11] fcSaveRestoreState: I/O failed dir=0, var=00007F10748308E0, varsize=20
[19:05:11] fcCheckPointResume: failure in call to fcSaveRestoreState() to restore cpt hash.
35250000 steps, 141000.0 ps (continuing from step 35034960, 140139.8 ps).
[19:05:12] fcSaveRestoreState: I/O failed dir=0, var=00007F106FFFB8E0, varsize=20
[19:05:12] fcCheckPointResume: failure in call to fcSaveRestoreState() to restore cpt hash.
[19:05:12] fcSaveRestoreState: I/O failed dir=0, var=00007F10750318E0, varsize=20
[19:05:12] fcCheckPointResume: failure in call to fcSaveRestoreState() to restore cpt hash.
[19:05:12] mdrun returned 3
[19:05:12] Gromacs detected an invalid checkpoint.  Restarting...fcSaveRestoreState: I/O failed dir=0, var=00007F10758328E0, varsize=20
[19:05:12] fcCheckPointResume: failure in call to fcSaveRestoreState() to restore cpt hash.
[19:05:12] Can't open checkpoint file
[19:05:12] Resuming from checkpoint
[19:05:12] Can't open checkpoint file
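
The failure signature in the log above is fairly distinctive, so a restart script could check for it before letting the client carry on. What follows is only a minimal sketch: the four patterns are copied from the messages in this log, while the function name `find_checkpoint_failures` and the idea of scanning the log text are illustrative assumptions, not part of the client.

```python
import re

# Failure messages copied from the log excerpt above.
# Treating these four as "the" signature is an assumption of this sketch.
CHECKPOINT_FAILURE_PATTERNS = [
    re.compile(r"Checkpoint error on step \d+"),
    re.compile(r"fcSaveRestoreState: I/O failed"),
    re.compile(r"Gromacs detected an invalid checkpoint"),
    re.compile(r"Can't open checkpoint file"),
]


def find_checkpoint_failures(log_text):
    """Return (line_number, line) pairs for lines matching a known failure message."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in CHECKPOINT_FAILURE_PATTERNS):
            hits.append((number, line.strip()))
    return hits
```

Passing the text of the client log to `find_checkpoint_failures` returns an empty list for a clean restart and one entry per matching line otherwise. It only detects the problem; it does nothing to explain or repair it.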


I know that some users work around this by keeping a complete backup copy of their folding directory, but that is ONLY a workaround and NOT a solution.
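
For anyone who wants to use the backup workaround in the meantime, it could look roughly like the sketch below. The function name `backup_folding_dir` and the timestamped folder naming are illustrative assumptions. It is presumably safest to run it only while the client is stopped, and the backup location should sit outside the folding directory itself.

```python
import shutil
import time
from pathlib import Path


def backup_folding_dir(fah_dir, backup_root):
    """Copy the whole client directory into a new timestamped folder under backup_root."""
    source = Path(fah_dir).resolve()
    stamp = time.strftime("%Y%m%d-%H%M%S")
    destination = Path(backup_root) / f"{source.name}-{stamp}"
    # copytree raises an error if the destination already exists,
    # so an earlier backup is never silently overwritten.
    shutil.copytree(source, destination)
    return destination
```

For example, `backup_folding_dir("/home/hrsetrdr/fah", "/home/hrsetrdr/fah-backups")` would copy the launch directory shown in the log; the backup path here is made up for illustration. Restoring means copying a saved folder back by hand, which is why this remains a workaround and not a fix.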

What kind of problem is this? Is it:

1. A file system problem? FWIW, I am using ext3 on a Debian Squeeze system.

2. An improper shutdown problem? FWIW, I always use ctrl-alt-del to shut down the client.

3. Perhaps a flaw in Gromacs, or a flaw in the Linux v6 client? I wouldn't know; I am just a user/donor.


Anyone with a solution, please advise.
Folding rig: Supermicro X9DRD-7LN4F-JBOD | (2) Xeon E5-2670 | 128GB DDR3 ECC Registered

hrsetrdr
 
Posts: 179
Joined: Sun Dec 02, 2007 4:29 pm
Location: In the Fold somewhere in SoCal.

Re: : No such process

Postby dschief » Tue Apr 03, 2012 3:26 pm

I might be wrong, but I was under the impression that Ctrl-C was the proper shutdown command.
I have been running multiple Linux SMP boxes for 3+ years, and Ctrl-C is all I've ever used.
dschief
 
Posts: 246
Joined: Tue Dec 04, 2007 5:56 am

Re: : No such process

Postby hrsetrdr » Tue Apr 03, 2012 8:44 pm

dschief wrote: I might be wrong, but I was under the impression that Ctrl-C was the proper shutdown command.
I have been running multiple Linux SMP boxes for 3+ years, and Ctrl-C is all I've ever used.


Hehe, I spent much of yesterday working on a family member's Windows machine, hence the ctrl-alt-del mentality. You are quite right that Ctrl-C is the proper shutdown, and it is precisely what I meant. :oops:


There are many, many threads on this forum about this very issue, but no posts that properly define the real cause, at least none that I'm aware of. For now, I am abandoning the Linux V6 client and running the Windows client under Wine, which has never shown the fcSaveRestoreState I/O failure for me. It may also be time to check out the Linux V7.x.xx client and test it for stability as well.

