Project: 2677 (Run 9, Clone 26, Gen 25)

Moderators: Site Moderators, FAHC Science Team

BrokenWolf
Posts: 126
Joined: Sat Aug 02, 2008 3:08 am

Post by BrokenWolf »

Errored out right after starting, at 0%.

Water molecule starting @ atom 46051 can not be settled.
Water molecule starting @ atom 129988 can not be settled.
Water molecule starting @ atom 63232 can not be settled.

INTERRUPTED

BW
DocJonz
Posts: 243
Joined: Thu Dec 06, 2007 6:31 pm
Hardware configuration: Folding with: 4x RTX 4070Ti, 1x RTX 3070
Location: United Kingdom

Re: Project: 2677 (Run 9, Clone 26, Gen 25)

Post by DocJonz »

Had exactly the same problem (verbosity was not on, though).

Code:

[06:14:05] *------------------------------*
[06:14:05] Folding@Home Gromacs SMP Core
[06:14:05] Version 2.10 (Sun Aug 30 03:43:28 CEST 2009)
[06:14:05] 
[06:14:05] Preparing to commence simulation
[06:14:05] - Ensuring status. Please wait.
[06:14:06] Called DecompressByteArray: compressed_data_size=4843591 data_size=24023789, decompressed_data_size=24023789 diff=0
[06:14:06] - Digital signature verified
[06:14:06] 
[06:14:06] Project: 2677 (Run 9, Clone 26, Gen 25)
[06:14:06] 
[06:14:06] Assembly optimizations on if available.
[06:14:06] Entering M.D.
[06:14:16]  on if available.
[06:14:16] Entering M.D.
[06:14:27]  (0%)
[06:14:28] 
[06:14:28] Folding@home Core Shutdown: INTERRUPTED
[06:14:32] CoreStatus = 66 (102)
[06:14:32] + Shutdown requested by user. Exiting.
Folding@Home Client Shutdown.
skinnykid63
Posts: 8
Joined: Mon Jun 23, 2008 2:11 pm

Re: Project: 2677 (Run 9, Clone 26, Gen 25)

Post by skinnykid63 »

Ditto on the problem here, same WU: p2677 R9/C26/G25.

Code:

# Linux SMP Console Edition ###################################################
###############################################################################

                       Folding@Home Client Version 6.24beta

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /home/tyler/folding
Executable: ./fah6
Arguments: -smp -advmethods -verbosity 9 

[14:51:14] - Ask before connecting: No
[14:51:14] - User name: skinnykid63 (Team 11108)
[14:51:14] - User ID: 4E84BFBC013312A2
[14:51:14] - Machine ID: 1
[14:51:14] 
[14:51:14] Work directory not found. Creating...
[14:51:14] Could not open work queue, generating new queue...
[14:51:14] - Autosending finished units... [September 30 14:51:14 UTC]
[14:51:14] Trying to send all finished work units
[14:51:14] + No unsent completed units remaining.
[14:51:14] - Autosend completed
[14:51:14] - Preparing to get new work unit...
[14:51:14] + Attempting to get work packet
[14:51:14] - Will indicate memory of 988 MB
[14:51:14] - Connecting to assignment server
[14:51:14] Connecting to http://assign.stanford.edu:8080/
[14:51:14] Posted data.
[14:51:14] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[14:51:14] + News From Folding@Home: Welcome to Folding@Home
[14:51:14] Loaded queue successfully.
[14:51:14] Connecting to http://171.64.65.56:8080/
[14:51:21] Posted data.
[14:51:21] Initial: 0000; - Receiving payload (expected size: 4844103)
[14:51:27] - Downloaded at ~788 kB/s
[14:51:27] - Averaged speed for that direction ~788 kB/s
[14:51:27] + Received work.
[14:51:27] + Closed connections
[14:51:27] 
[14:51:27] + Processing work unit
[14:51:27] At least 4 processors must be requested.Core required: FahCore_a2.exe
[14:51:27] Core found.
[14:51:27] Working on queue slot 01 [September 30 14:51:27 UTC]
[14:51:27] + Working ...
[14:51:27] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -suffix 01 -checkpoint 7 -verbose -lifeline 3749 -version 624'

[14:51:27] 
[14:51:27] *------------------------------*
[14:51:27] Folding@Home Gromacs SMP Core
[14:51:27] Version 2.10 (Sun Aug 30 03:43:28 CEST 2009)
[14:51:27] 
[14:51:27] Preparing to commence simulation
[14:51:27] - Ensuring status. Please wait.
[14:51:28] Called DecompressByteArray: compressed_data_size=4843591 data_size=24023789, decompressed_data_size=24023789 diff=0
[14:51:28] - Digital signature verified
[14:51:28] 
[14:51:28] Project: 2677 (Run 9, Clone 26, Gen 25)
[14:51:28] 
[14:51:28] Assembly optimizations on if available.
[14:51:28] Entering M.D.
[14:51:38] (Run 9, Clone 26, Gen 25)
[14:51:38] 
[14:51:38] Entering M.D.
NNODES=4, MYRANK=0, HOSTNAME=tyler-desktop
NNODES=4, MYRANK=2, HOSTNAME=tyler-desktop
NODEID=0 argc=20
NNODES=4, MYRANK=1, HOSTNAME=tyler-desktop
NODEID=1 argc=20
NNODES=4, MYRANK=3, HOSTNAME=tyler-desktop
NODEID=2 argc=20
Reading file work/wudata_01.tpr, VERSION 3.3.99_development_20070618 (single precision)
NODEID=3 argc=20
Note: tpx file_version 48, software version 68

NOTE: The tpr file used for this simulation is in an old format, for less memory usage and possibly more performance create a new tpr file with an up to date version of grompp

Making 1D domain decomposition 1 x 1 x 4
starting mdrun 'IBX in water'
6500000 steps,  13000.0 ps (continuing from step 6250000,  12500.0 ps).

t = 12500.001 ps: Water molecule starting at atom 46051 can not be settled.
Check for bad contacts and/or reduce the timestep.

Step 6250001, time 12500 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.023458, max 1.592038 (between atoms 1119 and 1121)
bonds that rotated more than 90 degrees:
 atom 1 atom 2  angle  previous, current, constraint length

t = 12500.003 ps: Water molecule starting at atom 129988 can not be settled.
Check for bad contacts and/or reduce the timestep.

t = 12500.003 ps: Water molecule starting at atom 63232 can not be settled.
Check for bad contacts and/or reduce the timestep.
[14:51:53] lding@home Core Shutdown: INTERRUPTED
application called MPI_Abort(MPI_COMM_WORLD, 102) - process 0
[0]0:Return code = 102
[0]1:Return code = 0, signaled with Segmentation fault
[0]2:Return code = 0, signaled with Segmentation fault
[0]3:Return code = 0, signaled with Quit
[14:51:57] CoreStatus = 66 (102)
[14:51:57] + Shutdown requested by user. Exiting.***** Got a SIGTERM signal (15)
[14:51:57] Killing all core threads

Folding@Home Client Shutdown.
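
For anyone who wants to check their own log for the same failure, here is a minimal Python sketch. It only looks for the two markers visible in the log above (the "can not be settled" lines and the CoreStatus line). The function name `summarize_failure` and the returned dictionary layout are made up for illustration and are not part of the client.

```python
import re

# Patterns taken from the log lines quoted in this thread.
# "at" and "@" are both accepted because both spellings appear in the posts.
SETTLE_RE = re.compile(
    r"Water molecule starting (?:at|@) atom (\d+) can not be settled"
)
CORE_STATUS_RE = re.compile(r"CoreStatus = ([0-9A-Fa-f]+) \((\d+)\)")


def summarize_failure(log_text):
    """Collect the unsettled-water atom numbers and the last CoreStatus line.

    Hypothetical helper: the dictionary keys are an assumption chosen for
    this example, not something the client defines.
    """
    atoms = [int(n) for n in SETTLE_RE.findall(log_text)]
    statuses = CORE_STATUS_RE.findall(log_text)
    status, code = statuses[-1] if statuses else (None, None)
    return {
        "unsettled_atoms": atoms,
        "core_status": status,
        "exit_code": int(code) if code is not None else None,
    }
```

Feeding it the text of FAHlog.txt (or the console output, as pasted above) gives a quick way to compare reports from different machines.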


Gary480six
Posts: 91
Joined: Mon Jan 21, 2008 6:42 pm

Re: Project: 2677 (Run 9, Clone 26, Gen 25)

Post by Gary480six »

Come on...

I'm getting REALLY tired of dumping my work folder and queue.dat file trying to make this GO AWAY. Please FIX it or PULL it; I don't care which.

Gary
susato
Site Moderator
Posts: 511
Joined: Fri Nov 30, 2007 4:57 am
Location: Team MacOSX

Re: Project: 2677 (Run 9, Clone 26, Gen 25)

Post by susato »

Gary, if you can, find the queue position of the bad WU. It will be xx, where xx is 00 through 09. Stop the client, then run it with the -delete xx flag. The client will purge the bad WU and exit. When you start it up again, it should download a different work unit.

I've PM'd the researcher about this work unit. Five reports is enough to know it's faulty.
Thanks, everyone, for reporting the problem.
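
If it helps, the queue position can be read from the client log, which prints a line like "Working on queue slot 01". Below is a rough Python sketch that pulls out the most recent slot and builds the matching -delete command. The helper names `last_queue_slot` and `delete_command` are invented for this example, and `./fah6` is just the executable name from the Linux log posted earlier; adjust it for your own install.

```python
import re

# Matches the "Working on queue slot NN" line the client writes to its log.
SLOT_RE = re.compile(r"Working on queue slot (\d{2})")


def last_queue_slot(log_text):
    """Return the most recently logged queue slot as a two-digit string."""
    slots = SLOT_RE.findall(log_text)
    return slots[-1] if slots else None


def delete_command(slot, executable="./fah6"):
    """Build the argument list for purging one queue slot.

    The executable path is an assumption (taken from the log in this
    thread). Only slots 00 through 09 are accepted, as described above.
    """
    if not re.fullmatch(r"0\d", slot):
        raise ValueError("queue slot must be 00 through 09, got %r" % slot)
    return [executable, "-delete", slot]
```

Stop the client before running the resulting command, and run it from the client's own directory so it finds the right queue.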
tonic
Posts: 42
Joined: Sat Aug 02, 2008 4:05 am
Location: Seattle, WA

Re: Project: 2677 (Run 9, Clone 26, Gen 25)

Post by tonic »

Same WU, same issue at 0%.
tonic
Posts: 42
Joined: Sat Aug 02, 2008 4:05 am
Location: Seattle, WA

Re: Project: 2677 (Run 9, Clone 26, Gen 25)

Post by tonic »

How do I get rid of this unit? I deleted queue.dat and the work folder, but the client keeps re-downloading the exact same broken unit...
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project: 2677 (Run 9, Clone 26, Gen 25)

Post by bruce »

The server will assign the same WU several times before it assigns a new one.

You have two choices: repeatedly delete the WU until you get a new one, or change your MachineID to an unused value.
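
To tell whether a fresh download is the same bad unit again, you can compare the Project/Run/Clone/Gen line in the log against the known-bad values. A small Python sketch, with hypothetical helper names (`latest_prcg`, `got_same_unit`) that are not part of the client:

```python
import re

# Matches lines such as "Project: 2677 (Run 9, Clone 26, Gen 25)".
PRCG_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")

# The work unit reported as faulty in this thread.
BAD_UNIT = (2677, 9, 26, 25)


def latest_prcg(log_text):
    """Return (project, run, clone, gen) from the last matching log line."""
    matches = PRCG_RE.findall(log_text)
    if not matches:
        return None
    return tuple(int(value) for value in matches[-1])


def got_same_unit(log_text, bad_unit=BAD_UNIT):
    """True if the most recent unit in the log is the known-bad one."""
    return latest_prcg(log_text) == bad_unit
```

If `got_same_unit` keeps returning True after several deletes, that is the point to try the MachineID change instead.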