Project: 6904 Run 1, Clone 25, Gen. 48

Adak
Posts: 92
Joined: Fri Dec 07, 2007 10:00 pm

Project: 6904 Run 1, Clone 25, Gen. 48

Post by Adak »

I'm running this work unit on a 4-processor AMD SuperMicro board and keep getting the error below. I folded Run 1, Clone 24, Gen. 47 with no problems.

[05:41:50] + Closed connections
[05:41:50] 
[05:41:50] + Processing work unit
[05:41:50] Core required: FahCore_a5.exe
[05:41:50] Core found.
[05:41:50] Working on queue slot 04 [February 1 05:41:50 UTC]
[05:41:50] + Working ...
[05:41:50] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 04 -np 64 -checkpoint 30 -verbose -lifeline 2324 -version 634'

[05:41:50] 
[05:41:50] *------------------------------*
[05:41:50] Folding@Home Gromacs SMP Core
[05:41:50] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[05:41:50] 
[05:41:50] Preparing to commence simulation
[05:41:50] - Looking at optimizations...
[05:41:50] - Created dyn
[05:41:50] - Files status OK
[05:41:57] - Expanded 57219017 -> 71843392 (decompressed 50.4 percent)
[05:41:57] Called DecompressByteArray: compressed_data_size=57219017 data_size=71843392, decompressed_data_size=71843392 diff=0
[05:41:58] - Digital signature verified
[05:41:58] 
[05:41:58] Project: 6904 (Run 1, Clone 25, Gen 48)
[05:41:58] 
[05:41:58] Assembly optimizations on if available.
[05:41:58] Entering M.D.
[05:42:07] Mapping NT from 64 to 64 
[05:42:15] Completed 0 out of 250000 steps  (0%)
[06:04:37] Completed 2500 out of 250000 steps  (1%)
[06:15:33]  NT from 64 to 64 
[06:15:59] fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:15:59] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:15:59] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:15:59] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:15:59] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:15:59] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:15:59] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:15:59] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:15:59] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:15:59] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:15:59] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:15:59] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:15:59] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.mdrun returned 3
[06:16:00] Gromacs detected an fcSaveRestoreState: I/O failed d..ir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=00000000028CA7F0, varsize=21120
[06:16:00] Can't restore state.Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Resuming Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:00] Can't open checkpoint file 
[06:16:10] 
[06:16:10] Folding@home Core Shutdown: UNKNOWN_ERROR
[06:16:10] CoreStatus = 62 (98)
[06:16:10] + Restarting core (settings changed) 
[06:16:10] 
[06:16:10] + Processing work unit
[06:16:10] Core required: FahCore_a5.exe
[06:16:10] Core found.
[06:16:10] Working on queue slot 04 [February 1 06:16:10 UTC]
[06:16:10] + Working ...
[06:16:10] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 04 -np 64 -checkpoint 30 -notermcheck -verbose -lifeline 2324 -version 634'

[06:16:10] 
[06:16:10] *------------------------------*
[06:16:10] Folding@Home Gromacs SMP Core
[06:16:10] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[06:16:10] 
[06:16:10] Preparing to commence simulation
[06:16:10] - Looking at optimizations...
[06:16:10] - Not checking prior termination.
[06:16:18] - Expanded 57219017 -> 71843392 (decompressed 50.4 percent)
[06:16:18] Called DecompressByteArray: compressed_data_size=57219017 data_size=71843392, decompressed_data_size=71843392 diff=0
[06:16:19] - Digital signature verified
[06:16:19] 
[06:16:19] Project: 6904 (Run 1, Clone 25, Gen 48)
[06:16:19] 
[06:16:19] Assembly optimizations on if available.
[06:16:19] Entering M.D.
[06:16:27] Mapping NT from 64 to 64 
[06:16:36] Completed 0 out of 250000 steps  (0%)
[06:39:00] Completed 2500 out of 250000 steps  (1%)
[06:50:15] /O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.Resuming from checkpoint
[06:50:15] fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.mdrun returned 3
[06:50:15] Gromacs detected an invafcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000141B5B0, varsize=21120
[06:50:15] Can't restore state.Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:15] Can't open checkpoint file 
[06:50:26] 
[06:50:26] Folding@home Core Shutdown: UNKNOWN_ERROR
[06:50:26] CoreStatus = 62 (98)
[06:50:26] + Restarting core (settings changed) 
[06:50:26] 
[06:50:26] + Processing work unit
[06:50:26] Core required: FahCore_a5.exe
[06:50:26] Core found.
[06:50:26] Working on queue slot 04 [February 1 06:50:26 UTC]
[06:50:26] + Working ...
[06:50:26] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 04 -np 64 -checkpoint 30 -notermcheck -verbose -lifeline 2324 -version 634'

[06:50:26] 
[06:50:26] *------------------------------*
[06:50:26] Folding@Home Gromacs SMP Core
[06:50:26] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[06:50:26] 
[06:50:26] Preparing to commence simulation
[06:50:26] - Looking at optimizations...
[06:50:26] - Not checking prior termination.
[06:50:34] - Expanded 57219017 -> 71843392 (decompressed 50.4 percent)
[06:50:34] Called DecompressByteArray: compressed_data_size=57219017 data_size=71843392, decompressed_data_size=71843392 diff=0
[06:50:35] - Digital signature verified
[06:50:35] 
[06:50:35] Project: 6904 (Run 1, Clone 25, Gen 48)
[06:50:35] 
[06:50:35] Assembly optimizations on if available.
[06:50:35] Entering M.D.
[06:50:43] Mapping NT from 64 to 64 
[06:50:52] Completed 0 out of 250000 steps  (0%)
[07:13:13] Completed 2500 out of 250000 steps  (1%)
[07:24:30] /O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.mdrun returned 3
[07:24:30] Gromacs detected an invalid checkpoint.  RestartifcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000105A490, varsize=21120
[07:24:30] Can't restore state.Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] ResumiCan't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:30] Can't open checkpoint file 
[07:24:41] 
[07:24:41] Folding@home Core Shutdown: UNKNOWN_ERROR
[07:24:42] CoreStatus = 62 (98)
[07:24:42] + Restarting core (settings changed) 
[07:24:42] 
[07:24:42] + Processing work unit
[07:24:42] Core required: FahCore_a5.exe
[07:24:42] Core found.
[07:24:42] Working on queue slot 04 [February 1 07:24:42 UTC]
[07:24:42] + Working ...
[07:24:42] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 04 -np 64 -checkpoint 30 -notermcheck -verbose -lifeline 2324 -version 634'

[07:24:42] 
[07:24:42] *------------------------------*
[07:24:42] Folding@Home Gromacs SMP Core
[07:24:42] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[07:24:42] 
[07:24:42] Preparing to commence simulation
[07:24:42] - Looking at optimizations...
[07:24:42] - Not checking prior termination.
[07:24:49] - Expanded 57219017 -> 71843392 (decompressed 50.4 percent)
[07:24:49] Called DecompressByteArray: compressed_data_size=57219017 data_size=71843392, decompressed_data_size=71843392 diff=0
[07:24:50] - Digital signature verified
[07:24:50] 
[07:24:50] Project: 6904 (Run 1, Clone 25, Gen 48)
[07:24:50] 
[07:24:50] Assembly optimizations on if available.
[07:24:50] Entering M.D.
[07:24:59] Mapping NT from 64 to 64 
[07:25:07] Completed 0 out of 250000 steps  (0%)
[07:47:33] Completed 2500 out of 250000 steps  (1%)
[07:58:45] /O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.mdrun returned 3
[07:58:45] Gromacs detected an invfcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000163D7F0, varsize=21120
[07:58:45] Can't restore state.Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] ResumiCan't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:45] Can't open checkpoint file 
[07:58:56] 
[07:58:56] Folding@home Core Shutdown: UNKNOWN_ERROR
[07:58:57] CoreStatus = 62 (98)
[07:58:57] + Restarting core (settings changed) 
[07:58:57] 
[07:58:57] + Processing work unit
[07:58:57] Core required: FahCore_a5.exe
[07:58:57] Core found.
[07:58:57] Working on queue slot 04 [February 1 07:58:57 UTC]
[07:58:57] + Working ...
[07:58:57] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 04 -np 64 -checkpoint 30 -notermcheck -verbose -lifeline 2324 -version 634'

[07:58:57] 
[07:58:57] *------------------------------*
[07:58:57] Folding@Home Gromacs SMP Core
[07:58:57] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[07:58:57] 
[07:58:57] Preparing to commence simulation
[07:58:57] - Looking at optimizations...
[07:58:57] - Not checking prior termination.
[07:59:05] - Expanded 57219017 -> 71843392 (decompressed 50.4 percent)
[07:59:05] Called DecompressByteArray: compressed_data_size=57219017 data_size=71843392, decompressed_data_size=71843392 diff=0
[07:59:06] - Digital signature verified
[07:59:06] 
[07:59:06] Project: 6904 (Run 1, Clone 25, Gen 48)
[07:59:06] 
[07:59:06] Assembly optimizations on if available.
[07:59:06] Entering M.D.
[07:59:14] Mapping NT from 64 to 64 
[07:59:22] Completed 0 out of 250000 steps  (0%)
[08:21:45] Completed 2500 out of 250000 steps  (1%)
[08:33:30] /O failed dir=0, var=000000000111D7F0, varsize=21120
[08:33:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000111D7F0, varsize=21120
[08:33:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000111D7F0, varsize=21120
[08:33:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000111D7F0, varsize=21120
[08:33:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000111D7F0, varsize=21120
[08:33:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000111D7F0, varsize=21120
[08:33:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000111D7F0, varsize=21120
[08:33:30] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000111D7F0, varsize=21120
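The log above shows the core being launched, failing, and being relaunched several times. One way to check whether the failure lands at the same point in every run is to measure the time between each launch line and the following shutdown line. This is only a rough sketch: the function name `run_durations` is made up for illustration, and the two message fragments it searches for are copied from the log text above rather than from any documented client format.

```python
import re
from datetime import datetime

# Matches the "[HH:MM:SS]" stamp at the start of each client log line.
STAMP = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]")


def run_durations(lines):
    """Return the seconds between each core launch and the next core shutdown."""
    durations = []
    started = None
    for line in lines:
        match = STAMP.match(line)
        if not match:
            continue
        now = datetime.strptime(match.group(1), "%H:%M:%S")
        if "- Calling './FahCore_a5.exe" in line:
            started = now
        elif "Folding@home Core Shutdown:" in line and started is not None:
            # Log lines carry no date, so a run that crosses midnight
            # would go negative; the modulo wraps it back into one day.
            durations.append(int((now - started).total_seconds()) % 86400)
            started = None
    return durations
```

Fed the lines of the log above, this should report each failed run lasting roughly 34 minutes, which suggests the error recurs at about the same point after every restart rather than at random.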
Grandpa_01
Posts: 1122
Joined: Wed Mar 04, 2009 7:36 am
Hardware configuration: 3 - Supermicro H8QGi-F AMD MC 6174=144 cores 2.5Ghz, 96GB G.Skill DDR3 1333Mhz Ubuntu 10.10
2 - Asus P6X58D-E i7 980X 4.4Ghz 6GB DDR3 2000 A-Data 64GB SSD Ubuntu 10.10
1 - Asus Rampage Gene III i7 970 4.3Ghz DDR3 2000 2-500GB Seagate 7200.11 0-Raid Ubuntu 10.10
1 - Asus G73JH Laptop i7 740QM 1.86Ghz ATI 5870M

Re: Project: 6904 Run 1, Clone 25, Gen. 48

Post by Grandpa_01 »

I'm guessing you shut it down during the run and started getting this error when you started it back up. The computer probably shut down before it had a chance to completely save the file. I have seen this on my 4P, but I always save a copy of the FAH folder to the desktop before shutting down. Fortunately, I was able to restart and complete the WU from the copy I had saved to the desktop.
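For anyone who wants to script that backup step, here is a minimal sketch, assuming the whole client folder can simply be copied as-is. The function name `backup_fah_folder` and the timestamped naming scheme are made up for illustration and are not part of the FAH client.

```python
import shutil
import time
from pathlib import Path


def backup_fah_folder(fah_dir, backup_root):
    """Copy the whole client folder into a timestamped directory and return its path."""
    src = Path(fah_dir)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / f"{src.name}-backup-{stamp}"
    # copytree refuses to overwrite an existing directory, so every
    # backup gets its own timestamped name instead of replacing the last one.
    shutil.copytree(src, dest)
    return dest
```

It could be called as `backup_fah_folder("/home/adak/fah", "/home/adak/Desktop")`, where the first path is the launch directory shown in Adak's log and the second is only a guess at a desktop location. A copy taken while the client is in the middle of writing a checkpoint might capture a partial file, so the timing of the copy probably matters.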
2 - SM H8QGi-F AMD 6xxx=112 cores @ 3.2 & 3.9Ghz
5 - SM X9QRI-f+ Intel 4650 = 320 cores @ 3.15Ghz
2 - I7 980X 4.4Ghz 2-GTX680
1 - 2700k 4.4Ghz GTX680
Total = 464 cores folding
Adak
Posts: 92
Joined: Fri Dec 07, 2007 10:00 pm

Re: Project: 6904 Run 1, Clone 25, Gen. 48

Post by Adak »

There was no shutdown. This box runs headless, and I'm sitting just a few feet away, so there's been no reboot or power loss.

Here's the next work unit, doing the same thing. I'm going to reinstall Ubuntu and FAH. I thought the drive might be acting up (it's not new), but it just passed an HD test. Maybe "the Kraken" is cracking up for some reason. I don't know why the FAH program would try to go back and open the checkpoint at all when it hasn't been turned off.

--- Opening Log file [February 2 03:55:24 UTC] 

# Linux SMP Console Edition ###################################################
###############################################################################

                       Folding@Home Client Version 6.34

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /home/adak/fah
Executable: ./fah6
Arguments: -smp -bigadv -verbosity 9 

[03:55:24] - Ask before connecting: No
[03:55:24] - User name: Adak (Team 32)
[03:55:24] - User ID not found locally
[03:55:24] + Requesting User ID from server
[03:55:24] - Getting ID from AS: 
[03:55:24] Connecting to http://assign.stanford.edu:8080/
[03:55:25] Posted data.
[03:55:25] Initial: 8D4B; - Received User ID = 4B8DB6BC66222BE1
[03:55:25] - Machine ID: 1
[03:55:25] 
[03:55:25] Work directory not found. Creating...
[03:55:25] Could not open work queue, generating new queue...
[03:55:25] - Preparing to get new work unit...
[03:55:25] - Autosending finished units... [February 2 03:55:25 UTC]
[03:55:25] Cleaning up work directory
[03:55:25] Trying to send all finished work units
[03:55:25] + No unsent completed units remaining.
[03:55:25] - Autosend completed
[03:55:25] + Attempting to get work packet
[03:55:25] Passkey found
[03:55:25] - Will indicate memory of 32233 MB
[03:55:25] - Connecting to assignment server
[03:55:25] Connecting to http://assign.stanford.edu:8080/
[03:55:25] Posted data.
[03:55:25] Initial: ED82; - Successful: assigned to (130.237.232.237).
[03:55:25] + News From Folding@Home: Welcome to Folding@Home
[03:55:25] Loaded queue successfully.
[03:55:25] Sent data
[03:55:25] Connecting to http://130.237.232.237:8080/
[03:55:38] Posted data.
[03:55:38] Initial: 0000; - Receiving payload (expected size: 57245855)
[03:59:48] - Downloaded at ~223 kB/s
[03:59:48] - Averaged speed for that direction ~223 kB/s
[03:59:48] + Received work.
[03:59:48] + Closed connections
[03:59:48] 
[03:59:48] + Processing work unit
[03:59:48] Core required: FahCore_a5.exe
[03:59:48] Core found.
[03:59:48] Working on queue slot 01 [February 2 03:59:48 UTC]
[03:59:48] + Working ...
[03:59:48] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 01 -np 64 -checkpoint 30 -verbose -lifeline 9786 -version 634'

[03:59:48] 
[03:59:48] *------------------------------*
[03:59:48] Folding@Home Gromacs SMP Core
[03:59:48] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[03:59:48] 
[03:59:48] Preparing to commence simulation
[03:59:48] - Looking at optimizations...
[03:59:48] - Created dyn
[03:59:48] - Files status OK
[03:59:56] - Expanded 57245343 -> 71846524 (decompressed 50.4 percent)
[03:59:56] Called DecompressByteArray: compressed_data_size=57245343 data_size=71846524, decompressed_data_size=71846524 diff=0
[03:59:56] - Digital signature verified
[03:59:56] 
[03:59:56] Project: 6903 (Run 2, Clone 29, Gen 5)
[03:59:56] 
[03:59:57] Assembly optimizations on if available.
[03:59:57] Entering M.D.
[04:00:05] Mapping NT from 64 to 64 
[04:00:13] Completed 0 out of 250000 steps  (0%)
[04:16:08] Completed 2500 out of 250000 steps  (1%)
[04:32:31]  NT from 64 to 64 
[04:33:01] fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.mdrun returned 3
[04:33:01] Gromacs detectedfcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000270A7F0, varsize=21120
[04:33:01] Can't restore state.Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] ReCan't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:01] Can't open checkpoint file 
[04:33:11] 
[04:33:11] Folding@home Core Shutdown: UNKNOWN_ERROR
[04:33:12] CoreStatus = 62 (98)
[04:33:12] + Restarting core (settings changed) 
[04:33:12] 
[04:33:12] + Processing work unit
[04:33:12] Core required: FahCore_a5.exe
[04:33:12] Core found.
[04:33:12] Working on queue slot 01 [February 2 04:33:12 UTC]
[04:33:12] + Working ...
[04:33:12] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 01 -np 64 -checkpoint 30 -notermcheck -verbose -lifeline 9786 -version 634'

[04:33:12] 
[04:33:12] *------------------------------*
[04:33:12] Folding@Home Gromacs SMP Core
[04:33:12] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[04:33:12] 
[04:33:12] Preparing to commence simulation
[04:33:12] - Looking at optimizations...
[04:33:12] - Not checking prior termination.
[04:33:19] - Expanded 57245343 -> 71846524 (decompressed 50.4 percent)
[04:33:19] Called DecompressByteArray: compressed_data_size=57245343 data_size=71846524, decompressed_data_size=71846524 diff=0
[04:33:20] - Digital signature verified
[04:33:20] 
[04:33:20] Project: 6903 (Run 2, Clone 29, Gen 5)
[04:33:20] 
[04:33:20] Assembly optimizations on if available.
[04:33:20] Entering M.D.
[04:33:29] Mapping NT from 64 to 64 
[04:33:36] Completed 0 out of 250000 steps  (0%)
[04:49:34] Completed 2500 out of 250000 steps  (1%)
[05:07:01] /O failed dir=0, var=000000000219E7F0, varsize=21120
[05:07:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000219E7F0, varsize=21120
[05:07:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000219E7F0, varsize=21120
[05:07:01] Can't restore state.fcSaveRestoreState: I/O failed dir=0, var=000000000219E7F0, varsize=21120
One thing that stands out to me is this:

Code: Select all

[03:55:38] Initial: 0000; - Receiving payload (expected size: 57245855)
[03:59:48] - Downloaded at ~223 kB/s
[03:59:48] - Averaged speed for that direction ~223 kB/s
[03:59:48] + Received work.
[03:59:48] + Closed connections
I'm on DSL service, and I can't begin to receive a 57 MB file in just ten seconds! And isn't 57 MB a bit large for a wu download?
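
For what it's worth, the two timestamps in that excerpt are 03:55:38 and 03:59:48, which is 4 minutes 10 seconds apart rather than ten seconds. A minimal Python sketch of the arithmetic, assuming the client's "kB" means 1024 bytes (the function name `transfer_stats` is only for illustration):

```python
from datetime import datetime


def transfer_stats(start, end, size_bytes):
    """Return (elapsed seconds, rate in KiB/s) for a same-day HH:MM:SS pair."""
    fmt = "%H:%M:%S"
    elapsed = (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()
    return elapsed, size_bytes / elapsed / 1024


# Values taken from the log excerpt above.
elapsed, kib_per_s = transfer_stats("03:55:38", "03:59:48", 57245855)
print(f"{elapsed:.0f} s, {kib_per_s:.1f} KiB/s")
```

That works out to 250 s and roughly 223.6 KiB/s, which agrees with the ~223 kB/s the client reported.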
Leonardo
Posts: 261
Joined: Tue Dec 04, 2007 5:09 am
Hardware configuration: GPU slots on home-built, purpose-built PCs.
Location: Eagle River, Alaska

Re: Project: 6904 Run 1, Clone 25, Gen. 48

Post by Leonardo »

If it's not a work unit problem, my first guess would be physical memory errors.
Adak
Posts: 92
Joined: Fri Dec 07, 2007 10:00 pm

Re: Project: 6904 Run 1, Clone 25, Gen. 48

Post by Adak »

Thanks, but the problem is with "the Kraken". I thought the same thing was likely, and ran some tests, but memory was fine.

This is what the Kraken does, according to Tear:

Code: Select all

Here's expected behaviour in the use case of interest:
1. Client downloads new WU (no checkpoint exists at this time)
2. Client starts the FahCore (no checkpoint exists at this time)
3. FahCore keeps on simulating
4. First checkpoint interval elapses (15 minutes in default config)
5. The Kraken detects completely written checkpoint
6. The Kraken restarts the FahCore
7. FahCore resumes from checkpoint
8. Hopefully, DLB gets engaged  (DLB is Dynamic Load Balancing)
And this is the problem, as Tear described it for Fedora:

Code: Select all

Upon installation a dedicated user gets created (called fahclient) and client
starts automatically (runs as 'fahclient'). It runs off /var/lib/fahclient/
directory. WUs, config files, FahCores, everything is kept there.

If you, however, at any time, stop the client (via 'service FAHClient stop')
and then start FAHControl from your terminal window or otherwise (while being
logged in as yourself) here's what's going to happen:

1) FAHControl will try to connect to FAHClient and will fail (client is not running)
2) FAHControl will start the client (that's what its default configuration is)
3) But note, you're not 'fahclient' anymore!
4) FAHClient doesn't even look at files in /var/lib/fahclient/ (no wonder, it
   can't access them anyway) and
5) creates fresh 'installation' in '.FAHClient'
6) As it's a fresh installation, client will get a fresh WU, fresh FahCore, etc.
7) Bottom-line: you'll end up with two separate and independent installations:
   one in /var/lib/fahclient/ and another one in ~/.FAHClient

To prevent this issue from arising I'd recommend turning 'Autostart' feature
off in 'Preferences' and always running FAHClient as 'fahclient' user (I guess
one could call it a 'service mode').
Most of the above is true for Ubuntu, as well.
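
A small Python sketch of how one might spot this situation before starting anything. It only relies on the user name and the two directories named in Tear's description; the function names are made up for illustration, and it assumes `.FAHClient` sits in the home directory of whoever runs the check.

```python
import getpass
from pathlib import Path

SERVICE_USER = "fahclient"                # dedicated user from the description above
SERVICE_DIR = Path("/var/lib/fahclient")  # where the service-mode client keeps its files


def running_as_service_user(current_user=None):
    """Return True if the current (or given) user is the dedicated service user."""
    return (current_user or getpass.getuser()) == SERVICE_USER


def find_installations(service_dir=SERVICE_DIR, home=None):
    """List which of the two possible installation directories actually exist."""
    home = Path(home) if home is not None else Path.home()
    candidates = [Path(service_dir), home / ".FAHClient"]
    return [path for path in candidates if path.is_dir()]
```

If `running_as_service_user()` is false, or `find_installations()` returns two entries, that would be a hint that a second, independent installation has been (or is about to be) created.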

Of course, none of this was in the summary I read when installing the Kraken. So there I was, gleefully testing various BIOS options while folding, stopping and restarting the folding client as I went, and eventually I got caught running it under my own user name instead of "fahclient".

That's when this whole mess started. I thought I had worked it out, but I hadn't, since I knew none of this about the Kraken.

So the WU is probably fine; I'm almost 100% sure of it.

Thanks for all the input, however. In Linux, I can get off into the weeds rather quickly.
PantherX
Site Moderator
Posts: 7020
Joined: Wed Dec 23, 2009 9:33 am
Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB

Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
Location: Land Of The Long White Cloud
Contact:

Re: Project: 6904 Run 1, Clone 25, Gen. 48

Post by PantherX »

There is nothing in the WU Database, so I have marked it for follow-up.
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time

Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues
sortofageek
Site Admin
Posts: 3111
Joined: Fri Nov 30, 2007 8:06 pm
Location: Team Helix
Contact:

Re: Project: 6904 Run 1, Clone 25, Gen. 48

Post by sortofageek »

My apologies for such a late follow-up but, just for the record, this one was completed successfully by two different donors.
Nathan_P
Posts: 1180
Joined: Wed Apr 01, 2009 9:22 pm
Hardware configuration: Asus Z8NA D6C, 2 x5670@3.2 Ghz, 12gb Ram, GTX 980ti, AX650 PSU, win 10 (daily use)

Asus Z87 WS, Xeon E3-1230L v3, 8gb ram, KFA GTX 1080, EVGA 750ti , AX760 PSU, Mint 18.2 OS

Not currently folding
Asus Z9PE- D8 WS, 2 E5-2665@2.3 Ghz, 16Gb 1.35v Ram, Ubuntu (Fold only)
Asus Z9PA, 2 Ivy 12 core, 16gb Ram, H folding appliance (fold only)
Location: Jersey, Channel islands

Re: Project: 6904 Run 1, Clone 25, Gen. 48

Post by Nathan_P »

Adak wrote:No shut down was made. This box runs headless, and I'm sitting just a few feet away, so there's no reboot or power loss.

I'm on DSL service, and I can't begin to receive a 57 MB file in just ten seconds! And isn't 57 MB a bit large for a wu download?
Just as a further follow-up, as I missed it originally: 57 MB is the size of the 6904 WU, and 8101 runs about 35 MB, but I can't remember the others. Needless to say, these are at the top end of the size range, which goes hand in hand with -bigadv status.