_ikki_ wrote: It crashed 3 times at the same point (59%)
[10:51:13] Completed 2900000 out of 5000000 steps (58 percent)
[11:00:31] Writing local files
[11:00:31] Completed 2950000 out of 5000000 steps (59 percent)
[11:08:25] Warning: long 1-4 interactions
[11:08:29] CoreStatus = 1 (1)
[11:08:29] Client-core communications error: ERROR 0x1
[11:08:29] Deleting current work unit & continuing...
[11:12:51] - Warning: Could not delete all work unit files (7): Core returned invalid code
[11:12:51] Trying to send all finished work units
[11:12:51] + No unsent completed units remaining.
[11:12:51] - Preparing to get new work unit...
[11:12:51] + Attempting to get work packet
[11:12:51] - Will indicate memory of 2014 MB
[17:53:44] Working on Unit 07 [December 23 17:53:44]
[17:53:44] + Working ...
[17:53:44] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a1.exe -dir work/ -suffix 07 -checkpoint 15 -forceasm -verbose -lifeline 20217 -version 600'
[17:53:44]
[17:53:44] *------------------------------*
[17:53:44] Folding@Home Gromacs SMP Core
[17:53:44] Version 1.74 (November 27, 2006)
[17:53:44]
[17:53:44] Preparing to commence simulation
[17:53:44] - Ensuring status. Please wait.
[17:54:01] - Assembly optimizations manually forced on.
[17:54:01] - Not checking prior termination.
[17:54:01] - Expanded 607662 -> 3257309 (decompressed 536.0 percent)
[17:54:01]
[17:54:01] Project: 3062 (Run 3, Clone 26, Gen 1)
[17:54:01]
[17:54:01] Assembly optimizations on if available.
[17:54:01] Entering M.D.
[17:54:07] Calling FAH init
[17:54:07] mbda5_99sb
[17:54:07] Writing local files
[17:54:07] Completed 2900000 out of 5000000 steps (58 percent)
[17:54:07] Extra SSE boost OK.
[17:54:07]
[17:54:07] Completed 2900000 out of 5000000 steps (58 percent)
[17:54:07] Extra SSE boost OK.
[18:02:07] Writing local files
[18:02:08] Completed 2950000 out of 5000000 steps (59 percent)
[18:10:16] Writing local files
[18:10:16] Completed 3000000 out of 5000000 steps (60 percent)
[18:18:21] Writing local files
[18:18:21] Completed 3050000 out of 5000000 steps (61 percent)
[18:26:20] Writing local files
[18:26:20] Completed 3100000 out of 5000000 steps (62 percent)
Someone else has successfully finished this WU.
_ikki_ wrote: Next time I'll try restarting the client if the deadline hasn't passed, but that means inspecting the log file regularly...
Let me ask a few questions to conclude:
- Should we notify Stanford if a WU crashed but we managed to complete it after restarting the client?
- What should we do if the WU crashed and the deadline has passed? Should we report it?
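The regular log inspection mentioned above could be automated with a small script. This is only a sketch: it assumes the log keeps the `[HH:MM:SS] message` layout shown in the excerpts, and the failure markers are copied from the crash excerpt, so other client or core versions may print different text. The function name `scan_log` and the command-line usage are made up for illustration.

```python
import re
import sys

# Failure markers copied from the crash excerpt in this thread.
# Assumption: other client/core versions may word these differently.
FAILURE_MARKERS = (
    "Client-core communications error",
    "Deleting current work unit",
)

# Matches "[11:08:29] some message" (the message may be empty).
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s?(.*)$")
# Matches "Completed 2950000 out of 5000000 steps (59 percent)".
PROGRESS_RE = re.compile(r"Completed (\d+) out of (\d+) steps\s+\((\d+) percent\)")


def scan_log(lines):
    """Return (last_percent, failures) for an iterable of log lines.

    failures is a list of (timestamp, percent_reached, message) tuples.
    """
    last_percent = None
    failures = []
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # not a timestamped log line
        timestamp, message = match.groups()
        progress = PROGRESS_RE.search(message)
        if progress:
            last_percent = int(progress.group(3))
        elif any(marker in message for marker in FAILURE_MARKERS):
            failures.append((timestamp, last_percent, message))
    return last_percent, failures


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python scan_log.py <path to the client log>
    with open(sys.argv[1], errors="replace") as handle:
        percent, problems = scan_log(handle)
    print("last reported progress:", percent, "percent")
    for timestamp, reached, message in problems:
        print(f"[{timestamp}] at {reached} percent: {message}")
```

Run from a scheduled task, something like this would flag a crashed WU without reading the whole log by hand. It only reports what the client already logged; it does not restart the client or touch the work files.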
For the purposes of learning more about the protein itself, finishing the project is important. For the purposes of debugging, finding a repeatable error is important.
ChelseaOilman wrote: After changing the date on my computer, because the deadline had passed, I was able to run the WU past frame 59 as well. If someone else hadn't already submitted it for credit, I would keep going to the end.
bruce wrote:
@ _ikki_
If you disable local deadlines and restart the WU that you published, does it fail for you (indicating a possible hardware issue) or does it continue like it did for ChelseaOilman (indicating that restarting changed something)?
_ikki_ wrote: How can I disable local deadlines without modifying the date?