Run: exception thrown in GuardedRun -- Gromacs cannot contin

Postby poyaochuang » Wed Apr 22, 2009 6:22 am

Errors with 2 work units that did not complete?

Code:
[19:31:55] Completed 99%
[19:35:38] Timer requesting checkpoint
[19:36:50] Completed 100%
[19:36:51] Successful run
[19:36:51] DynamicWrapper: Finished Work Unit: sleep=10000
[19:37:01] Reserved 78576 bytes for xtc file; Cosm status=0
[19:37:01] Allocated 78576 bytes for xtc file
[19:37:01] - Reading up to 78576 from "work/wudata_01.xtc": Read 78576
[19:37:01] Read 78576 bytes from xtc file; available packet space=786351888
[19:37:01] xtc file hash check passed.
[19:37:01] Reserved 23472 23472 786351888 bytes for arc file=<work/wudata_01.trr> Cosm status=0
[19:37:01] Allocated 23472 bytes for arc file
[19:37:01] - Reading up to 23472 from "work/wudata_01.trr": Read 23472
[19:37:01] Read 23472 bytes from arc file; available packet space=786328416
[19:37:01] trr file hash check passed.
[19:37:01] Allocated 560 bytes for edr file
[19:37:01] Read bedfile
[19:37:01] edr file hash check passed.
[19:37:01] Allocated 10711 bytes for logfile
[19:37:01] Read logfile
[19:37:01] GuardedRun: success in DynamicWrapper
[19:37:01] GuardedRun: done
[19:37:01] Run: GuardedRun completed.
[19:37:03] - Writing 113831 bytes of core data to disk...
[19:37:04] Done: 113319 -> 107238 (compressed to 94.6 percent)
[19:37:04]   ... Done.
[19:37:04] - Shutting down core
[19:37:04]
[19:37:04] Folding@home Core Shutdown: FINISHED_UNIT
[19:37:07] CoreStatus = 64 (100)
[19:37:07] Sending work to server
[19:37:07] Project: 5758 (Run 10, Clone 237, Gen 15)


[19:37:07] + Attempting to send results [April 21 19:37:07 UTC]
[19:37:09] + Results successfully sent
[19:37:09] Thank you for your contribution to Folding@Home.
[19:37:09] + Starting local stats count at 1
[19:37:13] - Preparing to get new work unit...
[19:37:13] + Attempting to get work packet
[19:37:13] - Connecting to assignment server
[19:37:14] - Successful: assigned to (171.64.65.106).
[19:37:14] + News From Folding@Home: GPU folding beta
[19:37:14] Loaded queue successfully.
[19:37:14] + Could not connect to Work Server
[19:37:14] - Attempt #1  to get work failed, and no other work to do.
Waiting before retry.
[19:37:22] + Attempting to get work packet
[19:37:22] - Connecting to assignment server
[19:37:22] - Successful: assigned to (171.64.65.106).
[19:37:22] + News From Folding@Home: GPU folding beta
[19:37:22] Loaded queue successfully.
[19:37:22] + Could not connect to Work Server
[19:37:22] - Attempt #2  to get work failed, and no other work to do.
Waiting before retry.
[19:37:33] + Attempting to get work packet
[19:37:33] - Connecting to assignment server
[19:37:33] - Successful: assigned to (171.64.65.106).
[19:37:33] + News From Folding@Home: GPU folding beta
[19:37:34] Loaded queue successfully.
[19:37:34] + Could not connect to Work Server
[19:37:34] - Attempt #3  to get work failed, and no other work to do.
Waiting before retry.
[19:38:01] + Attempting to get work packet
[19:38:01] - Connecting to assignment server
[19:38:01] - Successful: assigned to (171.64.65.106).
[19:38:01] + News From Folding@Home: GPU folding beta
[19:38:01] Loaded queue successfully.
[19:38:17] + Closed connections
[19:38:17]
[19:38:17] + Processing work unit
[19:38:17] Core required: FahCore_11.exe
[19:38:17] Core found.
[19:38:17] Working on queue slot 02 [April 21 19:38:17 UTC]
[19:38:17] + Working ...
[19:38:17]
[19:38:17] *------------------------------*
[19:38:17] Folding@Home GPU Core - Beta
[19:38:17] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[19:38:17]
[19:38:17] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[19:38:17] Build host: amoeba
[19:38:17] Board Type: Nvidia
[19:38:17] Core      :
[19:38:17] Preparing to commence simulation
[19:38:17] - Looking at optimizations...
[19:38:17] - Created dyn
[19:38:17] - Files status OK
[19:38:17] - Expanded 66313 -> 348500 (decompressed 525.5 percent)
[19:38:17] Called DecompressByteArray: compressed_data_size=66313 data_size=348500, decompressed_data_size=348500 diff=0
[19:38:17] - Digital signature verified
[19:38:17]
[19:38:17] Project: 5778 (Run 2, Clone 429, Gen 33)
[19:38:17]
[19:38:17] Assembly optimizations on if available.
[19:38:17] Entering M.D.
[19:38:25] Working on Protein
[19:38:29] Client config found, loading data.
[19:38:29] Starting GUI Server
[19:48:14] Completed 1%
[19:58:01] Completed 2%
[20:07:48] Completed 3%
[20:17:34] Completed 4%
[20:27:20] Completed 5%
[20:37:07] Completed 6%
[20:46:53] Completed 7%
[20:56:40] Completed 8%
[21:06:26] Completed 9%
[21:16:13] Completed 10%
[21:25:59] Completed 11%
[21:34:33] Opening http://fah-web.stanford.edu/cgi-bin/main.py?qtype=userpage&username=poyaochuang...
[21:35:46] Completed 12%
[21:45:40] Completed 13%
[21:55:35] Completed 14%
[22:05:21] Completed 15%
[22:15:07] Completed 16%
[22:24:54] Completed 17%
[22:34:40] Completed 18%
[22:44:26] Completed 19%
[22:54:13] Completed 20%
[22:58:14] + Working...
[23:03:59] Completed 21%
[23:09:35] + Paused
[23:11:23] + Working ...
[23:11:23] Suspending work thread...
[23:11:23] Resuming work thread...
[23:15:37] Completed 22%
[23:25:31] Completed 23%
[23:26:24] Timer requesting checkpoint
[23:35:25] Completed 24%
[23:41:24] Timer requesting checkpoint
[23:45:13] Completed 25%
[23:54:59] Completed 26%
[23:56:24] Timer requesting checkpoint
[00:04:46] Completed 27%
[00:11:24] Timer requesting checkpoint
[00:14:32] Completed 28%
[00:24:19] Completed 29%
[00:26:24] Timer requesting checkpoint
[00:34:05] Completed 30%
[00:41:24] Timer requesting checkpoint
[00:43:51] Completed 31%
[00:46:03] Run: exception thrown during GuardedRun
[00:46:03] Run: exception thrown in GuardedRun -- Gromacs cannot continue further.
[00:46:03] Going to send back what have done -- stepsTotalG=20000000
[00:46:03] Work fraction=0.3122 steps=20000000.
[00:46:07] logfile size=0 infoLength=0 edr=0 trr=23
[00:46:07] - Writing 642 bytes of core data to disk...
[00:46:07] Done: 130 -> 127 (compressed to 97.6 percent)
[00:46:07]   ... Done.
[00:46:08]
[00:46:08] Folding@home Core Shutdown: UNSTABLE_MACHINE
[00:46:12] CoreStatus = 7A (122)
[00:46:12] Sending work to server
[00:46:12] Project: 5778 (Run 2, Clone 429, Gen 33)


[00:46:12] + Attempting to send results [April 22 00:46:12 UTC]
[00:46:12] + Results successfully sent
[00:46:12] Thank you for your contribution to Folding@Home.
[00:46:16] - Preparing to get new work unit...
[00:46:16] + Attempting to get work packet
[00:46:16] - Connecting to assignment server
[00:46:16] - Successful: assigned to (171.64.65.106).
[00:46:16] + News From Folding@Home: GPU folding beta
[00:46:17] Loaded queue successfully.
[00:46:17] + Could not connect to Work Server
[00:46:17] - Attempt #1  to get work failed, and no other work to do.
Waiting before retry.
[00:46:33] + Attempting to get work packet
[00:46:33] - Connecting to assignment server
[00:46:33] - Successful: assigned to (171.64.65.106).
[00:46:33] + News From Folding@Home: GPU folding beta
[00:46:33] Loaded queue successfully.
[00:46:34] + Closed connections
[00:46:39]
[00:46:39] + Processing work unit
[00:46:39] Core required: FahCore_11.exe
[00:46:39] Core found.
[00:46:39] Working on queue slot 03 [April 22 00:46:39 UTC]
[00:46:39] + Working ...
[00:46:39]
[00:46:39] *------------------------------*
[00:46:39] Folding@Home GPU Core - Beta
[00:46:39] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[00:46:39]
[00:46:39] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[00:46:39] Build host: amoeba
[00:46:39] Board Type: Nvidia
[00:46:39] Core      :
[00:46:39] Preparing to commence simulation
[00:46:39] - Looking at optimizations...
[00:46:39] - Created dyn
[00:46:39] - Files status OK
[00:46:39] - Expanded 70168 -> 360060 (decompressed 513.1 percent)
[00:46:39] Called DecompressByteArray: compressed_data_size=70168 data_size=360060, decompressed_data_size=360060 diff=0
[00:46:39] - Digital signature verified
[00:46:39]
[00:46:39] Project: 5758 (Run 4, Clone 218, Gen 25)
[00:46:39]
[00:46:39] Assembly optimizations on if available.
[00:46:39] Entering M.D.
[00:46:46] Working on Protein
[00:46:50] Client config found, loading data.
[00:46:50] Starting GUI Server
[00:51:45] Completed 1%
[00:56:40] Completed 2%
[01:01:35] Completed 3%
[01:06:30] Completed 4%
[01:11:25] Completed 5%
[01:16:20] Completed 6%
[01:21:15] Completed 7%
[01:26:10] Completed 8%
[01:31:05] Completed 9%
[01:36:00] Completed 10%
[01:40:55] Completed 11%
[01:45:50] Completed 12%
[01:50:45] Completed 13%
[01:55:40] Completed 14%
[02:00:35] Completed 15%
[02:05:30] Completed 16%
[02:10:25] Completed 17%
[02:15:20] Completed 18%
[02:20:15] Completed 19%
[02:25:10] Completed 20%
[02:30:05] Completed 21%
[02:35:00] Completed 22%
[02:39:55] Completed 23%
[02:44:50] Completed 24%
[02:49:45] Completed 25%
[02:54:40] Completed 26%
[02:59:36] Completed 27%
[03:04:31] Completed 28%
[03:09:26] Completed 29%
[03:14:21] Completed 30%
[03:19:16] Completed 31%
[03:24:11] Completed 32%
[03:29:06] Completed 33%
[03:34:01] Completed 34%
[03:38:56] Completed 35%
[03:43:51] Completed 36%
[03:44:08] + Paused

Folding@Home Client Shutdown.


--- Opening Log file [April 22 04:52:13 UTC]


# Windows GPU Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.23

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\Users\poyaochuang\AppData\Roaming\Folding@home-gpu


[04:52:13] - Ask before connecting: No
[04:52:13] - User name: poyaochuang (Team 3213)
[04:52:13] - User ID: 4C9B6D75335928D0
[04:52:13] - Machine ID: 2
[04:52:13]
[04:52:13] Loaded queue successfully.
[04:52:13] Initialization complete
[04:52:13]
[04:52:13] + Processing work unit
[04:52:13] Core required: FahCore_11.exe
[04:52:13] Core found.
[04:52:13] Working on queue slot 03 [April 22 04:52:13 UTC]
[04:52:13] + Working ...
[04:52:13]
[04:52:13] *------------------------------*
[04:52:13] Folding@Home GPU Core - Beta
[04:52:13] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[04:52:13]
[04:52:13] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[04:52:13] Build host: amoeba
[04:52:13] Board Type: Nvidia
[04:52:13] Core      :
[04:52:13] Preparing to commence simulation
[04:52:13] - Looking at optimizations...
[04:52:13] - Files status OK
[04:52:13] - Expanded 70168 -> 360060 (decompressed 513.1 percent)
[04:52:13] Called DecompressByteArray: compressed_data_size=70168 data_size=360060, decompressed_data_size=360060 diff=0
[04:52:13] - Digital signature verified
[04:52:13]
[04:52:13] Project: 5758 (Run 4, Clone 218, Gen 25)
[04:52:13]
[04:52:14] Assembly optimizations on if available.
[04:52:14] Entering M.D.
[04:52:19] Will resume from checkpoint file
[04:52:20] Working on Protein
[04:52:23] Client config found, loading data.
[04:52:23] Starting GUI Server
[04:52:23] Resuming from checkpoint
[04:52:23] Verified work/wudata_03.log
[04:52:23] Verified work/wudata_03.edr
[04:52:23] Verified work/wudata_03.xtc
[04:52:23] Completed 36%
[04:57:20] Completed 37%
[05:02:18] Completed 38%
[05:07:17] Completed 39%
[05:12:15] Completed 40%
[05:17:13] Completed 41%
[05:22:12] Completed 42%
[05:27:10] Completed 43%
[05:32:06] Completed 44%
[05:35:54] Run: exception thrown during GuardedRun
[05:35:54] Run: exception thrown in GuardedRun -- Gromacs cannot continue further.
[05:35:54] Going to send back what have done -- stepsTotalG=10000000
[05:35:54] Work fraction=0.4477 steps=10000000.
[05:35:58] logfile size=16611 infoLength=16611 edr=0 trr=23
[05:35:58] - Writing 17147 bytes of core data to disk...
[05:35:58] Done: 16635 -> 4707 (compressed to 28.2 percent)
[05:35:58]   ... Done.
[05:35:58]
[05:35:58] Folding@home Core Shutdown: UNSTABLE_MACHINE
[05:36:02] CoreStatus = 7A (122)
[05:36:02] Sending work to server
[05:36:02] Project: 5758 (Run 4, Clone 218, Gen 25)


[05:36:02] + Attempting to send results [April 22 05:36:02 UTC]
[05:36:03] - Couldn't send HTTP request to server
[05:36:03] + Could not connect to Work Server (results)
[05:36:03]     (171.64.65.106:8080)
[05:36:03] + Retrying using alternative port
[05:36:04] - Couldn't send HTTP request to server
[05:36:04] + Could not connect to Work Server (results)
[05:36:04]     (171.64.65.106:80)
[05:36:04] - Error: Could not transmit unit 03 (completed April 22) to work server.
[05:36:04]   Keeping unit 03 in queue.
[05:36:04] Project: 5758 (Run 4, Clone 218, Gen 25)


[05:36:04] + Attempting to send results [April 22 05:36:04 UTC]
[05:36:05] - Couldn't send HTTP request to server
[05:36:05] + Could not connect to Work Server (results)
[05:36:05]     (171.64.65.106:8080)
[05:36:05] + Retrying using alternative port
[05:36:06] - Couldn't send HTTP request to server
[05:36:06] + Could not connect to Work Server (results)
[05:36:06]     (171.64.65.106:80)
[05:36:06] - Error: Could not transmit unit 03 (completed April 22) to work server.


[05:36:06] + Attempting to send results [April 22 05:36:06 UTC]
[05:36:06] - Couldn't send HTTP request to server
[05:36:06]   (Got status 503)
[05:36:06] + Could not connect to Work Server (results)
[05:36:06]     (171.67.108.25:8080)
[05:36:06] + Retrying using alternative port
[05:36:06] - Couldn't send HTTP request to server
[05:36:06]   (Got status 503)
[05:36:06] + Could not connect to Work Server (results)
[05:36:06]     (171.67.108.25:80)
[05:36:06]   Could not transmit unit 03 to Collection server; keeping in queue.
[05:36:06] - Preparing to get new work unit...
[05:36:06] + Attempting to get work packet
[05:36:06] - Connecting to assignment server
[05:36:06] - Successful: assigned to (171.67.108.11).
[05:36:06] + News From Folding@Home: GPU folding beta
[05:36:06] Loaded queue successfully.
[05:36:07] Project: 5758 (Run 4, Clone 218, Gen 25)


[05:36:07] + Attempting to send results [April 22 05:36:07 UTC]
[05:36:09] - Couldn't send HTTP request to server
[05:36:09] + Could not connect to Work Server (results)
[05:36:09]     (171.64.65.106:8080)
[05:36:09] + Retrying using alternative port
[05:36:10] - Couldn't send HTTP request to server
[05:36:10] + Could not connect to Work Server (results)
[05:36:10]     (171.64.65.106:80)
[05:36:10] - Error: Could not transmit unit 03 (completed April 22) to work server.


[05:36:10] + Attempting to send results [April 22 05:36:10 UTC]
[05:36:10] - Couldn't send HTTP request to server
[05:36:10]   (Got status 503)
[05:36:10] + Could not connect to Work Server (results)
[05:36:10]     (171.67.108.25:8080)
[05:36:10] + Retrying using alternative port
[05:36:10] - Couldn't send HTTP request to server
[05:36:10]   (Got status 503)
[05:36:10] + Could not connect to Work Server (results)
[05:36:10]     (171.67.108.25:80)
[05:36:10]   Could not transmit unit 03 to Collection server; keeping in queue.
[05:36:10] + Closed connections
[05:36:15]
[05:36:15] + Processing work unit
[05:36:15] Core required: FahCore_11.exe
[05:36:15] Core found.
[05:36:15] Working on queue slot 04 [April 22 05:36:15 UTC]
[05:36:15] + Working ...
[05:36:15]
[05:36:15] *------------------------------*
[05:36:15] Folding@Home GPU Core - Beta
[05:36:15] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[05:36:15]
[05:36:15] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[05:36:15] Build host: amoeba
[05:36:15] Board Type: Nvidia
[05:36:15] Core      :
[05:36:15] Preparing to commence simulation
[05:36:15] - Looking at optimizations...
[05:36:15] - Created dyn
[05:36:15] - Files status OK
[05:36:15] - Expanded 45384 -> 251112 (decompressed 553.3 percent)
[05:36:15] Called DecompressByteArray: compressed_data_size=45384 data_size=251112, decompressed_data_size=251112 diff=0
[05:36:15] - Digital signature verified
[05:36:15]
[05:36:15] Project: 5772 (Run 0, Clone 296, Gen 276)
[05:36:15]
[05:36:15] Assembly optimizations on if available.
[05:36:15] Entering M.D.
[05:36:22] Working on Protein
[05:36:23] Client config found, loading data.
[05:36:23] Starting GUI Server
[05:39:54] Completed 1%
[05:43:25] Completed 2%
[05:47:00] Completed 3%
[05:50:32] Completed 4%
[05:54:03] Completed 5%
[05:57:35] Completed 6%
[06:01:06] Completed 7%
[06:04:37] Completed 8%
[06:08:08] Completed 9%
[06:11:40] Completed 10%
[06:13:30] + Paused
poyaochuang
 
Posts: 37
Joined: Sun Apr 12, 2009 3:26 am

Re: Run: exception thrown in GuardedRun -- Gromacs cannot contin

Postby bruce » Wed Apr 22, 2009 6:44 am

Hi poyaochuang (team 3213),
Your WU (P5758 R10 C237 G15) was added to the stats database on 2009-04-21 14:13:41 for 384 points of credit.

Hi poyaochuang (team 3213),
Your WU (P5778 R2 C429 G33) was added to the stats database on 2009-04-21 18:12:12 for 239.79 points of credit and nobody else has returned this one yet.

Neither Project: 5758 (Run 4, Clone 218, Gen 25) nor Project: 5772 (Run 0, Clone 296, Gen 276) has been returned yet by you or anybody else.

Project 5758 is on 171.64.65.106, which has had a very high CPU load, although it has been declining. It was down from 21:15 to 23:15 Stanford time, which covers the times shown in your log: [03:15] to [06:15] UTC. (You might as well ignore the CS 171.67.108.25 because it has been completely overloaded, which is why you got the 503 error.)
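To line the outage window up against the UTC timestamps in the client log, the conversion can be sketched as below. This assumes Stanford was on Pacific Daylight Time (UTC-7) and that the outage fell on April 21, 2009 local time; both are assumptions, as is the helper name `stanford_to_utc`. Under those assumptions the window works out to roughly 04:15 to 06:15 UTC, which includes the failed upload at 05:36:02 in the log.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Stanford local time was PDT (UTC-7) on the date in question.
PDT = timezone(timedelta(hours=-7), "PDT")


def stanford_to_utc(year, month, day, hour, minute):
    """Convert a Stanford wall-clock time (assumed PDT) to UTC."""
    local = datetime(year, month, day, hour, minute, tzinfo=PDT)
    return local.astimezone(timezone.utc)


# Outage window from the post; the calendar date is an assumption.
down_from = stanford_to_utc(2009, 4, 21, 21, 15)
down_until = stanford_to_utc(2009, 4, 21, 23, 15)

# "[05:36:02] + Attempting to send results [April 22 05:36:02 UTC]" from the log.
failed_upload = datetime(2009, 4, 22, 5, 36, 2, tzinfo=timezone.utc)

print("Down from ", down_from.strftime("%Y-%m-%d %H:%M UTC"))
print("Down until", down_until.strftime("%Y-%m-%d %H:%M UTC"))
print("Failed upload inside window:", down_from <= failed_upload <= down_until)
```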

Project 5772 is on server 171.67.108.11, which seems to be operating normally, but by the time you finish it, the P5758 WU will probably upload, too.

I can't explain the EUEs, but the uploading process is working as it was designed when there are problems with the servers.
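For anyone who wants to count how often the early unit ends happen, the shutdown reasons can be pulled out of the client log with a short script. This is only a sketch: it assumes the line formats seen in the logs posted in this thread, the function name `summarize_shutdowns` is made up, and it is not part of the client.

```python
import re

# Patterns based on the log lines posted in this thread.
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
SHUTDOWN_RE = re.compile(r"Folding@home Core Shutdown: (\w+)")


def summarize_shutdowns(lines):
    """Pair each core shutdown with the most recent Project line before it.

    Returns a list of ((project, run, clone, gen), reason) tuples. The PRCG
    tuple is None if the log excerpt starts after the Project line.
    """
    current = None
    results = []
    for line in lines:
        match = PROJECT_RE.search(line)
        if match:
            current = tuple(int(group) for group in match.groups())
            continue
        match = SHUTDOWN_RE.search(line)
        if match:
            results.append((current, match.group(1)))
    return results


# A few lines copied from the log in the first post.
SAMPLE = [
    "[19:38:17] Project: 5778 (Run 2, Clone 429, Gen 33)",
    "[00:46:08] Folding@home Core Shutdown: UNSTABLE_MACHINE",
    "[00:46:12] Project: 5778 (Run 2, Clone 429, Gen 33)",
    "[00:46:39] Project: 5758 (Run 4, Clone 218, Gen 25)",
    "[05:35:58] Folding@home Core Shutdown: UNSTABLE_MACHINE",
]

for prcg, reason in summarize_shutdowns(SAMPLE):
    print(prcg, reason)
```

Anything other than FINISHED_UNIT in the second column is a WU that ended early.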
bruce
Site Admin
 
Posts: 16882
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

error

Postby poyaochuang » Thu Apr 23, 2009 12:07 am

Code:
[21:48:45] Project: 5760 (Run 4, Clone 158, Gen 18)
[21:48:45]
[21:48:45] Assembly optimizations on if available.
[21:48:46] Entering M.D.
[21:48:52] Working on Protein
[21:48:56] Client config found, loading data.
[21:48:57] Starting GUI Server
[21:53:50] Completed 1%
[21:58:45] Completed 2%
[22:03:39] Completed 3%
[22:08:34] Completed 4%
[22:13:28] Completed 5%
[22:18:23] Completed 6%
[22:23:17] Completed 7%
[22:28:12] Completed 8%
[22:33:06] Completed 9%
[22:36:34] Run: exception thrown during GuardedRun
[22:36:34] Run: exception thrown in GuardedRun -- Gromacs cannot continue further.
[22:36:34] Going to send back what have done -- stepsTotalG=10000000
[22:36:34] Work fraction=0.0970 steps=10000000.
[22:36:38] logfile size=0 infoLength=0 edr=0 trr=23
[22:36:38] - Writing 642 bytes of core data to disk...
[22:36:38] Done: 130 -> 127 (compressed to 97.6 percent)
[22:36:38]   ... Done.
[22:36:39]
[22:36:39] Folding@home Core Shutdown: UNSTABLE_MACHINE
[22:36:42] CoreStatus = 7A (122)
[22:36:42] Sending work to server

error, window said "video driver stopped working, but now is

Postby poyaochuang » Thu Apr 23, 2009 2:15 pm

Windows showed an error message in the notification area:
"video driver stopped working, now it's working" or something like that


Code:
[11:49:38] + Processing work unit
[11:49:38] Core required: FahCore_11.exe
[11:49:38] Core found.
[11:49:38] Working on queue slot 09 [April 23 11:49:38 UTC]
[11:49:38] + Working ...
[11:49:38]
[11:49:38] *------------------------------*
[11:49:38] Folding@Home GPU Core - Beta
[11:49:38] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[11:49:38]
[11:49:38] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[11:49:38] Build host: amoeba
[11:49:38] Board Type: Nvidia
[11:49:38] Core      :
[11:49:38] Preparing to commence simulation
[11:49:38] - Looking at optimizations...
[11:49:38] - Created dyn
[11:49:38] - Files status OK
[11:49:38] - Expanded 68533 -> 357580 (decompressed 521.7 percent)
[11:49:38] Called DecompressByteArray: compressed_data_size=68533 data_size=357580, decompressed_data_size=357580 diff=0
[11:49:38] - Digital signature verified
[11:49:38]
[11:49:38] Project: 5760 (Run 3, Clone 422, Gen 18)
[11:49:38]
[11:49:38] Assembly optimizations on if available.
[11:49:38] Entering M.D.
[11:49:45] Working on Protein
[11:49:48] Client config found, loading data.
[11:49:49] Starting GUI Server
[11:54:43] Completed 1%
[11:59:38] Completed 2%
[12:04:33] Completed 3%
[12:09:27] Completed 4%
[12:14:22] Completed 5%
[12:19:17] Completed 6%
[12:24:12] Completed 7%
[12:29:07] Completed 8%
[12:34:01] Completed 9%
[12:38:57] Completed 10%
[12:43:51] Completed 11%
[12:48:46] Completed 12%
[12:53:41] Completed 13%
[12:58:36] Completed 14%
[13:03:30] Completed 15%
[13:08:26] Completed 16%
[13:13:20] Completed 17%
[13:18:15] Completed 18%
[13:23:10] Completed 19%
[13:28:05] Completed 20%
[13:33:00] Completed 21%
[13:37:55] Completed 22%
[13:38:11] Run: exception thrown during GuardedRun
[13:38:11] Run: exception thrown in GuardedRun -- Gromacs cannot continue further.
[13:38:11] Going to send back what have done -- stepsTotalG=10000000
[13:38:11] Work fraction=0.2205 steps=10000000.
[13:38:15] logfile size=0 infoLength=0 edr=0 trr=23
[13:38:15] - Writing 642 bytes of core data to disk...
[13:38:15] Done: 130 -> 127 (compressed to 97.6 percent)
[13:38:15]   ... Done.
[13:38:16]
[13:38:16] Folding@home Core Shutdown: UNSTABLE_MACHINE
[13:38:19] CoreStatus = 7A (122)
[13:38:19] Sending work to server
[13:38:19] Project: 5760 (Run 3, Clone 422, Gen 18)


[13:38:19] + Attempting to send results [April 23 13:38:19 UTC]
[13:38:19] + Results successfully sent
[13:38:19] Thank you for your contribution to Folding@Home.
[13:38:23] - Preparing to get new work unit...
[13:38:23] + Attempting to get work packet
[13:38:23] - Connecting to assignment server
[13:38:23] - Successful: assigned to (171.64.65.106).
[13:38:23] + News From Folding@Home: GPU folding beta
[13:38:23] Loaded queue successfully.
[13:38:24] + Could not connect to Work Server
[13:38:24] - Attempt #1  to get work failed, and no other work to do.
Waiting before retry.
[13:38:41] + Attempting to get work packet
[13:38:41] - Connecting to assignment server
[13:38:41] - Successful: assigned to (171.64.65.106).
[13:38:41] + News From Folding@Home: GPU folding beta
[13:38:41] Loaded queue successfully.
[13:38:42] + Closed connections

Re: error, window said "video driver stopped working, but now is

Postby bruce » Thu Apr 23, 2009 7:19 pm

poyaochuang wrote: Windows showed an error message in the notification area:
"video driver stopped working, now it's working" or something like that


This is a complete guess, but you might search the ATI discussions for topics on VPU Recover issues. It may be the same sort of issue. Both the drivers and the OS check that the GPU has responded within a certain time frame. ATI and NV handle this quite differently, so I don't expect you'll find a fix, but it may broaden your understanding of the problem.

I'm not really knowledgeable in GPU programming, but I think this might be part of the issue: if FahCore_11 gives the GPU a block of work that takes too long, the GPU might stay busy "too long" and appear to have stopped responding when it hasn't. On the other hand, passing a lot of really short blocks of work is less efficient. This means there might be some sort of adjustment in the drivers or FahCore that has to trade off efficiency against stability. Once again, this is mostly speculation on my part.
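The trade-off described above can be put into rough numbers. This is only a sketch under stated assumptions: the 2-second watchdog limit matches the default Windows display-driver timeout on Vista and later, but the per-step cost and per-launch overhead are made-up figures, `plan_chunks` is a hypothetical helper, and nothing here reflects how FahCore_11 actually schedules its work.

```python
def plan_chunks(total_steps, seconds_per_step, launch_overhead,
                watchdog_limit, safety=0.5):
    """Split total_steps into blocks that each finish inside the watchdog limit.

    Returns (steps_per_block, number_of_blocks, overhead_fraction), where
    overhead_fraction is the share of total time spent on launch overhead.
    """
    # Time available for real work in one block, leaving a safety margin.
    budget = watchdog_limit * safety - launch_overhead
    if budget <= 0 or seconds_per_step <= 0:
        raise ValueError("no room for work inside the watchdog limit")
    steps_per_block = max(1, int(budget // seconds_per_step))
    blocks = -(-total_steps // steps_per_block)  # ceiling division
    compute_time = total_steps * seconds_per_step
    overhead_time = blocks * launch_overhead
    return steps_per_block, blocks, overhead_time / (compute_time + overhead_time)


# Illustrative numbers only: 10 ms per step, 50 ms overhead per launch.
relaxed = plan_chunks(10000, 0.01, 0.05, watchdog_limit=2.0)
strict = plan_chunks(10000, 0.01, 0.05, watchdog_limit=0.5)
print("2.0 s limit:", relaxed)
print("0.5 s limit:", strict)
```

With a tighter limit the blocks get shorter, there are more of them, and a larger share of the time goes to overhead, which is the efficiency cost of staying well clear of the timeout.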

help help folding@home

Postby poyaochuang » Mon May 04, 2009 10:56 am

It's happening on every other WU or more.
I checked the video card with an NVIDIA memory tester,
and it's working perfectly.
Can someone help?

Code:
[04:59:08] + Starting local stats count at 1
[04:59:12] - Preparing to get new work unit...
[04:59:12] + Attempting to get work packet
[04:59:12] - Connecting to assignment server
[04:59:13] - Successful: assigned to (171.64.122.70).
[04:59:13] + News From Folding@Home: GPU folding beta
[04:59:13] Loaded queue successfully.
[04:59:13] + Closed connections
[04:59:13]
[04:59:13] + Processing work unit
[04:59:13] Core required: FahCore_14.exe
[04:59:13] Core found.
[04:59:13] Working on queue slot 02 [May 4 04:59:13 UTC]
[04:59:13] + Working ...
[04:59:13]
[04:59:13] *------------------------------*
[04:59:13] Folding@Home GPU Core - Beta
[04:59:13] Version 1.25 (Mon Mar 2 19:49:32 PST 2009)
[04:59:13]
[04:59:13] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[04:59:13] Build host: vspm46
[04:59:13] Board Type: Nvidia
[04:59:13] Core      :
[04:59:13] Preparing to commence simulation
[04:59:13] - Looking at optimizations...
[04:59:13] - Created dyn
[04:59:13] - Files status OK
[04:59:13] - Expanded 70256 -> 360060 (decompressed 512.4 percent)
[04:59:13] Called DecompressByteArray: compressed_data_size=70256 data_size=360060, decompressed_data_size=360060 diff=0
[04:59:13] - Digital signature verified
[04:59:13]
[04:59:13] Project: 5900 (Run 6, Clone 542, Gen 22)
[04:59:13]
[04:59:13] Assembly optimizations on if available.
[04:59:13] Entering M.D.
[04:59:20] Tpr hash work/wudata_02.tpr:  3513464470 3047861161 3032797338 2212940657 501227358
[04:59:21] Working on Protein
[04:59:23] Client config found, loading data.
[04:59:23] Starting GUI Server
[05:04:23] Completed 1%
[05:10:52] Completed 2%
[05:17:19] Completed 3%
[05:24:13] Completed 4%
[05:30:35] Completed 5%
[05:36:48] Completed 6%
[05:43:08] Completed 7%
[05:49:29] Completed 8%
[05:55:17] Completed 9%
[06:01:58] Completed 10%
[06:08:19] Completed 11%
[06:14:42] Completed 12%
[06:19:55] Opening http://fah-web.stanford.edu/cgi-bin/main.py?qtype=userpage&username=poyaochuang...
[06:20:59] Completed 13%
[06:26:52] Completed 14%
[06:33:53] Completed 15%
[06:41:02] Completed 16%
[06:47:34] Completed 17%
[06:53:21] Completed 18%
[06:59:38] Completed 19%
[07:06:47] Completed 20%
[07:12:43] Completed 21%
[07:18:45] Completed 22%
[07:25:11] Completed 23%
[07:31:25] Completed 24%
[07:38:08] Completed 25%
[07:44:22] Completed 26%
[07:48:16] Opening http://foldingforum.org/...
[07:51:50] Completed 27%
[07:58:51] Completed 28%
[08:04:54] Completed 29%
[08:11:10] Completed 30%
[08:17:17] Completed 31%
[08:23:53] Completed 32%
[08:29:40] Completed 33%
[08:35:43] Completed 34%
[08:41:49] Completed 35%
[08:47:56] Completed 36%
[08:54:09] Completed 37%
[09:00:10] Completed 38%
[09:05:32] Completed 39%
[09:11:55] Completed 40%
[09:17:42] Completed 41%
[09:22:14] SEH code: 3221225477
[09:22:14] Run: exception thrown during GuardedRun
[09:22:14] Run: exception thrown in GuardedRun -- Gromacs cannot continue further.
[09:22:14] Going to send back what have done -- stepsTotalG=2000000
[09:22:14] Work fraction=0.4176 steps=2000000.
[09:22:18] logfile size=0 infoLength=0 edr=0 trr=23
[09:22:18] - Writing 641 bytes of core data to disk...
[09:22:18] Done: 129 -> 127 (compressed to 98.4 percent)
[09:22:18]   ... Done.

[Solved]exception thrown in GuardedRun -- Gromacs cannot con

Postby extrasalty » Mon May 11, 2009 2:10 am

I'm getting pretty much ALL 5900s to terminate early with the same SEH code, all because of user interaction: scrolling, videos, windows. I'm on Win7 too. At the same time, there are almost no reports of this problem, so it seems to be isolated to Win7. I personally don't care as long as the computer doesn't BSOD (which it did under XP).
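The SEH code in the log above (3221225477) is a Windows NTSTATUS value printed in decimal. Converting it to hex gives 0xC0000005, which is STATUS_ACCESS_VIOLATION. A minimal sketch of the conversion follows; the lookup table holds only a handful of common codes, the function name `decode_seh` is made up, and the result says nothing about what caused the violation.

```python
# A few well-known NTSTATUS exception codes; this table is far from complete.
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
    0xC0000094: "STATUS_INTEGER_DIVIDE_BY_ZERO",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
}


def decode_seh(code):
    """Return (hex string, symbolic name) for a decimal SEH code from the log."""
    code &= 0xFFFFFFFF  # treat the value as an unsigned 32-bit number
    return f"0x{code:08X}", KNOWN_NTSTATUS.get(code, "unknown")


print(decode_seh(3221225477))
```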

edit: Changing the GPU client priority to "below normal" fixed it for me. I used WinAFC. :D
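WinAFC is one way to do this. A different way to get a similar effect, sketched here with Python's standard library rather than WinAFC, is to start the client with a below-normal priority class. The helper names and the executable path in the comment are placeholders, and this only affects CPU scheduling priority of the process it launches.

```python
import subprocess
import sys


def below_normal_popen_kwargs():
    """Return Popen keyword arguments that lower the child's priority."""
    if sys.platform == "win32":
        # Windows-only constant, available in the subprocess module since Python 3.7.
        return {"creationflags": subprocess.BELOW_NORMAL_PRIORITY_CLASS}
    # Other platforms have no priority classes; leave the defaults alone.
    return {}


def launch_client(command):
    """Start `command` as a child process at below-normal priority on Windows."""
    return subprocess.Popen(command, **below_normal_popen_kwargs())


# Hypothetical usage; the path is a placeholder, not the real install location:
# launch_client([r"C:\path\to\gpu-client.exe"])
```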
MSI Eclipse i7@3.45GHz
2x GTX260@702/1053/1512 Win 7+VMware+linuxfah VM+bigadv

extrasalty
 
Posts: 170
Joined: Sat Jun 21, 2008 8:39 pm
Location: Las Vegas, NV

Re: Run: exception thrown in GuardedRun -- Gromacs cannot contin

Postby tonic » Mon Jun 08, 2009 3:35 pm

Same issue here. I have a GTX260 and all of the 511-point WUs throw GuardedRun errors. My video drivers crash and then come back. I've even tried underclocking, but this didn't help. I'm on the 185.85 drivers in Windows 7 (64-bit).

All other WUs seem to process fine; it's just the 511s (of which there are a lot these days). It doesn't happen at a specific point either: some will make it only through ~5% and some will crash at 80%+. A couple have even finished.

Any suggestions? I'll try setting the priority lower, but I'm guessing PPD will take a big hit given that I'm running an SMP as well.
tonic
 
Posts: 136
Joined: Sat Aug 02, 2008 4:05 am
Location: Seattle, WA

Re: Run: exception thrown in GuardedRun -- Gromacs cannot contin

Postby bruce » Mon Jun 08, 2009 4:04 pm

I'm not sure we can help you. They're still developing the drivers. When you run pre-release software there are likely things that don't work perfectly, and that's even more likely true for Win-64 than Win-32.

Underclocking seems like a reasonable thing to try, but in this case it doesn't seem to help. Do check that your GPU has plenty of airflow (an adjusted fan profile, a GPU fan that vents the hot air outside rather than into the case, plenty of case airflow, etc.). These WUs are larger proteins than we've had before (1392 atoms), and they push the hardware to higher temperatures than those with lower atom counts.

Re: Run: exception thrown in GuardedRun -- Gromacs cannot contin

Postby tonic » Mon Jun 08, 2009 8:10 pm

I wasn't aware that these were "pre-release drivers". There is no mention of that on the Nvidia site. I don't think this has anything to do with heat: I was running Vista before with the same fan speed and didn't have this issue with the 511-point WUs.

Re: Run: exception thrown in GuardedRun -- Gromacs cannot contin

Postby bruce » Mon Jun 08, 2009 8:25 pm

I didn't say they were pre-release drivers. I simply said that they are still revising them.

Windows 7 is pre-release, and as long as everything works for you, good . . . but if it doesn't there's not a lot that can be done. It's never clear whether the latest version of drivers will be good enough to call them "released" on the day that Microsoft calls Win7 "released" but we can hope.

