Run: exception thrown during GuardedRun

Postby wickedwahine » Thu Jun 18, 2009 5:39 am

Could someone tell me what this error means? Does it apply to the work unit or my GPU? Am I sending partial results? Is that OK?

Code:
[01:55:04] Folding@Home GPU Core - Beta
[01:55:04] Version 1.25 (Mon Mar 2 19:49:32 PST 2009)
[01:55:04]
[01:55:04] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[01:55:04] Build host: vspm46
[01:55:04] Board Type: Nvidia
[01:55:04] Core      :
[01:55:04] Preparing to commence simulation
[01:55:04] - Looking at optimizations...
[01:55:05] - Created dyn
[01:55:05] - Files status OK
[01:55:05] - Expanded 68467 -> 357580 (decompressed 522.2 percent)
[01:55:05] Called DecompressByteArray: compressed_data_size=68467 data_size=357580, decompressed_data_size=357580 diff=0
[01:55:05] - Digital signature verified
[01:55:05]
[01:55:05] Project: 5905 (Run 13, Clone 855, Gen 4)
[01:55:05]
[01:55:05] Assembly optimizations on if available.
[01:55:05] Entering M.D.
[01:55:11] Tpr hash work/wudata_03.tpr:  289605997 2824300753 2919185384 579787205 1140442469
[01:55:12] Working on Protein
[01:55:13] Client config found, loading data.
[01:55:13] Starting GUI Server
[01:59:57] Completed 1%
[02:05:09] Completed 2%
[02:10:29] Completed 3%
[02:15:33] Completed 4%
[02:20:51] Completed 5%
[02:26:12] Completed 6%
[02:31:20] Completed 7%
[02:36:32] Completed 8%
[02:41:43] Completed 9%
[02:46:56] Completed 10%
[02:52:03] Completed 11%
[02:57:13] Completed 12%
[03:02:08] Completed 13%
[03:07:17] Completed 14%
[03:12:26] Completed 15%
[03:17:28] Completed 16%
[03:22:25] Completed 17%
[03:27:18] Completed 18%
[03:32:26] Completed 19%
[03:37:31] Completed 20%
[03:42:27] Completed 21%
[03:47:35] Completed 22%
[03:52:37] Completed 23%
[03:56:08] SEH code: 3221225477
[03:56:08] Run: exception thrown during GuardedRun
[03:56:08] Run: exception thrown in GuardedRun -- Gromacs cannot continue further.
[03:56:08] Going to send back what have done -- stepsTotalG=8000000
[03:56:08] Work fraction=0.2371 steps=8000000.
[03:56:12] logfile size=11154 infoLength=11154 edr=0 trr=23
[03:56:12] - Writing 11690 bytes of core data to disk...
[03:56:12] Done: 11178 -> 3929 (compressed to 35.1 percent)
[03:56:12]   ... Done.
[03:56:12]
[03:56:12] Folding@home Core Shutdown: EARLY_UNIT_END
[03:56:14] CoreStatus = 72 (114)
[03:56:14] Sending work to server
[03:56:14] Project: 5905 (Run 13, Clone 855, Gen 4)
[03:56:14] - Read packet limit of 540015616... Set to 524286976.


[03:56:14] + Attempting to send results [June 18 03:56:14 UTC]
[03:56:14] + Results successfully sent
[03:56:14] Thank you for your contribution to Folding@Home.
[03:56:18] - Preparing to get new work unit...
[03:56:18] + Attempting to get work packet
[03:56:18] - Connecting to assignment server
[03:56:19] - Successful: assigned to (171.64.65.20).
[03:56:19] + News From Folding@Home: Welcome to Folding@Home
[03:56:19] Loaded queue successfully.
[03:56:20] + Closed connections
[03:56:25]
[03:56:25] + Processing work unit
[03:56:25] Core required: FahCore_14.exe
[03:56:25] Core found.
[03:56:25] Working on queue slot 04 [June 18 03:56:25 UTC]
[03:56:25] + Working ...
[03:56:25]
[03:56:25] *------------------------------*
[03:56:25] Folding@Home GPU Core - Beta
[03:56:25] Version 1.25 (Mon Mar 2 19:49:32 PST 2009)
[03:56:25]
[03:56:25] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[03:56:25] Build host: vspm46
[03:56:25] Board Type: Nvidia
[03:56:25] Core      :
[03:56:25] Preparing to commence simulation
[03:56:25] - Looking at optimizations...
[03:56:26] - Created dyn
[03:56:26] - Files status OK
[03:56:26] - Expanded 68662 -> 357580 (decompressed 520.7 percent)
[03:56:26] Called DecompressByteArray: compressed_data_size=68662 data_size=357580, decompressed_data_size=357580 diff=0
[03:56:26] - Digital signature verified
[03:56:26]
[03:56:26] Project: 5911 (Run 5, Clone 484, Gen 7)
[03:56:26]
[03:56:26] Assembly optimizations on if available.
[03:56:26] Entering M.D.
[03:56:32] Tpr hash work/wudata_04.tpr:  1865203828 1371658067 1035983887 3831146350 427841558
[03:56:32] Working on Protein
[03:56:33] Client config found, loading data.
[03:56:33] Starting GUI Server
[04:01:20] Completed 1%
[04:05:09] SEH code: 3221225477
[04:05:09] Run: exception thrown during GuardedRun
[04:05:09] Run: exception thrown in GuardedRun -- Gromacs cannot continue further.
[04:05:09] Going to send back what have done -- stepsTotalG=8000000
[04:05:09] Work fraction=0.0169 steps=8000000.
[04:05:13] logfile size=14923 infoLength=14923 edr=0 trr=23
[04:05:13] - Writing 15459 bytes of core data to disk...
[04:05:13] Done: 14947 -> 4109 (compressed to 27.4 percent)
[04:05:13]   ... Done.
[04:05:13]
[04:05:13] Folding@home Core Shutdown: UNSTABLE_MACHINE
[04:05:15] CoreStatus = 7A (122)
[04:05:15] Sending work to server
[04:05:15] Project: 5911 (Run 5, Clone 484, Gen 7)
[04:05:15] - Read packet limit of 540015616... Set to 524286976.


[04:05:15] + Attempting to send results [June 18 04:05:15 UTC]
[04:05:16] + Results successfully sent
[04:05:16] Thank you for your contribution to Folding@Home.
[04:05:20] EUE limit exceeded. Pausing 24 hours.

Folding@Home Client Shutdown.


Thank you for your help
wickedwahine
 
Posts: 35
Joined: Wed Jul 16, 2008 11:16 pm

Re: Run: exception thrown during GuardedRun

Postby Gormar » Thu Jun 18, 2009 6:14 am

It is not OK. You've got only errors.
Which driver and operating system do you use?
Do you overclock your graphics card?
Maybe it is overheating. Do you monitor GPU and CPU temperatures?
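
For reference, the `SEH code: 3221225477` line in the log above is a Windows structured-exception status printed in decimal. A minimal Python sketch that converts it to the usual hex form; the helper name `describe_seh` and the small lookup table are illustrative assumptions, not part of the F@H client:

```python
# Illustrative helper, not part of the Folding@Home client.
# Windows exception (NTSTATUS) codes are usually quoted in hex,
# while the core log prints them in decimal.
KNOWN_STATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
    0xC0000094: "STATUS_INTEGER_DIVIDE_BY_ZERO",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
}


def describe_seh(code: int) -> str:
    """Format a decimal SEH code as hex plus a name, if the name is known."""
    code &= 0xFFFFFFFF  # keep the unsigned 32-bit value
    name = KNOWN_STATUS.get(code, "unknown")
    return f"0x{code:08X} ({name})"


print(describe_seh(3221225477))  # 0xC0000005 (STATUS_ACCESS_VIOLATION)
```

An access violation only says the core touched memory it should not have. By itself it does not tell you whether the cause is the card, the driver, or the work unit.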
Gormar
 
Posts: 122
Joined: Fri Apr 18, 2008 7:33 am

Re: Run: exception thrown during GuardedRun

Postby wickedwahine » Thu Jun 18, 2009 6:32 am

Gormar wrote: It is not OK. You've got only errors.
Which driver and operating system do you use?
Do you overclock your graphics card?
Maybe it is overheating. Do you monitor GPU and CPU temperatures?


182.08 drivers Client 6.20
Vista Ultimate 64
No OC on anything, work and fold rig, need it stable, I'm a complete noob
No overheating: Precision shows 60 °C on the 9800GTX+, 80 °C on the 8800GS, and 38-41 °C on the CPU
Never had any problems folding for a year until recent weeks
8800GS folds completely fine

I have other posts on these problems, but no answers.
Reinstalled F@H and the drivers, ran memtestG80, ran with one card only, moved it to a different slot; the 9800GTX+ is still giving errors
PSU 700W
:e(
wickedwahine
 
Posts: 35
Joined: Wed Jul 16, 2008 11:16 pm

Re: Run: exception thrown during GuardedRun

Postby Gormar » Thu Jun 18, 2009 3:40 pm

Did you run any tests on this card?
Try memtestG80.
Maybe the card is defective.
Gormar
 
Posts: 122
Joined: Fri Apr 18, 2008 7:33 am

Re: Run: exception thrown during GuardedRun

Postby wickedwahine » Thu Jun 18, 2009 11:40 pm

Gormar wrote: Did you run any tests on this card?
Try memtestG80.
Maybe the card is defective.


Errrr... see my post above yours... I DID run memtestG80... no errors. Should I RMA this card (9800GTX+)? I don't know what else to do.
wickedwahine
 
Posts: 35
Joined: Wed Jul 16, 2008 11:16 pm

Re: Run: exception thrown during GuardedRun

Postby Fahrenheit451 » Thu Nov 04, 2010 7:17 pm

This thread is old, but the title fits my question.
Today HFM.net showed a failed WU for my GPU client (6.23 console). Since the client was still running when I noticed this, I took a look at FAHlog.txt and found the error message from the thread title. The WU (Project: 10112 (Run 112, Clone 0, Gen 6)) was partially finished and sent to Stanford; the client then continued with the next WU (Project: 10112 (Run 691, Clone 0, Gen 9)) and has worked since with no further issues:

Code:
[08:41:02] Folding@home Core Shutdown: FINISHED_UNIT
[08:41:06] CoreStatus = 64 (100)
[08:41:06] Unit 9 finished with 96 percent of time to deadline remaining.
[08:41:06] Updated performance fraction: 0.969841
[08:41:06] Sending work to server
[08:41:06] Project: 10112 (Run 112, Clone 0, Gen 5)
[08:41:06] - Read packet limit of 540015616... Set to 524286976.


[08:41:06] + Attempting to send results [November 4 08:41:06 UTC]
[08:41:06] - Reading file work/wuresults_09.dat from core
[08:41:06]   (Read 130475 bytes from disk)
[08:41:06] Connecting to http://171.64.65.71:8080/
[08:41:09] Posted data.
[08:41:09] Initial: 0000; - Uploaded at ~42 kB/s
[08:41:09] - Averaged speed for that direction ~49 kB/s
[08:41:09] + Results successfully sent
[08:41:09] Thank you for your contribution to Folding@Home.
[08:41:09] + Number of Units Completed: 148

[08:41:13] Trying to send all finished work units
[08:41:13] + No unsent completed units remaining.
[08:41:13] - Preparing to get new work unit...
[08:41:13] + Attempting to get work packet
[08:41:13] - Will indicate memory of 2045 MB
[08:41:13] - Connecting to assignment server
[08:41:13] Connecting to http://assign-GPU.stanford.edu:8080/
[08:41:14] Posted data.
[08:41:14] Initial: 40AB; - Successful: assigned to (171.64.65.71).
[08:41:14] + News From Folding@Home: Welcome to Folding@Home
[08:41:14] Loaded queue successfully.
[08:41:14] Connecting to http://171.64.65.71:8080/
[08:41:15] Posted data.
[08:41:15] Initial: 0000; - Receiving payload (expected size: 82635)
[08:41:16] - Downloaded at ~80 kB/s
[08:41:16] - Averaged speed for that direction ~75 kB/s
[08:41:16] + Received work.
[08:41:16] Trying to send all finished work units
[08:41:16] + No unsent completed units remaining.
[08:41:16] + Closed connections
[08:41:16]
[08:41:16] + Processing work unit
[08:41:16] Core required: FahCore_11.exe
[08:41:16] Core found.
[08:41:16] Working on queue slot 00 [November 4 08:41:16 UTC]
[08:41:16] + Working ...
[08:41:16] - Calling '.\FahCore_11.exe -dir work/ -suffix 00 -checkpoint 5 -verbose -lifeline 4920 -version 623'

[08:41:16]
[08:41:16] *------------------------------*
[08:41:16] Folding@Home GPU Core
[08:41:16] Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
[08:41:16]
[08:41:16] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[08:41:16] Build host: amoeba
[08:41:16] Board Type: Nvidia
[08:41:16] Core      :
[08:41:16] Preparing to commence simulation
[08:41:16] - Looking at optimizations...
[08:41:16] DeleteFrameFiles: successfully deleted file=work/wudata_00.ckp
[08:41:16] - Created dyn
[08:41:16] - Files status OK
[08:41:16] - Expanded 82123 -> 425170 (decompressed 517.7 percent)
[08:41:16] Called DecompressByteArray: compressed_data_size=82123 data_size=425170, decompressed_data_size=425170 diff=0
[08:41:16] - Digital signature verified
[08:41:16]
[08:41:16] Project: 10112 (Run 112, Clone 0, Gen 6)
[08:41:16]
[08:41:16] Assembly optimizations on if available.
[08:41:16] Entering M.D.
[08:41:22] Tpr hash work/wudata_00.tpr:  162307656 3393344805 2149626490 322167602 4280091129
[08:41:22]
[08:41:22] Calling fah_main args: 14 usage=100
[08:41:22]
[08:41:23] Working on 1174 p10112_ubiquitin_300K
[08:41:24] Client config found, loading data.
[08:41:24] Starting GUI Server
[08:42:55] Completed 1%
[08:44:27] Completed 2%
[08:45:58] Completed 3%
[08:47:29] Completed 4%
[08:49:00] Completed 5%
[08:50:32] Completed 6%
[08:52:03] Completed 7%
[08:53:34] Completed 8%
[08:55:06] Completed 9%
[08:56:37] Completed 10%
[08:58:08] Completed 11%
[08:59:40] Completed 12%
[09:01:11] Completed 13%
[09:02:42] Completed 14%
[09:04:14] Completed 15%
[09:05:46] Completed 16%
[09:07:17] Completed 17%
[09:08:49] Completed 18%
[09:10:21] Completed 19%
[09:11:52] Completed 20%
[09:13:23] Completed 21%
[09:14:55] Completed 22%
[09:16:26] Completed 23%
[09:17:57] Completed 24%
[09:19:29] Completed 25%
[09:21:00] Completed 26%
[09:22:32] Completed 27%
[09:24:03] Completed 28%
[09:25:35] Completed 29%
[09:27:07] Completed 30%
[09:28:39] Completed 31%
[09:30:00] Run: exception thrown during GuardedRun
[09:30:00] Run: exception thrown in GuardedRun -- Gromacs cannot continue further.
[09:30:00] Going to send back what have done -- stepsTotalG=10000000
[09:30:00] Work fraction=0.3188 steps=10000000.
[09:30:04] logfile size=10928 infoLength=10928 edr=0 trr=23
[09:30:04] + Opened results file
[09:30:04] - Writing 11464 bytes of core data to disk...
[09:30:04] Done: 10952 -> 4002 (compressed to 36.5 percent)
[09:30:04]   ... Done.
[09:30:04] DeleteFrameFiles: successfully deleted file=work/wudata_00.ckp
[09:30:04]
[09:30:04] Folding@home Core Shutdown: UNSTABLE_MACHINE
[09:30:07] CoreStatus = 7A (122)
[09:30:07] Sending work to server
[09:30:07] Project: 10112 (Run 112, Clone 0, Gen 6)
[09:30:07] - Read packet limit of 540015616... Set to 524286976.


[09:30:07] + Attempting to send results [November 4 09:30:07 UTC]
[09:30:07] - Reading file work/wuresults_00.dat from core
[09:30:08]   (Read 4514 bytes from disk)
[09:30:08] Connecting to http://171.64.65.71:8080/
[09:30:08] Posted data.
[09:30:08] Initial: 0000; Conversation time very short, giving reduced weight in bandwidth avg
[09:30:08] - Uploaded at ~10 kB/s
[09:30:08] - Averaged speed for that direction ~45 kB/s
[09:30:08] + Results successfully sent
[09:30:08] Thank you for your contribution to Folding@Home.
[09:30:12] Trying to send all finished work units
[09:30:12] + No unsent completed units remaining.
[09:30:12] - Preparing to get new work unit...
[09:30:12] + Attempting to get work packet
[09:30:12] - Will indicate memory of 2045 MB
[09:30:12] - Connecting to assignment server
[09:30:12] Connecting to http://assign-GPU.stanford.edu:8080/
[09:30:13] Posted data.
[09:30:13] Initial: 40AB; - Successful: assigned to (171.64.65.71).
[09:30:13] + News From Folding@Home: Welcome to Folding@Home
[09:30:13] Loaded queue successfully.
[09:30:13] Connecting to http://171.64.65.71:8080/
[09:30:14] Posted data.
[09:30:14] Initial: 0000; - Receiving payload (expected size: 82395)
[09:30:15] - Downloaded at ~80 kB/s
[09:30:15] - Averaged speed for that direction ~76 kB/s
[09:30:15] + Received work.
[09:30:15] Trying to send all finished work units
[09:30:15] + No unsent completed units remaining.
[09:30:15] + Closed connections
[09:30:20]
[09:30:20] + Processing work unit
[09:30:20] Core required: FahCore_11.exe
[09:30:20] Core found.
[09:30:20] Working on queue slot 01 [November 4 09:30:20 UTC]
[09:30:20] + Working ...
[09:30:20] - Calling '.\FahCore_11.exe -dir work/ -suffix 01 -checkpoint 5 -verbose -lifeline 4920 -version 623'

[09:30:20]
[09:30:20] *------------------------------*
[09:30:20] Folding@Home GPU Core
[09:30:20] Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
[09:30:20]
[09:30:20] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[09:30:20] Build host: amoeba
[09:30:20] Board Type: Nvidia
[09:30:20] Core      :
[09:30:20] Preparing to commence simulation
[09:30:20] - Looking at optimizations...
[09:30:20] DeleteFrameFiles: successfully deleted file=work/wudata_01.ckp
[09:30:20] - Created dyn
[09:30:20] - Files status OK
[09:30:20] - Expanded 81883 -> 425170 (decompressed 519.2 percent)
[09:30:20] Called DecompressByteArray: compressed_data_size=81883 data_size=425170, decompressed_data_size=425170 diff=0
[09:30:20] - Digital signature verified
[09:30:20]
[09:30:20] Project: 10112 (Run 691, Clone 0, Gen 9)
[09:30:20]
[09:30:20] Assembly optimizations on if available.
[09:30:20] Entering M.D.
[09:30:26] Tpr hash work/wudata_01.tpr:  2754040985 1632419845 1866688362 1906918406 335510063
[09:30:26]
[09:30:26] Calling fah_main args: 14 usage=100
[09:30:26]
[09:30:27] Working on 1174 p10112_ubiquitin_300K
[09:30:28] Client config found, loading data.
[09:30:29] Starting GUI Server
[09:32:00] Completed 1%


Is this a hardware or a WU specific issue?
Fahrenheit451
 
Posts: 161
Joined: Sun Sep 19, 2010 10:25 am
Location: Bonn, Germany

Re: Run: exception thrown during GuardedRun

Postby bruce » Thu Nov 04, 2010 9:28 pm

I don't see a credit listed for Fahrenheit451 but there's a credit under a different name that might be you. Anyway, one person completed this WU successfully and one didn't.

This error is almost always attributable to a hardware failure of some kind. The same suggestions still apply: run MemtestCL to see if there's a problem that can be identified in your VRAM, reduce your overclocking, improve the air circulation, etc.
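
One way to see whether failures cluster on a single project or show up across many is to tally the core shutdown status per work unit from FAHlog.txt. A rough Python sketch, assuming the log line formats quoted in this thread; the function name `summarize` and the sample text are made up for illustration:

```python
import re
from collections import Counter

# Patterns match the FAHlog.txt lines quoted in this thread.
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
SHUTDOWN_RE = re.compile(r"Folding@home Core Shutdown: (\w+)")


def summarize(log_text):
    """Return ((project, run, clone, gen), shutdown_status) pairs in log order."""
    results = []
    current = None  # most recent work unit seen before a shutdown line
    for line in log_text.splitlines():
        m = PROJECT_RE.search(line)
        if m:
            current = tuple(int(g) for g in m.groups())
            continue
        m = SHUTDOWN_RE.search(line)
        if m and current is not None:
            results.append((current, m.group(1)))
    return results


sample = """\
[03:56:26] Project: 5911 (Run 5, Clone 484, Gen 7)
[04:05:09] Run: exception thrown during GuardedRun
[04:05:13] Folding@home Core Shutdown: UNSTABLE_MACHINE
[04:05:15] Project: 5911 (Run 5, Clone 484, Gen 7)
"""

pairs = summarize(sample)
counts = Counter(status for _, status in pairs)
print(pairs)
print(counts)
```

If `UNSTABLE_MACHINE` or `EARLY_UNIT_END` turns up on many different projects, that points more toward the machine than toward one bad WU, though the tally alone cannot prove either.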
bruce
 
Posts: 23743
Joined: Thu Nov 29, 2007 11:13 pm
Location: So. Cal.

Re: Run: exception thrown during GuardedRun

Postby Fahrenheit451 » Thu Nov 04, 2010 10:24 pm

I don't overclock my system, and air circulation should be OK. The Cosmos 1000 is a really big tower and I have 5 fans inside (1 CPU, 1 GPU, 1 PSU and 2 case fans).
But I will run Memtest.
Fahrenheit451
 
Posts: 161
Joined: Sun Sep 19, 2010 10:25 am
Location: Bonn, Germany

