Blue screen @ WU end/beginning

Post by Leoslocks » Wed Feb 10, 2010 4:19 am

I am having a problem diagnosing an issue with the following system.
Asus P5Q-E
Vista 64
4GB Crucial ram
Corsair 1K Power Supply
E8500 @ 3.16 GHz
BFG GTX260 x 2 Not SLI

The system ran well for quite a while with two cards.
One GTX260 runs the system tray client and the second runs the console client.
This is a gaming machine, and I routinely stop the Console and pause the SysTray before gaming.
The machine runs 24 x 7 during the winter.

I am getting blue screens at the end of a work unit. Every crash happens as the unit finishes or just after the next one begins. I am trying to isolate one card as the problem by running one at a time. Two of the blue screens occurred after I forgot to stop F@H before gaming. I am now running MemtestG80 for 1K iterations, as the short test found no errors.

Log from the last failure:
[00:55:01] *------------------------------*
[00:55:01] Folding@Home GPU Core
[00:55:01] Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
[00:55:01]
[00:55:01] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[00:55:01] Build host: amoeba
[00:55:01] Board Type: Nvidia
[00:55:01] Core      :
[00:55:01] Preparing to commence simulation
[00:55:01] - Looking at optimizations...
[00:55:01] DeleteFrameFiles: successfully deleted file=work/wudata_01.ckp
[00:55:01] - Created dyn
[00:55:01] - Files status OK
[00:55:01] - Expanded 46707 -> 252912 (decompressed 541.4 percent)
[00:55:01] Called DecompressByteArray: compressed_data_size=46707 data_size=252912, decompressed_data_size=252912 diff=0
[00:55:01] - Digital signature verified
[00:55:01]
[00:55:01] Project: 5766 (Run 7, Clone 42, Gen 1754)
[00:55:01]
[00:55:01] Assembly optimizations on if available.
[00:55:01] Entering M.D.
[00:55:07] Tpr hash work/wudata_01.tpr:  1190064392 3482330908 1480217738 2843053898 3285021097
[00:55:07]
[00:55:07] Calling fah_main args: 14 usage=100
[00:55:07]
[00:55:08] Working on Protein
[00:55:08] Client config found, loading data.
[00:55:08] Starting GUI Server
[00:55:47] Completed 1%
[00:56:27] Completed 2%
[00:57:06] Completed 3%
[00:57:46] Completed 4%
//////////////////////////////////////////////////////////////////////////////////////////////

////////////////////////////////////////////////////////////////////////////////////////////////
[02:00:33] Completed 98%
[02:01:12] Completed 99%
[02:01:52] Completed 100%
[02:01:52] Successful run
[02:01:52] DynamicWrapper: Finished Work Unit: sleep=10000
[02:02:02] Reserved 75856 bytes for xtc file; Cosm status=0
[02:02:02] Allocated 75856 bytes for xtc file
[02:02:02] - Reading up to 75856 from "work/wudata_01.xtc": Read 75856
[02:02:02] Read 75856 bytes from xtc file; available packet space=786354608
[02:02:02] xtc file hash check passed.
[02:02:02] Reserved 15168 15168 786354608 bytes for arc file=<work/wudata_01.trr> Cosm status=0
[02:02:02] Allocated 15168 bytes for arc file
[02:02:02] - Reading up to 15168 from "work/wudata_01.trr": Read 15168
[02:02:02] Read 15168 bytes from arc file; available packet space=786339440
[02:02:02] trr file hash check passed.
[02:02:02] Allocated 560 bytes for edr file
[02:02:02] Read bedfile
[02:02:02] edr file hash check passed.
[02:02:02] Allocated 11486 bytes for logfile
[02:02:02] Read logfile
[02:02:02] GuardedRun: success in DynamicWrapper
[02:02:02] GuardedRun: done
[02:02:02] Run: GuardedRun completed.
[02:02:06] + Opened results file
[02:02:06] - Writing 103582 bytes of core data to disk...
[02:02:06] Done: 103070 -> 95650 (compressed to 92.8 percent)
[02:02:06]   ... Done.
[02:02:06] DeleteFrameFiles: successfully deleted file=work/wudata_01.ckp
[02:02:06] Shutting down core
[02:02:06]
[02:02:06] Folding@home Core Shutdown: FINISHED_UNIT
[02:02:10] CoreStatus = 64 (100)
[02:02:10] Sending work to server
[02:02:10] Project: 5766 (Run 7, Clone 42, Gen 1754)


[02:02:10] + Attempting to send results [February 10 02:02:10 UTC]
[02:02:11] + Results successfully sent
[02:02:11] Thank you for your contribution to Folding@Home.
[02:02:11] + Number of Units Completed: 51

[02:02:15] - Preparing to get new work unit...
[02:02:15] + Attempting to get work packet
[02:02:15] - Connecting to assignment server
[02:02:16] - Successful: assigned to (171.64.65.71).
[02:02:16] + News From Folding@Home: Welcome to Folding@Home
[02:02:16] Loaded queue successfully.
[02:44:16] + Could not connect to Work Server
[02:44:16] - Attempt #1  to get work failed, and no other work to do.
Waiting before retry.
[02:44:36] + Attempting to get work packet
[02:44:36] - Connecting to assignment server
[02:44:36] - Successful: assigned to (171.64.65.71).
[02:44:36] + News From Folding@Home: Welcome to Folding@Home
[02:44:37] Loaded queue successfully.
[03:26:37] + Could not connect to Work Server
[03:26:37] - Attempt #2  to get work failed, and no other work to do.
Waiting before retry.
[03:26:55] + Attempting to get work packet
[03:26:55] - Connecting to assignment server
[03:26:55] - Successful: assigned to (171.64.65.71).
[03:26:55] + News From Folding@Home: Welcome to Folding@Home
[03:26:55] Loaded queue successfully.
[03:26:56] + Closed connections
[03:26:56]
[03:26:56] + Processing work unit
[03:26:56] Core required: FahCore_11.exe
[03:26:56] Core found.
[03:26:56] Working on queue slot 02 [February 10 03:26:56 UTC]
[03:26:56] + Working ...
[03:26:57]
[03:26:57] *------------------------------*
[03:26:57] Folding@Home GPU Core
[03:26:57] Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
[03:26:57]
[03:26:57] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[03:26:57] Build host: amoeba
[03:26:57] Board Type: Nvidia
[03:26:57] Core      :
[03:26:57] Preparing to commence simulation
[03:26:57] - Looking at optimizations...
[03:26:57] DeleteFrameFiles: successfully deleted file=work/wudata_02.ckp
[03:26:57] - Created dyn
[03:26:57] - Files status OK
[03:26:57] - Expanded 88536 -> 447307 (decompressed 505.2 percent)
[03:26:57] Called DecompressByteArray: compressed_data_size=88536 data_size=447307, decompressed_data_size=447307 diff=0
[03:26:57] - Digital signature verified
[03:26:57]
[03:26:57] Project: 10102 (Run 909, Clone 6, Gen 5)
[03:26:57]
[03:26:57] Assembly optimizations on if available.
[03:26:57] Entering M.D.
[03:27:03] Tpr hash work/wudata_02.tpr:  3264335129 2372887907 3331789372 950203736 1369694490
[03:27:03]
[03:27:03] Calling fah_main args: 14 usage=100
[03:27:03]
[03:27:03] Working on p10102_lambda_370K
[03:27:05] Client config found, loading data.
[03:27:05] Starting GUI Server
[03:28:24] Completed 1%
[03:29:42] Completed 2%

//////////////Computer blue screens with a variety of driver/hardware codes, e.g. 0x0000000A/////////////

Reboot > Chkdsk > Client continues
/////////////////////////////////////////////////////////////////////////////////////////////////////////

--- Opening Log file [February 10 03:38:54 UTC]


# Windows GPU Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.23

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\Users\Leo\AppData\Roaming\Folding@home-gpu
Arguments: -gpu 0

[03:38:54] - Ask before connecting: No
[03:38:54] - User ID: 6036341656A918C4
[03:38:54] - Machine ID: 1
[03:38:55]
[03:38:55] Loaded queue successfully.
[03:38:56] Initialization complete
[03:38:56]
[03:38:56] + Processing work unit
[03:38:56] Core required: FahCore_11.exe
[03:38:56] Core found.
[03:38:57] Working on queue slot 02 [February 10 03:38:57 UTC]
[03:38:57] + Working ...
[03:39:01]
[03:39:01] *------------------------------*
[03:39:02] Folding@Home GPU Core
[03:39:02] Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
[03:39:02]
[03:39:02] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[03:39:03] Build host: amoeba
[03:39:03] Board Type: Nvidia
[03:39:03] Core      :
[03:39:03] Preparing to commence simulation
[03:39:04] - Ensuring status. Please wait.
[03:39:11] - Looking at optimizations...
[03:39:12] - Working with standard loops on this execution.
[03:39:12] - Previous termination of core was improper.
[03:39:12] - Files status OK
[03:39:13] - Expanded 88536 -> 447307 (decompressed 505.2 percent)
[03:39:13] Called DecompressByteArray: compressed_data_size=88536 data_size=447307, decompressed_data_size=447307 diff=0
[03:39:15] - Digital signature verified
[03:39:19]
[03:39:20] Project: 10102 (Run 909, Clone 6, Gen 5)
[03:39:23]
[03:39:23] Entering M.D.
[03:39:31] Will resume from checkpoint file
[03:39:33] Tpr hash work/wudata_02.tpr:  3264335129 2372887907 3331789372 950203736 1369694490
[03:39:40]
[03:39:41] Calling fah_main args: 14 usage=100
[03:39:43]
[03:39:45] Working on p10102_lambda_370K
[03:39:51] Client config found, loading data.
[03:39:53] Starting GUI Server
[03:39:53] Resuming from checkpoint
[03:39:53] fcCheckPointResume: retreived and current tpr file hash:
[03:39:53]    0   3264335129   3264335129
[03:39:53]    1   2372887907   2372887907
[03:39:53]    2   3331789372   3331789372
[03:39:53]    3    950203736    950203736
[03:39:53]    4   1369694490   1369694490
[03:39:53] fcCheckPointResume: file hashes same.
[03:39:53] fcCheckPointResume: state restored.
[03:39:53] Verified work/wudata_02.log
[03:39:53] Verified work/wudata_02.edr
[03:39:53] Verified work/wudata_02.xtc
[03:39:53] Completed 2%
[03:41:12] Completed 3%
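
A log like the one above can be summarized with a short script. The following is only a minimal sketch, not part of the F@H client: the function names `parse_log` and `summarize` are hypothetical, and it assumes every relevant line uses the `[HH:MM:SS] message` form shown above. It measures the time between consecutive "Completed N%" lines and counts how often the core reports an improper previous termination, which is what the client logs after a blue screen.

```python
import re

# Matches the "[HH:MM:SS] message" prefix used in the log above.
STAMP = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\]\s*(.*)$")
COMPLETED = re.compile(r"^Completed (\d+)%$")
IMPROPER = "Previous termination of core was improper"


def parse_log(lines):
    """Yield (seconds_since_midnight, message) for each timestamped line."""
    for line in lines:
        match = STAMP.match(line.strip())
        if match:
            hours, minutes, seconds, message = match.groups()
            stamp = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
            yield stamp, message.strip()


def summarize(lines):
    """Return (frame_times, improper_count) for an iterable of log lines.

    frame_times holds the seconds between consecutive "Completed N%" lines.
    improper_count is the number of improper-termination messages.
    """
    frame_times = []
    improper_count = 0
    last = None  # (timestamp, percent) of the previous "Completed" line
    for stamp, message in parse_log(lines):
        done = COMPLETED.match(message)
        if done:
            percent = int(done.group(1))
            if last is not None and percent == last[1] + 1:
                # Modulo handles a log that runs past midnight.
                frame_times.append((stamp - last[0]) % 86400)
            last = (stamp, percent)
        elif message.startswith("Project:"):
            # A new or resumed work unit: do not measure across the gap.
            last = None
        elif IMPROPER in message:
            improper_count += 1
    return frame_times, improper_count
```

To use it, pass an open log file, for example `summarize(open(path))`, where `path` points at the client's log file (its location is an assumption that depends on the install). In the log above, the Project 5766 frames are about 40 seconds apart while the Project 10102 frames are about 78 seconds apart.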

Post by toTOW » Wed Feb 10, 2010 8:56 am

Since the p1010x projects push the hardware harder than other projects, I'd check the card temperature and fan speed to make sure everything is fine ...

Which drivers are you using?

P.S.: blue screens while gaming don't really surprise me (that's why I always advise pausing the clients when gaming) ... but if your card failed when only folding, there is probably something wrong :(
Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.

FAH-Addict : latest news, tests and reviews about Folding@Home project.


Post by Leoslocks » Thu Feb 11, 2010 2:20 am

Using the 191.07 drivers.
The card does not blue screen while gaming. It halts just after the WU completes. The machine worked while gaming and folding at the same time. After I quit gaming and went to restart the SysTray client, I saw it was still running. Thirty minutes later, while browsing the internet: blue screen. Every blue screen has been at the end of a WU or at the start of one. The fans are working, and I removed the cards and blew them out.
Does NVidia have an option for adjusting the fan speed?

Ran 1K iterations of MemtestG80 on 258MB while folding, with 0 errors.

Post by shdbcamping » Thu Feb 11, 2010 6:33 am

toTOW is correct about the temperature advice.

I'm including a link to a similar heat problem with the 1010x WU's (viewtopic.php?f=52&t=13229#p129232). Use HWmonitor or EVGA Precision to watch the GPU temps and post back. My experience has been that these WU's (empirically) have operated 10-15C higher than every other WU on NVidia hardware.

I'm interested in knowing what temperatures other WU's produce on your GPU setups.

Sean

Post by JimF » Thu Feb 11, 2010 11:45 am

shdbcamping wrote: My experience has been that these WU's (empirically) have operated 10-15C higher than every other WU on NVidia Hardware.

On my GT240's, the temps only rise about 5C (from maybe 66 to 71) on the 1010x WU's. I am getting around 4000 ppd on them, not as much as the higher-end cards, but they are very energy-efficient.
GTX 970 (i5-3550), GTX 980 (i7-3770); Win10 64-bit; FAH 7.4.4

Post by Leoslocks » Sat Feb 13, 2010 12:08 am

Updating the drivers to 196.21 seems to have helped.

I reinstalled the second card and both are folding again. I am trying to supplement the heating of my room with this and another dual-card machine. What do you use to adjust the fan speed on nVidia cards?

Post by BuddhaChu » Sat Feb 13, 2010 12:12 am

Leoslocks wrote:What do you use to adjust the fan speed on nVidia cards?


RivaTuner

http://forums.nvidia.com/lofiversion/in ... 14439.html

http://www.guru3d.com/index.php?page=rivatuner

Post by shdbcamping » Thu Feb 18, 2010 7:47 am

JimF wrote:
shdbcamping wrote: My experience has been that these WU's (empirically) have operated 10-15C higher than every other WU on NVidia Hardware.
(edited by Mod.)

On my GT240's, the temps only rise about 5C (from maybe 66 to 71) on the 1010x WU's. I am getting around 4000 ppd on them, not as much as the higher-end cards, but they are very energy-efficient.

I'm glad you are not experiencing my particular predicament :). Please leave my quote out, as it appears combative to me. Please simply post your particular experience and leave the (IMO) "...calling me a liar" stuff out of it.
Thanx in advance,
Sean

Post by shdbcamping » Thu Feb 18, 2010 7:52 am

Leoslocks wrote:Updating the drivers to 196.21 seems to have helped.

Reinstalled the second card and both are folding again. Trying to supplement the heating of my room with this and another dual card machine. What do you use to adjust the fan speed on nVidia cards?

I am running NV cards from the 8800GT to the GTX295. I use EVGA's Precision, and it seems to be the most stable and the least hassle... at least for me... YMMV :D
All of its updates and versions work with all my NV hardware.

Hope this helps,
Sean

Post by Arnette » Tue Feb 23, 2010 2:07 pm

I would suggest EVGA Precision. RivaTuner hasn't been updated in quite a while and does not properly support the 196.21 drivers. On every system I've run RivaTuner on with those drivers, it disables the ability to modify shader clocks in the OC settings.

Also, EVGA Precision is much more user friendly :)
Our Folding@Home Teampage --> http://www.lbsfolding.info