[Please read] NaNs detected on GPU - UNSTABLE_MACHINE error

Moderators: slegrand, Site Moderators, PandeGroup

Re: [Please read] NaNs detected on GPU - UNSTABLE_MACHINE error

Postby khurios2000 » Tue Jan 20, 2009 1:25 am

•Failing project
511 Project: 5765 (Run 3, Clone 359, Gen 10)

•Failing hardware (please add the exact GPU designation if you know it, e.g. 9800GTX+)
9600GSO 384MB

•Failing OS
Windows 7 Build 7000

•Failing driver
185.20
khurios2000
 
Posts: 5
Joined: Sun Jun 22, 2008 5:18 pm

Re: [Please read] NaNs detected on GPU - UNSTABLE_MACHINE error

Postby heimie » Tue Jan 20, 2009 1:52 am

Mine seems to wander from one GPU to the other at random. It seems the only surefire way to get it going again is to delete and replace cudart.dll. On 3 occasions today, I deleted the whole folder with the exception of the config and cudart.dll, and got the EUEs as soon as I started it running again. I replaced only cudart.dll and off I went. This was early this morning. I got home and had one GPU down on this machine and 1 on my folder that has been going for 2 months non-stop with no EUEs. I tried it on that machine as well, and only replacing cudart.dll got it running again.

This machine - E8400 - Vista 64-bit - 680i mobo - 2x 9800GTX+ (181.20) - (all stock speeds). Heat is not an issue, as the temps never go over 60C on the GPUs and 30C on the CPU.
WUs that failed on this machine:
5755 (Run 10, Clone 195, Gen 12)
5755 (Run 10, Clone 195, Gen 12) (again after 2 completed WUs, and it continued until "EUE limit exceeded. Pausing 24 hours.")
This morning's failure -
5766 (Run 0, Clone 284, Gen 29)
8 WUs worked fine, then -
5766 (Run 13, Clone 374, Gen 4) until the limit was reached and the core shut down.

Other machine - E6850 - Vista 64-bit - 680i mobo - 3x 8800GT (177.83). Temps never go over 55C on the GPUs and 32C on the CPU.

WU that failed on this machine:

5771 (Run 12, Clone 48, Gen 6) (only 1 on this machine thus far)
heimie
 
Posts: 79
Joined: Sat Jun 14, 2008 10:17 am
Location: Lockport, Louisiana

Re: [Please read] NaNs detected on GPU - UNSTABLE_MACHINE error

Postby Leonardo » Tue Jan 20, 2009 4:29 am

Several failures in a row - all but one work unit did not even complete one frame
System - XP 32 SP3
GPU - G92 on 9800GX2
Drivers - 180.60

Summary follows:

Failed work units:

5750/14/286/10 - 7 times in a row
5766/12/302/3 - 3 in a row (one run completed 14 frames, the other two no frames)
5771/12/301/10 - 2 in a row

NOTE: There is a slight possibility, though it is not probable, that an insufficient CPU vCore contributed to this series of duds. About 36 hours ago I lowered the CPU vCore one notch. That said, I had not experienced any work unit crashes on this machine after the vCore change until the failures documented in the log below. There are four GPU cores and a Q6600 quad-core CPU, all running clients on this machine. All the reported work unit EUEs occurred within less than an hour on the same GPU core. Previously that core's client was reliable.

Code:
[03:22:19] Project: 5771 (Run 12, Clone 301, Gen 10)
[03:22:19]
[03:22:19] Assembly optimizations on if available.
[03:22:19] Entering M.D.
[03:22:28] Working on Protein
[03:22:28] mdrun_gpu returned
[03:22:28] Self-test failure
[03:22:28]
[03:22:28] Folding@home Core Shutdown: UNSTABLE_MACHINE
[03:22:33] CoreStatus = 7A (122)
[03:22:33] Sending work to server
[03:22:33] Project: 5771 (Run 12, Clone 301, Gen 10)
[03:22:33] - Read packet limit of 540015616... Set to 524286976.
[03:22:33] - Error: Could not get length of results file work/wuresults_09.dat
[03:22:33] - Error: Could not read unit 09 file. Removing from queue.
[03:22:33] EUE limit exceeded. Pausing 24 hours.
//////user edits for brevity///

[03:02:48] Folding@Home GPU Core - Beta
[03:02:48] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[03:02:48]
[03:02:48] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[03:02:48] Build host: amoeba
[03:02:48] Board Type: Nvidia
[03:02:48] Core      :
[03:02:48] Preparing to commence simulation
[03:02:48] - Looking at optimizations...
[03:02:48] - Created dyn
[03:02:48] - Files status OK
[03:02:48] - Expanded 46545 -> 252912 (decompressed 543.3 percent)
[03:02:48] Called DecompressByteArray: compressed_data_size=46545 data_size=252912, decompressed_data_size=252912 diff=0
[03:02:48] - Digital signature verified
[03:02:48]
[03:02:48] Project: 5766 (Run 12, Clone 302, Gen 3)
[03:02:48]
[03:02:48] Assembly optimizations on if available.
[03:02:48] Entering M.D.
[03:02:55] Working on Protein
[03:02:56] Client config found, loading data.
[03:02:56] Starting GUI Server
[03:04:02] Completed 1%
[03:05:30] Completed 2%
[03:06:39] Completed 3%
[03:07:52] Completed 4%
[03:09:15] Completed 5%
[03:10:21] Completed 6%
[03:11:51] Completed 7%
[03:12:57] Completed 8%
[03:14:39] Completed 9%
[03:15:47] Completed 10%
[03:17:10] Completed 11%
[03:18:23] Completed 12%
[03:19:30] Completed 13%
[03:21:12] Completed 14%
[03:21:12] mdrun_gpu returned
[03:21:12] NANs detected on GPU
[03:21:12]
[03:21:12] Folding@home Core Shutdown: UNSTABLE_MACHINE
[03:21:16] CoreStatus = 7A (122)
[03:21:16] Sending work to server
[03:21:16] Project: 5766 (Run 12, Clone 302, Gen 3)
[03:21:16] - Read packet limit of 540015616... Set to 524286976.
[03:21:16] - Error: Could not get length of results file work/wuresults_05.dat
[03:21:16] - Error: Could not read unit 05 file. Removing from queue.
[03:21:16] Trying to send all finished work units
[03:21:16] + No unsent completed units remaining.
[03:21:16] - Preparing to get new work unit...
[03:21:16] + Attempting to get work packet
[03:21:16] - Will indicate memory of 2046 MB
[03:21:16] - Connecting to assignment server
[03:21:16] Connecting to http://assign-GPU.stanford.edu:8080/
[03:21:16] Posted data.
[03:21:16] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[03:21:16] + News From Folding@Home: GPU folding beta
[03:21:16] Loaded queue successfully.
[03:21:16] Connecting to http://171.67.108.11:8080/
[03:21:17] Posted data.
[03:21:17] Initial: 0000; - Receiving payload (expected size: 99328)
[03:21:18] - Downloaded at ~97 kB/s
[03:21:18] - Averaged speed for that direction ~77 kB/s
[03:21:18] + Received work.
[03:21:18] Trying to send all finished work units
[03:21:18] + No unsent completed units remaining.
[03:21:18] + Closed connections
[03:21:23]
[03:21:23] + Processing work unit
[03:21:23] Core required: FahCore_11.exe
[03:21:23] Core found.
[03:21:23] Working on queue slot 06 [January 20 03:21:23 UTC]
[03:21:23] + Working ...
[03:21:23] - Calling '.\FahCore_11.exe -dir work/ -suffix 06 -checkpoint 30 -verbose -lifeline 3408 -version 623'

[03:21:23]
[03:21:23] *------------------------------*
[03:21:23] Folding@Home GPU Core - Beta
[03:21:23] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[03:21:23]
[03:21:23] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[03:21:23] Build host: amoeba
[03:21:23] Board Type: Nvidia
[03:21:23] Core      :
[03:21:23] Preparing to commence simulation
[03:21:23] - Looking at optimizations...
[03:21:23] - Created dyn
[03:21:23] - Files status OK
[03:21:23] - Expanded 98816 -> 492276 (decompressed 498.1 percent)
[03:21:23] Called DecompressByteArray: compressed_data_size=98816 data_size=492276, decompressed_data_size=492276 diff=0
[03:21:23] - Digital signature verified
[03:21:23]
[03:21:23] Project: 5749 (Run 0, Clone 370, Gen 8)
[03:21:23]
[03:21:23] Assembly optimizations on if available.
[03:21:23] Entering M.D.
[03:21:30] Working on Protein
[03:21:31] mdrun_gpu returned
[03:21:31] Self-test failure
[03:21:31]
[03:21:31] Folding@home Core Shutdown: UNSTABLE_MACHINE
[03:21:33] CoreStatus = 7A (122)
[03:21:33] Sending work to server
[03:21:33] Project: 5749 (Run 0, Clone 370, Gen 8)
[03:21:33] - Read packet limit of 540015616... Set to 524286976.
[03:21:33] - Error: Could not get length of results file work/wuresults_06.dat
[03:21:33] - Error: Could not read unit 06 file. Removing from queue.
[03:21:33] Trying to send all finished work units
[03:21:33] + No unsent completed units remaining.
[03:21:33] - Preparing to get new work unit...
[03:21:33] + Attempting to get work packet
[03:21:33] - Will indicate memory of 2046 MB
[03:21:33] - Connecting to assignment server
[03:21:33] Connecting to http://assign-GPU.stanford.edu:8080/
[03:21:33] Posted data.
[03:21:33] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[03:21:33] + News From Folding@Home: GPU folding beta
[03:21:33] Loaded queue successfully.
[03:21:33] Connecting to http://171.67.108.11:8080/
[03:21:34] Posted data.
[03:21:34] Initial: 0000; - Receiving payload (expected size: 99201)
[03:21:34] Conversation time very short, giving reduced weight in bandwidth avg
[03:21:34] - Downloaded at ~193 kB/s
[03:21:34] - Averaged speed for that direction ~90 kB/s
[03:21:34] + Received work.
[03:21:34] Trying to send all finished work units
[03:21:34] + No unsent completed units remaining.
[03:21:34] + Closed connections
[03:21:39]
[03:21:39] + Processing work unit
[03:21:39] Core required: FahCore_11.exe
[03:21:39] Core found.
[03:21:39] Working on queue slot 07 [January 20 03:21:39 UTC]
[03:21:39] + Working ...
[03:21:39] - Calling '.\FahCore_11.exe -dir work/ -suffix 07 -checkpoint 30 -verbose -lifeline 3408 -version 623'

[03:21:40]
[03:21:40] *------------------------------*
[03:21:40] Folding@Home GPU Core - Beta
[03:21:40] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[03:21:40]
[03:21:40] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[03:21:40] Build host: amoeba
[03:21:40] Board Type: Nvidia
[03:21:40] Core      :
[03:21:40] Preparing to commence simulation
[03:21:40] - Looking at optimizations...
[03:21:40] - Created dyn
[03:21:40] - Files status OK
[03:21:40] - Expanded 98689 -> 492276 (decompressed 498.8 percent)
[03:21:40] Called DecompressByteArray: compressed_data_size=98689 data_size=492276, decompressed_data_size=492276 diff=0
[03:21:40] - Digital signature verified
[03:21:40]
[03:21:40] Project: 5749 (Run 2, Clone 338, Gen 7)
[03:21:40]
[03:21:40] Assembly optimizations on if available.
[03:21:40] Entering M.D.
[03:21:48] Working on Protein
[03:21:48] mdrun_gpu returned
[03:21:48] Self-test failure
[03:21:48]
[03:21:48] Folding@home Core Shutdown: UNSTABLE_MACHINE
[03:21:52] CoreStatus = 7A (122)
[03:21:52] Sending work to server
[03:21:52] Project: 5749 (Run 2, Clone 338, Gen 7)
[03:21:52] - Read packet limit of 540015616... Set to 524286976.
[03:21:52] - Error: Could not get length of results file work/wuresults_07.dat
[03:21:52] - Error: Could not read unit 07 file. Removing from queue.
[03:21:52] Trying to send all finished work units
[03:21:52] + No unsent completed units remaining.
[03:21:52] - Preparing to get new work unit...
[03:21:52] + Attempting to get work packet
[03:21:52] - Will indicate memory of 2046 MB
[03:21:52] - Connecting to assignment server
[03:21:52] Connecting to http://assign-GPU.stanford.edu:8080/
[03:21:52] Posted data.
[03:21:52] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[03:21:52] + News From Folding@Home: GPU folding beta
[03:21:52] Loaded queue successfully.
[03:21:52] Connecting to http://171.67.108.11:8080/
[03:21:53] Posted data.
[03:21:53] Initial: 0000; - Receiving payload (expected size: 99241)
[03:21:55] - Downloaded at ~48 kB/s
[03:21:55] - Averaged speed for that direction ~81 kB/s
[03:21:55] + Received work.
[03:21:55] Trying to send all finished work units
[03:21:55] + No unsent completed units remaining.
[03:21:55] + Closed connections
[03:22:00]
[03:22:00] + Processing work unit
[03:22:00] Core required: FahCore_11.exe
[03:22:00] Core found.
[03:22:00] Working on queue slot 08 [January 20 03:22:00 UTC]
[03:22:00] + Working ...
[03:22:00] - Calling '.\FahCore_11.exe -dir work/ -suffix 08 -checkpoint 30 -verbose -lifeline 3408 -version 623'

[03:22:00]
[03:22:00] *------------------------------*
[03:22:00] Folding@Home GPU Core - Beta
[03:22:00] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[03:22:00]
[03:22:00] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[03:22:00] Build host: amoeba
[03:22:00] Board Type: Nvidia
[03:22:00] Core      :
[03:22:00] Preparing to commence simulation
[03:22:00] - Looking at optimizations...
[03:22:00] - Created dyn
[03:22:00] - Files status OK
[03:22:00] - Expanded 98729 -> 492276 (decompressed 498.6 percent)
[03:22:00] Called DecompressByteArray: compressed_data_size=98729 data_size=492276, decompressed_data_size=492276 diff=0
[03:22:00] - Digital signature verified
[03:22:00]
[03:22:00] Project: 5750 (Run 14, Clone 286, Gen 10)
[03:22:00]
[03:22:00] Assembly optimizations on if available.
[03:22:00] Entering M.D.
[03:22:08] Working on Protein
[03:22:08] mdrun_gpu returned
[03:22:08] Self-test failure
[03:22:08]
[03:22:08] Folding@home Core Shutdown: UNSTABLE_MACHINE
[03:22:12] CoreStatus = 7A (122)
[03:22:12] Sending work to server
[03:22:12] Project: 5750 (Run 14, Clone 286, Gen 10)
[03:22:12] - Read packet limit of 540015616... Set to 524286976.
[03:22:12] - Error: Could not get length of results file work/wuresults_08.dat
[03:22:12] - Error: Could not read unit 08 file. Removing from queue.
[03:22:12] Trying to send all finished work units
[03:22:12] + No unsent completed units remaining.
[03:22:12] - Preparing to get new work unit...
[03:22:12] + Attempting to get work packet
[03:22:12] - Will indicate memory of 2046 MB
[03:22:12] - Connecting to assignment server
[03:22:12] Connecting to http://assign-GPU.stanford.edu:8080/
[03:22:12] Posted data.
[03:22:12] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[03:22:12] + News From Folding@Home: GPU folding beta
[03:22:12] Loaded queue successfully.
[03:22:12] Connecting to http://171.67.108.11:8080/
[03:22:13] Posted data.
[03:22:13] Initial: 0000; - Receiving payload (expected size: 45880)
[03:22:14] - Downloaded at ~44 kB/s
[03:22:14] - Averaged speed for that direction ~74 kB/s
[03:22:14] + Received work.
[03:22:14] Trying to send all finished work units
[03:22:14] + No unsent completed units remaining.
[03:22:14] + Closed connections
[03:22:19]
[03:22:19] + Processing work unit
[03:22:19] Core required: FahCore_11.exe
[03:22:19] Core found.
[03:22:19] Working on queue slot 09 [January 20 03:22:19 UTC]
[03:22:19] + Working ...
[03:22:19] - Calling '.\FahCore_11.exe -dir work/ -suffix 09 -checkpoint 30 -verbose -lifeline 3408 -version 623'

[03:22:19]
[03:22:19] *------------------------------*
[03:22:19] Folding@Home GPU Core - Beta
[03:22:19] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[03:22:19]
[03:22:19] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[03:22:19] Build host: amoeba
[03:22:19] Board Type: Nvidia
[03:22:19] Core      :
[03:22:19] Preparing to commence simulation
[03:22:19] - Looking at optimizations...
[03:22:19] - Created dyn
[03:22:19] - Files status OK
[03:22:19] - Expanded 45368 -> 251112 (decompressed 553.5 percent)
[03:22:19] Called DecompressByteArray: compressed_data_size=45368 data_size=251112, decompressed_data_size=251112 diff=0
[03:22:19] - Digital signature verified
[03:22:19]
[03:22:19] Project: 5771 (Run 12, Clone 301, Gen 10)
[03:22:19]
[03:22:19] Assembly optimizations on if available.
[03:22:19] Entering M.D.
[03:22:28] Working on Protein
[03:22:28] mdrun_gpu returned
[03:22:28] Self-test failure
[03:22:28]
[03:22:28] Folding@home Core Shutdown: UNSTABLE_MACHINE
[03:22:33] CoreStatus = 7A (122)
[03:22:33] Sending work to server
[03:22:33] Project: 5771 (Run 12, Clone 301, Gen 10)
[03:22:33] - Read packet limit of 540015616... Set to 524286976.
[03:22:33] - Error: Could not get length of results file work/wuresults_09.dat
[03:22:33] - Error: Could not read unit 09 file. Removing from queue.
[03:22:33] EUE limit exceeded. Pausing 24 hours.
Leonardo
 
Posts: 597
Joined: Tue Dec 04, 2007 5:09 am
Location: Eagle River, Alaska

Re: [Please read] NaNs detected on GPU - UNSTABLE_MACHINE error

Postby khurios2000 » Tue Jan 20, 2009 11:19 am

Code:
[09:36:39] Preparing to commence simulation
[09:36:39] - Looking at optimizations...
[09:36:39] - Created dyn
[09:36:39] - Files status OK
[09:36:39] - Expanded 46736 -> 252912 (decompressed 541.1 percent)
[09:36:39] Called DecompressByteArray: compressed_data_size=46736 data_size=252912, decompressed_data_size=252912 diff=0
[09:36:39] - Digital signature verified
[09:36:39]
[09:36:39] Project: 5765 (Run 9, Clone 395, Gen 5)
[09:36:39]
[09:36:39] Assembly optimizations on if available.
[09:36:39] Entering M.D.
[09:36:49] Working on Protein
[09:36:50] Client config found, loading data.
[09:36:50] Starting GUI Server
[09:36:50] mdrun_gpu returned
[09:36:50] NANs detected on GPU
[09:36:50]
[09:36:50] Folding@home Core Shutdown: UNSTABLE_MACHINE
[09:36:54] CoreStatus = 7A (122)
[09:36:56] Sending work to server
[09:36:56] Project: 5765 (Run 9, Clone 395, Gen 5)
[09:36:56] - Error: Could not get length of results file work/wuresults_03.dat
[09:36:56] - Error: Could not read unit 03 file. Removing from queue.
[09:36:56] Trying to send all finished work units
[09:36:56] + No unsent completed units remaining.
[09:36:56] - Preparing to get new work unit...
[09:36:56] + Attempting to get work packet
[09:36:56] - Will indicate memory of 2046 MB
[09:36:56] - Connecting to assignment server
[09:36:56] Connecting to http://assign-GPU.stanford.edu:8080/
[09:36:56] Posted data.
[09:36:56] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[09:36:56] + News From Folding@Home: GPU folding beta
[09:36:56] Loaded queue successfully.
[09:36:56] Connecting to http://171.67.108.11:8080/
[09:36:57] Posted data.
[09:36:57] Initial: 0000; - Receiving payload (expected size: 47248)
[09:36:57] Conversation time very short, giving reduced weight in bandwidth avg
[09:36:57] - Downloaded at ~92 kB/s
[09:36:57] - Averaged speed for that direction ~64 kB/s
[09:36:57] + Received work.
[09:36:57] Trying to send all finished work units
[09:36:57] + No unsent completed units remaining.
[09:36:57] + Closed connections
[09:37:02]
[09:37:02] + Processing work unit
[09:37:02] Core required: FahCore_11.exe
[09:37:02] Core found.
[09:37:02] Working on queue slot 04 [January 20 09:37:02 UTC]
[09:37:02] + Working ...
[09:37:02] - Calling '.\FahCore_11.exe -dir work/ -suffix 04 -priority 96 -nocpulock -checkpoint 3 -verbose -lifeline 3476 -version 623'

[09:37:03]
[09:37:03] *------------------------------*
[09:37:03] Folding@Home GPU Core - Beta
[09:37:03] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[09:37:03]
[09:37:03] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[09:37:03] Build host: amoeba
[09:37:03] Board Type: Nvidia
[09:37:03] Core      :
[09:37:03] Preparing to commence simulation
[09:37:03] - Looking at optimizations...
[09:37:03] - Created dyn
[09:37:03] - Files status OK
[09:37:03] - Expanded 46736 -> 252912 (decompressed 541.1 percent)
[09:37:03] Called DecompressByteArray: compressed_data_size=46736 data_size=252912, decompressed_data_size=252912 diff=0
[09:37:03] - Digital signature verified
[09:37:03]
[09:37:03] Project: 5765 (Run 9, Clone 395, Gen 5)
[09:37:03]
[09:37:03] Assembly optimizations on if available.
[09:37:03] Entering M.D.
[09:37:12] Working on Protein
[09:37:13] Client config found, loading data.
[09:37:13] Starting GUI Server
[09:37:13] mdrun_gpu returned
[09:37:13] NANs detected on GPU
[09:37:13]
[09:37:13] Folding@home Core Shutdown: UNSTABLE_MACHINE
[09:37:17] CoreStatus = 7A (122)
[09:37:17] Sending work to server
[09:37:17] Project: 5765 (Run 9, Clone 395, Gen 5)
[09:37:17] - Error: Could not get length of results file work/wuresults_04.dat
[09:37:17] - Error: Could not read unit 04 file. Removing from queue.
[09:37:17] EUE limit exceeded. Pausing 24 hours.
[10:05:09] - Autosending finished units... [January 20 10:05:09 UTC]
[10:05:09] Trying to send all finished work units
[10:05:09] + No unsent completed units remaining.
[10:05:09] - Autosend completed
[10:05:09] + Working...


Code:
[04:42:58] - Looking at optimizations...
[04:42:58] - Created dyn
[04:42:58] - Files status OK
[04:42:58] - Expanded 96525 -> 489240 (decompressed 506.8 percent)
[04:42:58] Called DecompressByteArray: compressed_data_size=96525 data_size=489240, decompressed_data_size=489240 diff=0
[04:42:58] - Digital signature verified
[04:42:58]
[04:42:58] Project: 5755 (Run 10, Clone 195, Gen 12)
[04:42:58]
[04:42:58] Assembly optimizations on if available.
[04:42:58] Entering M.D.
[04:43:08] Working on Protein
[04:43:11] Client config found, loading data.
[04:43:11] Starting GUI Server
[04:43:12] mdrun_gpu returned
[04:43:12] SHAKE violations on GPU
[04:43:12]
[04:43:12] Folding@home Core Shutdown: UNSTABLE_MACHINE
[04:43:14] CoreStatus = 7A (122)
[04:43:14] Sending work to server
[04:43:14] Project: 5755 (Run 10, Clone 195, Gen 12)
[04:43:14] - Error: Could not get length of results file work/wuresults_06.dat
[04:43:14] - Error: Could not read unit 06 file. Removing from queue.
[04:43:14] Trying to send all finished work units
[04:43:14] + No unsent completed units remaining.
[04:43:14] - Preparing to get new work unit...
[04:43:14] + Attempting to get work packet
[04:43:14] - Will indicate memory of 2046 MB
[04:43:14] - Connecting to assignment server
[04:43:14] Connecting to http://assign-GPU.stanford.edu:8080/
[04:43:15] Posted data.
[04:43:15] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[04:43:15] + News From Folding@Home: GPU folding beta
[04:43:15] Loaded queue successfully.
[04:43:15] Connecting to http://171.67.108.11:8080/
[04:43:16] Posted data.
[04:43:16] Initial: 0000; - Receiving payload (expected size: 97037)
[04:43:18] - Downloaded at ~47 kB/s
[04:43:18] - Averaged speed for that direction ~39 kB/s
[04:43:18] + Received work.
[04:43:18] Trying to send all finished work units
[04:43:18] + No unsent completed units remaining.
[04:43:18] + Closed connections
[04:43:23]
[04:43:23] + Processing work unit
[04:43:23] Core required: FahCore_11.exe
[04:43:23] Core found.
[04:43:23] Working on queue slot 07 [January 20 04:43:23 UTC]
[04:43:23] + Working ...
[04:43:23] - Calling '.\FahCore_11.exe -dir work/ -suffix 07 -priority 96 -nocpulock -checkpoint 3 -verbose -lifeline 2220 -version 623'

[04:43:23]
[04:43:23] *------------------------------*
[04:43:23] Folding@Home GPU Core - Beta
[04:43:23] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[04:43:23]
[04:43:23] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[04:43:23] Build host: amoeba
[04:43:23] Board Type: Nvidia
[04:43:23] Core      :
[04:43:23] Preparing to commence simulation
[04:43:23] - Looking at optimizations...
[04:43:23] - Created dyn
[04:43:23] - Files status OK
[04:43:23] - Expanded 96525 -> 489240 (decompressed 506.8 percent)
[04:43:23] Called DecompressByteArray: compressed_data_size=96525 data_size=489240, decompressed_data_size=489240 diff=0
[04:43:23] - Digital signature verified
[04:43:23]


* Failing projects
o 353-point WU: Project: 5765 (Run 9, Clone 395, Gen 5)
o 511-point WU: Project: 5755 (Run 10, Clone 195, Gen 12)
* Failing hardware
o 8800GT 512MB
* Failing OS
o Windows Vista 64-bit
* Failing drivers
o 181.20
khurios2000
 
Posts: 5
Joined: Sun Jun 22, 2008 5:18 pm

Re: [Please read] NaNs detected on GPU - UNSTABLE_MACHINE error

Postby heimie » Tue Jan 20, 2009 11:47 am

*Update*

Going through my F@HMon benchmarks (as most of my previous logs are gone from trying to get my 9800GTX+ running consistently), it seems the 9800GTX+ cards are the ones mainly having issues.

For 5755, 5765, 5766 and 5771 I have no successful results on the 9800GTX+ but several on the 8800GTs. I have noticed that the 8800GTs will get some EUEs on these WUs, but usually move on and eventually finish them. I also noted that just deleting the queue.dat and unitinfo files allows the client to draw another WU and carry on. My 9800GTX+ cards have never completed any of the WUs listed above. My 9800GT had the same issue with the same WUs, looking at that log file. Could this be mainly on 9XXX cards?
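
For anyone who wants to script the queue.dat workaround mentioned above, here is a minimal Python sketch. It moves the files into a backup folder instead of deleting them, so they can be restored. The file name unitinfo.txt, the backup folder name eue_backup and the function name are assumptions made for this example, not something confirmed in this thread. Stop the client before running it.

```python
from pathlib import Path


def set_aside_queue_files(client_dir, names=("queue.dat", "unitinfo.txt")):
    """Move the queue/unit info files into <client_dir>/eue_backup.

    The default file names are assumptions; pass your own if they differ.
    Returns the list of file names that were actually moved.
    """
    client_dir = Path(client_dir)
    backup = client_dir / "eue_backup"  # hypothetical backup folder name
    backup.mkdir(exist_ok=True)

    moved = []
    for name in names:
        src = client_dir / name
        if src.exists():
            # Path.replace overwrites an older backup of the same name.
            src.replace(backup / name)
            moved.append(name)
    return moved
```

Calling set_aside_queue_files with the client folder as its argument returns the names it moved; an empty list means neither file was found there.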
heimie
 
Posts: 79
Joined: Sat Jun 14, 2008 10:17 am
Location: Lockport, Louisiana

Re: [Please read] NaNs detected on GPU - UNSTABLE_MACHINE error

Postby dark55 » Tue Jan 20, 2009 4:06 pm

EDIT: My bad, toTOW :oops:
Last edited by dark55 on Wed Jan 21, 2009 11:59 pm, edited 1 time in total.
dark55
 
Posts: 6
Joined: Wed Jun 04, 2008 12:53 am

Re: [Please read] NaNs detected on GPU - UNSTABLE_MACHINE error

Postby toTOW » Tue Jan 20, 2009 9:09 pm

dark55, this is a known bad WU: viewtopic.php?f=19&t=7657 (I've seen another report in this thread ...)

If it is the only WU failing on your board, you don't have to worry about it.

I updated the first post with the following paragraph to describe this situation:

Before posting any report, and if you're only seeing an issue with an individual WU (it can fail up to 6 times in a row before moving to another one), please check whether it has already been reported as a bad WU in this forum: viewforum.php?f=19 ... If you're having issues with multiple WUs (different Project/Run/Clone/Gen numbers), please do what is described below.
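
The single-WU versus multiple-WU check can be sketched in Python as follows. This is only an illustration: it assumes the log line format seen in the logs posted in this thread, and the function names are made up for the example.

```python
import re
from collections import Counter

# Matches lines such as "Project: 5766 (Run 12, Clone 302, Gen 3)".
WU_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")


def failing_wus(log_text):
    """Count UNSTABLE_MACHINE shutdowns per (Project, Run, Clone, Gen)."""
    counts = Counter()
    current = None  # the WU most recently named in the log
    for line in log_text.splitlines():
        match = WU_RE.search(line)
        if match:
            current = tuple(int(x) for x in match.groups())
        elif "Core Shutdown: UNSTABLE_MACHINE" in line and current is not None:
            counts[current] += 1
    return counts


def classify(counts):
    """Return "none", "single" or "multiple" for the distinct failing WUs."""
    if not counts:
        return "none"
    return "single" if len(counts) == 1 else "multiple"
```

A result of "single" suggests checking the bad WU forum first; "multiple" suggests posting a full report. To try it, read your FAHlog.txt into a string and pass it to failing_wus.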
Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.

FAH-Addict : latest news, tests and reviews about Folding@Home project.

toTOW
Site Moderator
 
Posts: 8454
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: [Please read] NaNs detected on GPU - UNSTABLE_MACHINE error

Postby bruce » Tue Jan 20, 2009 10:38 pm

extrasalty wrote: Since all the results are reported to the central server, isn't it possible for PG to monitor those NANs automatically? In either case we probably need a tool to collect the needed data automatically.


Most failures are reported to the server, but I've seen a small percentage that don't get reported (although they're probably not this particular problem). The error reports that toTOW is asking for include some important information about your installation that's probably not being reported to the server. If that's the critical information that allows the bug to be isolated/identified/fixed, then the manual reports are very important.
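
As a rough idea of what such a collection tool could look like, here is a small Python sketch that fills in the report template from a client log. It assumes the log format and the three error messages seen in the logs posted in this thread. The hardware, OS and driver do not appear in those log excerpts, so they are passed in by hand. The function name and output layout are made up for this example.

```python
import re

# Matches lines such as "Project: 5766 (Run 12, Clone 302, Gen 3)".
WU_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")

# Error messages that appear in the logs posted in this thread.
REASONS = (
    "NANs detected on GPU",
    "Self-test failure",
    "SHAKE violations on GPU",
)


def build_report(log_text, hardware, os_name, driver):
    """Build a plain-text failure report from a client log."""
    failures = []  # unique "WU: reason" entries, in log order
    current = None
    for line in log_text.splitlines():
        match = WU_RE.search(line)
        if match:
            current = match.group(0)
            continue
        for reason in REASONS:
            if reason in line and current is not None:
                entry = f"{current}: {reason}"
                if entry not in failures:
                    failures.append(entry)

    lines = ["* Failing projects"]
    lines += [f"  o {entry}" for entry in failures]
    lines += [
        "* Failing hardware", f"  o {hardware}",
        "* Failing OS", f"  o {os_name}",
        "* Failing drivers", f"  o {driver}",
    ]
    return "\n".join(lines)
```

The relevant log excerpt would still need to be pasted under the report by hand.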
bruce
 
Posts: 21534
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: [Please read] NaNs detected on GPU - UNSTABLE_MACHINE error

Postby khurios2000 » Tue Jan 20, 2009 11:20 pm

* Failing projects
o 353-point WUs: Project: 5766 (Run 12, Clone 416, Gen 0); Project: 5765 (Run 2, Clone 425, Gen 0); Project: 5765 (Run 13, Clone 406, Gen 9)
* Failing hardware
o 8800 GS 384MB
* Failing OS
o Windows Vista 64-bit
* Failing drivers
o 185.20
* Comments (add below any detail you might find useful to the report)
Code:
-----------------------------*
[20:34:23] Folding@Home GPU Core - Beta
[20:34:23] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[20:34:23]
[20:34:23] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[20:34:23] Build host: amoeba
[20:34:23] Board Type: Nvidia
[20:34:23] Core      :
[20:34:23] Preparing to commence simulation
[20:34:23] - Looking at optimizations...
[20:34:23] - Created dyn
[20:34:23] - Files status OK
[20:34:23] - Expanded 44124 -> 252912 (decompressed 573.1 percent)
[20:34:23] Called DecompressByteArray: compressed_data_size=44124 data_size=252912, decompressed_data_size=252912 diff=0
[20:34:23] - Digital signature verified
[20:34:23]
[20:34:23] Project: 5766 (Run 12, Clone 416, Gen 0)
[20:34:23]
[20:34:23] Assembly optimizations on if available.
[20:34:23] Entering M.D.
[20:34:32] Working on Protein
[20:34:33] Client config found, loading data.
[20:34:33] mdrun_gpu returned
[20:34:33] NANs detected on GPU
[20:34:33]
[20:34:33] Folding@home Core Shutdown: UNSTABLE_MACHINE
[20:34:38] CoreStatus = 7A (122)
[20:34:38] Sending work to server
[20:34:38] Project: 5766 (Run 12, Clone 416, Gen 0)
[20:34:38] - Error: Could not get length of results file work/wuresults_06.dat
[20:34:38] - Error: Could not read unit 06 file. Removing from queue.
[20:34:38] Trying to send all finished work units
[20:34:38] + No unsent completed units remaining.
[20:34:38] - Preparing to get new work unit...
[20:34:38] + Attempting to get work packet
[20:34:38] - Will indicate memory of 2046 MB
[20:34:38] - Connecting to assignment server
[20:34:38] Connecting to http://assign-GPU.stanford.edu:8080/
[20:34:38] Posted data.
[20:34:38] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[20:34:38] + News From Folding@Home: GPU folding beta
[20:34:39] Loaded queue successfully.
[20:34:39] Connecting to http://171.67.108.11:8080/
[20:34:39] Posted data.
[20:34:39] Initial: 0000; - Receiving payload (expected size: 44636)
[20:34:40] - Downloaded at ~43 kB/s
[20:34:40] - Averaged speed for that direction ~52 kB/s
[20:34:40] + Received work.
[20:34:40] Trying to send all finished work units
[20:34:40] + No unsent completed units remaining.
[20:34:40] + Closed connections
[20:34:45]
[20:34:45] + Processing work unit
[20:34:45] Core required: FahCore_11.exe
[20:34:45] Core found.
[20:34:45] Working on queue slot 07 [January 20 20:34:45 UTC]
[20:34:45] + Working ...
[20:34:45] - Calling '.\FahCore_11.exe -dir work/ -suffix 07 -priority 96 -nocpulock -checkpoint 3 -verbose -lifeline 2856 -version 623'

[20:34:45]
[20:34:45] *------------------------------*
[20:34:45] Folding@Home GPU Core - Beta
[20:34:45] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[20:34:45]
[20:34:45] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[20:34:45] Build host: amoeba
[20:34:45] Board Type: Nvidia
[20:34:45] Core      :
[20:34:45] Preparing to commence simulation
[20:34:45] - Looking at optimizations...
[20:34:45] - Created dyn
[20:34:45] - Files status OK
[20:34:45] - Expanded 44124 -> 252912 (decompressed 573.1 percent)
[20:34:45] Called DecompressByteArray: compressed_data_size=44124 data_size=252912, decompressed_data_size=252912 diff=0
[20:34:45] - Digital signature verified
[20:34:45]
[20:34:45] Project: 5766 (Run 12, Clone 416, Gen 0)
[20:34:45]
[20:34:45] Assembly optimizations on if available.
[20:34:45] Entering M.D.
[20:34:54] Working on Protein
[20:34:55] Client config found, loading data.
[20:34:55] Starting GUI Server
[20:34:55] mdrun_gpu returned
[20:34:55] NANs detected on GPU
[20:34:55]
[20:34:55] Folding@home Core Shutdown: UNSTABLE_MACHINE
[20:34:59] CoreStatus = 7A (122)
[20:34:59] Sending work to server
[20:34:59] Project: 5766 (Run 12, Clone 416, Gen 0)
[20:34:59] - Error: Could not get length of results file work/wuresults_07.dat
[20:34:59] - Error: Could not read unit 07 file. Removing from queue.
[20:34:59] EUE limit exceeded. Pausing 24 hours.


Code:
+ Optimizing Compiler Version 14.00.50727.762 for 80x86
[03:03:27] Build host: amoeba
[03:03:27] Board Type: Nvidia
[03:03:27] Core      :
[03:03:27] Preparing to commence simulation
[03:03:27] - Looking at optimizations...
[03:03:27] - Created dyn
[03:03:27] - Files status OK
[03:03:27] - Expanded 43951 -> 252912 (decompressed 575.4 percent)
[03:03:27] Called DecompressByteArray: compressed_data_size=43951 data_size=252912, decompressed_data_size=252912 diff=0
[03:03:27] - Digital signature verified
[03:03:27]
[03:03:27] Project: 5765 (Run 2, Clone 425, Gen 0)
[03:03:27]
[03:03:27] Assembly optimizations on if available.
[03:03:27] Entering M.D.
[03:03:35] Working on Protein
[03:03:36] Client config found, loading data.
[03:03:36] mdrun_gpu returned
[03:03:36] NANs detected on GPU
[03:03:36]
[03:03:36] Folding@home Core Shutdown: UNSTABLE_MACHINE
[03:03:40] CoreStatus = 7A (122)
[03:03:40] Sending work to server
[03:03:40] Project: 5765 (Run 2, Clone 425, Gen 0)
[03:03:40] - Error: Could not get length of results file work/wuresults_06.dat
[03:03:40] - Error: Could not read unit 06 file. Removing from queue.
[03:03:40] Trying to send all finished work units
[03:03:40] + No unsent completed units remaining.
[03:03:40] - Preparing to get new work unit...
[03:03:40] + Attempting to get work packet
[03:03:40] - Will indicate memory of 2046 MB
[03:03:40] - Connecting to assignment server
[03:03:40] Connecting to http://assign-GPU.stanford.edu:8080/
[03:03:42] Posted data.
[03:03:42] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[03:03:42] + News From Folding@Home: GPU folding beta
[03:03:42] Loaded queue successfully.
[03:03:42] Connecting to http://171.67.108.11:8080/
[03:03:43] Posted data.
[03:03:43] Initial: 0000; - Error: Bad packet type from server, expected work assignment
[03:03:43] - Attempt #1  to get work failed, and no other work to do.
Waiting before retry.
[03:04:00] + Attempting to get work packet
[03:04:00] - Will indicate memory of 2046 MB
[03:04:00] - Connecting to assignment server
[03:04:00] Connecting to http://assign-GPU.stanford.edu:8080/
[03:04:02] Posted data.
[03:04:02] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[03:04:02] + News From Folding@Home: GPU folding beta
[03:04:02] Loaded queue successfully.
[03:04:02] Connecting to http://171.67.108.11:8080/
[03:04:04] Posted data.
[03:04:04] Initial: 0000; - Receiving payload (expected size: 45893)
[03:04:07] - Downloaded at ~14 kB/s
[03:04:07] - Averaged speed for that direction ~36 kB/s
[03:04:07] + Received work.
[03:04:08] Trying to send all finished work units
[03:04:08] + No unsent completed units remaining.
[03:04:08] + Closed connections
[03:04:13]
[03:04:13] + Processing work unit
[03:04:13] Core required: FahCore_11.exe
[03:04:13] Core found.
[03:04:13] Working on queue slot 07 [January 21 03:04:13 UTC]
[03:04:13] + Working ...
[03:04:13] - Calling '.\FahCore_11.exe -dir work/ -suffix 07 -priority 96 -nocpulock -checkpoint 3 -verbose -lifeline 3432 -version 623'


Code:
d host: amoeba
[06:24:38] Board Type: Nvidia
[06:24:38] Core      :
[06:24:38] Preparing to commence simulation
[06:24:38] - Looking at optimizations...
[06:24:38] - Created dyn
[06:24:38] - Files status OK
[06:24:38] - Expanded 46664 -> 252912 (decompressed 541.9 percent)
[06:24:38] Called DecompressByteArray: compressed_data_size=46664 data_size=252912, decompressed_data_size=252912 diff=0
[06:24:38] - Digital signature verified
[06:24:38]
[06:24:38] Project: 5765 (Run 13, Clone 406, Gen 9)
[06:24:38]
[06:24:38] Assembly optimizations on if available.
[06:24:38] Entering M.D.
[06:24:47] Working on Protein
[06:24:48] Client config found, loading data.
[06:24:48] Starting GUI Server
[06:24:48] mdrun_gpu returned
[06:24:48] NANs detected on GPU
[06:24:48]
[06:24:48] Folding@home Core Shutdown: UNSTABLE_MACHINE
[06:24:53] CoreStatus = 7A (122)
[06:24:53] Sending work to server
[06:24:53] Project: 5765 (Run 13, Clone 406, Gen 9)
[06:24:53] - Error: Could not get length of results file work/wuresults_06.dat
[06:24:53] - Error: Could not read unit 06 file. Removing from queue.
[06:24:53] Trying to send all finished work units
[06:24:53] + No unsent completed units remaining.
[06:24:53] - Preparing to get new work unit...
[06:24:53] + Attempting to get work packet
[06:24:53] - Will indicate memory of 2046 MB
[06:24:53] - Connecting to assignment server
[06:24:53] Connecting to http://assign-GPU.stanford.edu:8080/
[06:24:54] Posted data.
[06:24:54] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[06:24:54] + News From Folding@Home: GPU folding beta
[06:24:54] Loaded queue successfully.
[06:24:54] Connecting to http://171.67.108.11:8080/
[06:24:55] Posted data.
[06:24:55] Initial: 0000; - Receiving payload (expected size: 47176)
[06:24:56] - Downloaded at ~46 kB/s
[06:24:56] - Averaged speed for that direction ~44 kB/s
[06:24:56] + Received work.
[06:24:56] Trying to send all finished work units
[06:24:56] + No unsent completed units remaining.
[06:24:56] + Closed connections
[06:25:01]
[06:25:01] + Processing work unit
[06:25:01] Core required: FahCore_11.exe
[06:25:01] Core found.
[06:25:01] Working on queue slot 07 [January 21 06:25:01 UTC]
[06:25:01] + Working ...
[06:25:01] - Calling '.\FahCore_11.exe -dir work/ -suffix 07 -priority 96 -nocpulock -checkpoint 3 -verbose -lifeline 2844 -version 623'

[06:25:01]
[06:25:01] *------------------------------*
[06:25:01] Folding@Home GPU Core - Beta
[06:25:01] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[06:25:01]
[06:25:01] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[06:25:01] Build host: amoeba
[06:25:01] Board Type: Nvidia
[06:25:01] Core      :
[06:25:01] Preparing to commence simulation
[06:25:01] - Looking at optimizations...
[06:25:01] - Created dyn
[06:25:01] - Files status OK
[06:25:01] - Expanded 46664 -> 252912 (decompressed 541.9 percent)
[06:25:01] Called DecompressByteArray: compressed_data_size=46664 data_size=252912, decompressed_data_size=252912 diff=0
[06:25:01] - Digital signature verified
[06:25:01]
[06:25:01] Project: 5765 (Run 13, Clone 406, Gen 9)
[06:25:01]
[06:25:01] Assembly optimizations on if available.
[06:25:01] Entering M.D.
[06:25:10] Working on Protein
[06:25:11] Client config found, loading data.
[06:25:11] Starting GUI Server
[06:25:11] mdrun_gpu returned
[06:25:11] NANs detected on GPU
[06:25:11]
[06:25:11] Folding@home Core Shutdown: UNSTABLE_MACHINE
[06:25:16] CoreStatus = 7A (122)
[06:25:16] Sending work to server
[06:25:16] Project: 5765 (Run 13, Clone 406, Gen 9)
[06:25:16] - Error: Could not get length of results file work/wuresults_07.dat
[06:25:16] - Error: Could not read unit 07 file. Removing from queue.
[06:25:16] EUE limit exceeded. Pausing 24 hours.
[11:25:25] - Autosending finished units... [January 21 11:25:25 UTC]
[11:25:25] Trying to send all finished work units
[11:25:25] + No unsent completed units remaining.
[11:25:25] - Autosend completed
[11:25:25] + Working...
khurios2000
 
Posts: 5
Joined: Sun Jun 22, 2008 5:18 pm
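
For anyone gathering failing WUs from long logs like the ones above, here is a minimal sketch that pulls out the project that was running each time "NANs detected on GPU" appears. It assumes only the two line formats visible in these logs. The function name `failed_work_units` is hypothetical and not part of any Folding@home tool.

```python
import re

# Matches lines such as "[03:03:27] Project: 5765 (Run 2, Clone 425, Gen 0)".
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
NAN_MARKER = "NANs detected on GPU"


def failed_work_units(log_lines):
    """Return (project, run, clone, gen) for each NaN failure, in log order.

    Hypothetical helper: it remembers the most recent Project line and
    records it whenever the NaN marker shows up afterwards.
    """
    failures = []
    current = None
    for line in log_lines:
        match = PROJECT_RE.search(line)
        if match:
            current = tuple(int(group) for group in match.groups())
        elif NAN_MARKER in line and current is not None:
            failures.append(current)
    return failures


sample = [
    "[03:03:27] Project: 5765 (Run 2, Clone 425, Gen 0)",
    "[03:03:36] NANs detected on GPU",
    "[03:03:40] Project: 5765 (Run 2, Clone 425, Gen 0)",
    "[06:24:38] Project: 5765 (Run 13, Clone 406, Gen 9)",
    "[06:24:48] NANs detected on GPU",
]
print(failed_work_units(sample))
```

To run it on a real log, pass the lines of the file instead of `sample`, for example `failed_work_units(open(path))` with `path` pointing at your own log file.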

Re: [Please read] NaNs detected on GPU - UNSTABLE_MACHINE error

Postby shdbcamping » Wed Jan 21, 2009 2:51 pm

All,
I do not know whether this applies to everyone, but I had folding stability problems with almost all recent NVIDIA drivers after a clean install. They seem to enable PhysX automatically. In my experience this causes folding instability, especially on systems running multiple GPU2 console clients. Much of my erratic instability went away after I disabled PhysX while folding. I only fold with console clients, so I cannot speak for any other client. Just something to try.

There has also been debate about whether doing this gains any speed. I don't run F@H monitors, but my gut feeling is that folding performance also improves after disabling PhysX.

Please post back if you try this and it gives you any EUE/stability relief.
shdbcamping
 
Posts: 519
Joined: Mon Nov 10, 2008 7:57 am

Re: [Please read] NaNs detected on GPU - UNSTABLE_MACHINE error

Postby OldChap » Thu Jan 22, 2009 9:43 pm

Gentlemen, my previously solid folding machine has thrown a lot of EUEs on the GPU today. I have installed extra cooling, which has reduced card temps by over 10 °C. Would you like a list of the problem units now, or would you prefer to wait and see whether I get any recurrences? How may I best help?
OldChap
 
Posts: 66
Joined: Thu Jan 01, 2009 10:27 am

Re: [Please read] NaNs detected on GPU - UNSTABLE_MACHINE error

Postby toTOW » Thu Jan 22, 2009 10:34 pm

If you solved your stability issues with additional cooling, you don't have to post here.

But if you keep failing WUs after that, a report will be welcome.
toTOW
Site Moderator
 
Posts: 8454
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: [Please read] NaNs detected on GPU - UNSTABLE_MACHINE error

Postby heimie » Sat Jan 24, 2009 12:00 am

New projects causing EUEs until the 24-hour pause on my system:


Project: 5768 (Run 8, Clone 101, Gen 29)
Project: 5765 (Run 9, Clone 486, Gen 2)
Project: 5766 (Run 11, Clone 373, Gen 32)
Project: 5767 (Run 14, Clone 86, Gen 31) Just now (01/23/09, 8+ hours down)
Project: 5767 (Run 10, Clone 199, Gen 20) Right after cleaning out my folder (01/23/09, 10 minutes down)
Project: 5768 (Run 12, Clone 9, Gen 48) (01/24/09, 7+ hours down)


Vista 64 bit
E8400 (Stock)
2x EVGA 9800GTX+ (Stock), 180.48 drivers, PhysX disabled, SLI disabled.
EVGA 680i
Last edited by heimie on Sat Jan 24, 2009 4:32 pm, edited 3 times in total.
heimie
 
Posts: 79
Joined: Sat Jun 14, 2008 10:17 am
Location: Lockport, Louisiana

Re: [Please read] NaNs detected on GPU - UNSTABLE_MACHINE error

Postby skinnykid63 » Sat Jan 24, 2009 12:34 am

# Failing projects (please add a list of exact project numbers if you have them)
* 353-point WUs (project range: 5765-5772)

[23:58:28] Project: 5768 (Run 13, Clone 144, Gen 39)
[23:58:38] Project: 5768 (Run 13, Clone 144, Gen 39)
[23:58:59] Project: 5765 (Run 0, Clone 110, Gen 23)
[23:59:09] Project: 5765 (Run 0, Clone 110, Gen 23)
[23:59:22] Project: 5765 (Run 0, Clone 110, Gen 23)
[23:59:33] Project: 5765 (Run 0, Clone 110, Gen 23)
[23:59:43] Project: 5765 (Run 0, Clone 110, Gen 23)
[23:59:53] Project: 5765 (Run 0, Clone 110, Gen 23)
[23:59:59] Project: 5767 (Run 8, Clone 41, Gen 38)
[00:00:16] Project: 5767 (Run 8, Clone 41, Gen 38)
[00:00:33] Project: 5767 (Run 8, Clone 41, Gen 38)

# Failing hardware (please add the exact GPU designation if you know it. ie 9800GTX+)

Gigabyte 8800GT running with -gpu 0

Less often I see EUEs on a second 8800GT running as -gpu 1

# Failing OS

Windows 7 64 bit

# Failing drivers (enter here the version number of the driver you use)

Windows 7 driver from Nvidia, appears to be 179.23 according to dxdiag

...
# Comments (add below any detail you might find useful to the report)

GPU temps are well within spec at ~60C with room temp of ~19C-20C
This seems to affect my primary GPU more than my secondary. They're identical.
skinnykid63
 
Posts: 23
Joined: Mon Jun 23, 2008 2:11 pm
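
Lists like the one in this report repeat the same WU many times. As a rough sketch, the repeats can be collapsed into one count per WU with `collections.Counter`. The helper name `count_work_units` is hypothetical. Note that it counts how often a Project line appears, which is not necessarily the number of failures, because a full client log prints the Project line more than once per attempt.

```python
import re
from collections import Counter

# Matches lines such as "[23:58:28] Project: 5768 (Run 13, Clone 144, Gen 39)".
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")


def count_work_units(lines):
    """Count appearances of each (project, run, clone, gen) tuple.

    Hypothetical helper; lines without a Project entry are ignored.
    """
    counts = Counter()
    for line in lines:
        match = PROJECT_RE.search(line)
        if match:
            counts[tuple(int(group) for group in match.groups())] += 1
    return counts


report = [
    "[23:58:28] Project: 5768 (Run 13, Clone 144, Gen 39)",
    "[23:58:38] Project: 5768 (Run 13, Clone 144, Gen 39)",
    "[23:58:59] Project: 5765 (Run 0, Clone 110, Gen 23)",
    "[23:59:09] Project: 5765 (Run 0, Clone 110, Gen 23)",
    "[23:59:59] Project: 5767 (Run 8, Clone 41, Gen 38)",
]
for unit, seen in count_work_units(report).items():
    print(unit, seen)
```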

Re: [Please read] NaNs detected on GPU - UNSTABLE_MACHINE error

Postby two00lbwaster » Sat Jan 24, 2009 11:25 am

Hmm, I have had better results on my first GPU when I don't put that GPU's client in the Startup folder; it was the only GPU causing me issues.
Disabling PhysX also helped.
In addition, if I notice that I am starting to get NaNs, I now restart the client while a WU is executing successfully, before the failures can trigger the 24-hour pause.
two00lbwaster
 
Posts: 51
Joined: Sat May 24, 2008 9:48 pm
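
The "notice the NaNs early" step could be sketched as a check on the log: count how many UNSTABLE_MACHINE shutdowns have occurred in a row since the last shutdown with any other status. This is only an illustration. The function names and the `threshold` value are hypothetical, the real EUE limit of the client is not stated in this thread, and the `FINISHED_UNIT` status in the sample is an assumed example of a non-failing shutdown line.

```python
SHUTDOWN_PREFIX = "Folding@home Core Shutdown: "


def trailing_unstable_count(log_lines):
    """Count consecutive UNSTABLE_MACHINE shutdowns at the end of the log.

    Any Core Shutdown line with a different status resets the count.
    """
    count = 0
    for line in log_lines:
        index = line.find(SHUTDOWN_PREFIX)
        if index == -1:
            continue  # not a shutdown line
        status = line[index + len(SHUTDOWN_PREFIX):].strip()
        if status == "UNSTABLE_MACHINE":
            count += 1
        else:
            count = 0
    return count


def should_restart(log_lines, threshold=2):
    """Hypothetical rule: suggest a restart after `threshold` failures in a row."""
    return trailing_unstable_count(log_lines) >= threshold


log = [
    "[01:00:00] Folding@home Core Shutdown: FINISHED_UNIT",  # assumed status text
    "[03:03:36] Folding@home Core Shutdown: UNSTABLE_MACHINE",
    "[06:24:48] Folding@home Core Shutdown: UNSTABLE_MACHINE",
]
print(trailing_unstable_count(log), should_restart(log))
```

The check only reports when a restart might be worth doing; restarting the client itself is still a manual step, as described in the post.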
