2 GTX295's installed, UNSTABLE_MACHINE issues

Re: 2 GTX295's installed, UNSTABLE_MACHINE issues

Post by shdbcamping » Tue Feb 10, 2009 12:41 pm

jaak ennuste wrote:
p2501 wrote:Get that Enermax 85+ 1250 PSU, I doubt that you'll need a second PSU with that. That is.. what Phenom are you going to run with it exactly?


When I started planning my 4 GPU rig, I took the official data: 289 watts for each GTX 295 card and 300 watts for the system board, so 4 x 289 + 300 = 1456 watts. As I couldn't source a PSU larger than 1200 W (FSP lifestyle), I went with 2 PSUs.

In real life I now run a 3 x GTX 295 system at 850 watts, and I think it is possible to add 1 more GTX 295 and run it from a 1200 W PSU. Most probably the final load would be somewhere around 1050-1100 watts. But my 2 PSU case was already ordered...

Jaak
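
The budgeting arithmetic in the quoted post can be checked with a few lines of Python. This is only a sketch: the function name is made up for illustration, the 289 W and 300 W figures are the ones quoted above, and the 1200 W ceiling is the largest single PSU the poster says he could source.

```python
def system_power_budget(card_count, watts_per_card=289, board_watts=300):
    """Worst-case draw in watts: every card at its rated figure plus the board."""
    return card_count * watts_per_card + board_watts

planned = system_power_budget(4)   # the four-card plan from the post
largest_psu = 1200                 # largest single PSU mentioned in the post

verdict = "fits one PSU" if planned <= largest_psu else "needs a second PSU"
print(f"{planned} W planned: {verdict}")
```

On paper the four-card build exceeds a single 1200 W unit, which matches the decision to go with two PSUs. The measured 850 W for three cards suggests the rated figures are conservative.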

There are always AUX VGA PSUs. They mount in the 5 1/4 inch front bays. Thermaltake has 450W and 650W models. The 650W has 2 6-pin and 2 8-pin PCIe power outputs. I'm running one to power 2 GX2s in an XPS720. The system PSU runs the 3rd GX2.

Folding whatever I can since Sept '08
shdbcamping
 
Posts: 585
Joined: Mon Nov 10, 2008 8:57 am

Re: 2 GTX295's installed, UNSTABLE_MACHINE issues

Post by theRFMan » Tue Feb 17, 2009 4:06 pm

Hello,

I'm new to the Folding Forums, but not new at all to hardware troubleshooting and driver issues. I also have 2xGTX295 in my system, and I did get them to fold on 4 cores in Vista x64.

Hardware details:

EVGA 780i motherboard
Q9650 @ 4.1GHz on air (OCZ Vendetta 2 cooler)
OCZ EliteXstream 1000W PS
2x ASUS GTX295 (will run up to 680 clock and 1180 mem, but at stock for folding in non-SLI)
1 x Zotac 8800GT 1GB (For PhysX, can't get it to fold at all)

The key to getting this to work is realizing that in Vista every video card must have a monitor connected to it (with the desktop extended to it) in order to be usable for CUDA. In the case of the GTX295, it's really two video cards (because SLI is off), one of which has two DVI ports, the other having only an HDMI port. So in order to get all the GPUs folding, it is essential to have something connected to those HDMI ports. This may be possible using a dummy load; I actually connected my two monitors to the HDMI ports using DVI-HDMI cables that I had hanging around.

With only two HDMI monitors connected, I booted up, and was quite surprised to see that I actually had access to three of the four GPUs. It seems that the DVI side of the first GTX295 is always active whether or not a monitor is connected to it. Good. I added a dummy plug to a DVI port on the second GTX295, and I got all four cores recognized. Yay. I found that a good way to see which GPUs were CUDA-accessible was to install the Badaboom encoder (trial version). In the advanced options, it lists which GPUs it can see, along with each GPU's number.

That being said, I'm experiencing a few issues that some of you have probably also encountered, perhaps there are solutions out there:

1) I cannot get the 8800GT to fold; it always sends back an UNSTABLE_MACHINE just as it starts processing, whether it has a dummy plug or an actual monitor connected to it. This happens regardless of whether the GTX 295s are idle or not. I've read that the G92-class GPU will not co-exist with the GT200-class GPUs. There may not be a way to fix this short of a F@H core update or nvidia driver change. If there is an easy fix, good; otherwise I can accept not using the 8800GT for folding, since the primary use of the system is for gaming, not dedicated folding.

2) With all 4 GTX295 GPUs working away, I notice a drop in GPU processing speed when I start the SMP CPU client on the Q9650. All four GPUs were progressing at 48s per % point consistently, and in some cases I've seen three cores slow to 1min per % and another to over 2min. It seems like the CPU can't feed the GPUs with enough work while working on its own WU at the same time. Has anyone seen something like this happen? What's the solution... increasing the priority of the GPU clients to higher than idle should do the trick, but it's not the ideal way of doing things I guess.
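
To put the slowdown in item 2 into numbers, here is a small Python sketch. The helper names are invented for illustration, and it assumes the pace per percent stays constant over a whole work unit, which real WUs may not do. The 120 s figure is a lower bound, since the post says "over 2min".

```python
def wu_minutes(seconds_per_percent):
    """Time for a full work unit (100 percent) at a steady pace, in minutes."""
    return seconds_per_percent * 100 / 60

def slowdown_percent(baseline_s, observed_s):
    """How much longer each percent takes compared with the baseline."""
    return (observed_s - baseline_s) / baseline_s * 100

baseline = 48  # seconds per percent with the SMP client stopped
for observed in (60, 120):
    print(f"{observed}s per %: {wu_minutes(observed):.0f} min per WU, "
          f"{slowdown_percent(baseline, observed):.0f}% slower than {baseline}s")
```

At 48 s per percent a WU takes 80 minutes, so the GPU that drops to 2 minutes per percent needs at least 200 minutes for the same unit.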

Thanks!

Luc
theRFMan
 
Posts: 3
Joined: Tue Feb 17, 2009 2:51 pm

Re: 2 GTX295's installed, UNSTABLE_MACHINE issues

Post by jaak ennuste » Tue Feb 17, 2009 8:45 pm

Luc the RFman

You may be a beginner, but you have made great progress here, getting all GPUs folding under Vista.

About the degrading performance when SMP is running, please check the Configure -> Advanced -> Core Priority settings in the GPU clients. They have to be "slightly higher".

Jaak
building 32 GPU folding rig: 16 x NVIDIA GeForce GTX 295 cards; dual PSU solution; 4 nodes.
Website: Estonia Donates, ambitious 400 PPD supercomputer project
Sponsored by Dell sülearvutid and SayAgain audio bookmarks
jaak ennuste
 
Posts: 346
Joined: Thu Jan 08, 2009 12:30 pm
Location: Tallinn, Estonia, EU

Re: 2 GTX295's installed, UNSTABLE_MACHINE issues

Post by Rambler358 » Wed Feb 18, 2009 1:17 am

theRFMan wrote:With only two HDMI monitors connected, I booted up, and was quite surprised to see that I actually had access to three of the four GPUs. It seems that the DVI side of the first GTX295 is always active whether or not a monitor is connected to it. Good. I added a dummy plug to a DVI port on the second GTX295, and I got all four cores recognized.

Thanks for the report. Now, if us Vista x64 users who don't have HDMI monitors can get 4 GPU clients running on our 295s then we'll be happy. :wink:
Rambler358
 
Posts: 63
Joined: Fri Sep 19, 2008 5:40 pm

Re: 2 GTX295's installed, UNSTABLE_MACHINE issues

Post by theRFMan » Wed Feb 18, 2009 2:16 am

Well, HDMI monitors aren't the best solution either, because the video card does not send any POST or boot video through HDMI, so it's a blank screen until the Vista logon appears. This is completely unacceptable for overclockers, so I'm going to try with HDMI dummy plugs instead of HDMI monitors.

Otherwise I'm dual-booting XP for folding and Vista64 for Quad-SLI gaming and photo work.

Luc
theRFMan
 
Posts: 3
Joined: Tue Feb 17, 2009 2:51 pm

Re: 2 GTX295's installed, UNSTABLE_MACHINE issues

Post by Governator » Wed Feb 25, 2009 8:55 am

jfarque wrote:I'm glad I checked, but yes I do have SLI turned off in the only way I know how (the nVidia Control Panel). PhysX and SLI are both disabled. I was hoping there was a super-secret way to disable the internal SLI behavior that I was unaware of.

Here's the query from WMI Explorer, at least the first bit of it. I did not see anything very exciting about the output.

Image

Below is the output from GPU-Z. Its output has one interesting feature and that is that the BIOS of the GPUs is different between the odd and even GPUs. The first GPU selectable in the GPU dropdown has BIOS ending in .90, the second GPU .92, the third GPU .90, and the fourth GPU .92. Going to pop a card out and see if one 295 has two different BIOS versions on it or not.

Image

Image

jaf
I'm curious, why are you disabling PhysX?? That will achieve nothing. As far as I know, the internal SLI everyone keeps referring to is the display performance modes under 3D properties:

Image
Governator
 
Posts: 49
Joined: Tue Feb 17, 2009 6:58 pm
Location: Rapid City, SD

Re: 2 GTX295's installed, UNSTABLE_MACHINE issues

Post by Governator » Wed Feb 25, 2009 9:00 am

jaak ennuste wrote:Luc the RFman

You may be a beginner, but you have made great progress here, getting all GPUs folding under Vista.

About the degrading performance when SMP is running, please check the Configure -> Advanced -> Core Priority settings in the GPU clients. They have to be "slightly higher".

Jaak
Actually that doesn't do much, if anything, if your system isn't fully optimized. Rather, you should tweak the affinities for the GPU and CPU clients, assigning them to different cores.
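
For anyone unsure what assigning clients to different cores looks like in practice: Windows expresses process affinity as a bitmask with one bit per logical core, which is what Task Manager's "Set Affinity" dialog edits. The Python sketch below only computes such masks. The layout (one core reserved for feeding the GPU clients, three for the SMP client) is a hypothetical example for a quad-core CPU, not a tested recommendation from this thread.

```python
def affinity_mask(cores):
    """Bitmask with bit i set for each logical core i in `cores`."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return mask

# Hypothetical layout for a quad-core CPU: core 0 feeds the GPU clients,
# cores 1-3 are left to the SMP CPU client.
layout = {
    "gpu_clients": [0],
    "smp_client": [1, 2, 3],
}

for name, cores in layout.items():
    print(f"{name}: cores {cores} -> mask {affinity_mask(cores):#x}")
```

The two masks share no bits, so the clients never compete for the same core. Whether giving up a core of SMP throughput pays off would have to be measured on the actual machine.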
Governator
 
Posts: 49
Joined: Tue Feb 17, 2009 6:58 pm
Location: Rapid City, SD

Re: 2 GTX295's installed, UNSTABLE_MACHINE issues

Post by theRFMan » Wed Feb 25, 2009 2:49 pm

Governator wrote:
Actually that doesn't do much, if anything, if your system isn't fully optimized. Rather, you should tweak the affinities for the GPU and CPU clients, assigning them to different cores.

Well, with the SMP client fully utilizing all four cores, I'm not sure how I can optimize further than ensuring that the GPU clients get priority over the SMP CPU client (since they are more efficient).
theRFMan
 
Posts: 3
Joined: Tue Feb 17, 2009 2:51 pm

Re: 2 GTX295's installed, UNSTABLE_MACHINE issues

Post by Atlas Folder » Wed Feb 25, 2009 4:30 pm

HDMI connections aren't a fix for Vista; they're a workaround. The bug in the Vista driver remains, but I'm glad that some people with GTX295s are able to get their systems working in Vista.

Jason
Atlas Folder
 
Posts: 115
Joined: Sun Feb 01, 2009 11:59 pm
Location: Broken Arrow, Oklahoma

Re: 2 GTX295's installed, UNSTABLE_MACHINE issues

Post by CPUacreage » Thu Feb 26, 2009 5:57 am

I have 2x GTX295's headed my way. I can use XP SP3, XP x64, Vista 32-bit or 64-bit for an operating system. It is primarily going to be a folding machine, no gaming anyway, but light email and the like. So which operating system should I use?

Denny, aka CPUacreage
CPUacreage
 
Posts: 71
Joined: Sun Dec 02, 2007 8:58 pm

Re: 2 GTX295's installed, UNSTABLE_MACHINE issues

Post by Atlas Folder » Thu Feb 26, 2009 6:24 am

Vista x32, Vista x64 and Windows 7 Beta currently have a bug in their CUDA implementation that prevents them from (reliably) detecting the second GPU in each card. A very few people have gotten it to work with 3 or 4 GPUs, but most have not. nVidia is aware of the problem and will have a fix out one day. Knowing that, the decision is in your hands.

Jason
Atlas Folder
 
Posts: 115
Joined: Sun Feb 01, 2009 11:59 pm
Location: Broken Arrow, Oklahoma

Re: 2 GTX295's installed, UNSTABLE_MACHINE issues

Post by Rambler358 » Thu Feb 26, 2009 6:26 am

If it's strictly a folding machine for the 295s, then go with XP to get 4 GPU clients running on it.
Rambler358
 
Posts: 63
Joined: Fri Sep 19, 2008 5:40 pm

Re: 2 GTX295's installed, UNSTABLE_MACHINE issues

Post by Governator » Thu Feb 26, 2009 7:19 pm

Well, I now have all 4 GPUs running on XP, but it's not totally without issues, though most of the time it's error free for GPUs 1-3. Now I seem to get these quite often on my main chip, gpu0:

[16:52:55] NANs detected on GPU
[16:52:55]
[16:52:55] Folding@home Core Shutdown: UNSTABLE_MACHINE
[16:52:59] CoreStatus = 7A (122)
[16:52:59] Sending work to server
[16:52:59] Project: 5770 (Run 13, Clone 70, Gen 240)
[16:52:59] - Read packet limit of 540015616... Set to 524286976.
[16:52:59] - Error: Could not get length of results file work/wuresults_02.dat
[16:52:59] - Error: Could not read unit 02 file. Removing from queue.
.
.
.
[16:53:58] mdrun_gpu returned
[16:53:58] Self-test failure
[16:53:58]
[16:53:58] Folding@home Core Shutdown: UNSTABLE_MACHINE
[16:54:01] CoreStatus = 7A (122)
[16:54:01] Sending work to server
[16:54:01] Project: 5756 (Run 12, Clone 168, Gen 125)
[16:54:01] - Read packet limit of 540015616... Set to 524286976.
[16:54:01] - Error: Could not get length of results file work/wuresults_05.dat
[16:54:01] - Error: Could not read unit 05 file. Removing from queue.
[16:54:01] EUE limit exceeded. Pausing 24 hours.

I'm guessing that much of this is heat related, and must say I'll have my AquaGrafx blocks in about 2 weeks all set up so hopefully I'll rid my machine of much of this BS.
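
To see whether the failures cluster on particular projects, the log excerpts above can be tallied with a short Python sketch. It assumes, based only on these excerpts, that the first "Project:" line after a "Core Shutdown" line names the unit that failed. The sample text is copied from the log lines in this post and the function name is made up for illustration.

```python
import re
from collections import Counter

# Sample lines copied from the excerpts in this post.
LOG = """\
[16:52:55] NANs detected on GPU
[16:52:55] Folding@home Core Shutdown: UNSTABLE_MACHINE
[16:52:59] Project: 5770 (Run 13, Clone 70, Gen 240)
[16:53:58] Self-test failure
[16:53:58] Folding@home Core Shutdown: UNSTABLE_MACHINE
[16:54:01] Project: 5756 (Run 12, Clone 168, Gen 125)
[16:54:01] EUE limit exceeded. Pausing 24 hours.
"""

SHUTDOWN = re.compile(r"Core Shutdown: (\w+)")
PROJECT = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")

def failed_units(log_text):
    """Pair each core shutdown with the next 'Project:' line that follows it."""
    failures = []
    pending = None  # shutdown reason still waiting for its project line
    for line in log_text.splitlines():
        shutdown = SHUTDOWN.search(line)
        if shutdown:
            pending = shutdown.group(1)
            continue
        project = PROJECT.search(line)
        if project and pending:
            failures.append((pending, *map(int, project.groups())))
            pending = None
    return failures

failures = failed_units(LOG)
print(Counter(f[0] for f in failures))  # shutdowns per reason
print(Counter(f[1] for f in failures))  # shutdowns per project number
```

Run against a full FAHlog from each GPU client, a tally like this would show whether gpu0 fails on every project or only on some, which helps separate a heat problem from a problem with specific work units.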
Governator
 
Posts: 49
Joined: Tue Feb 17, 2009 6:58 pm
Location: Rapid City, SD

Re: 2 GTX295's installed, UNSTABLE_MACHINE issues

Post by Rambler358 » Thu Feb 26, 2009 7:31 pm

What temps were your 295's GPUs getting under folding? If they're higher than 85C you might run into problems.
Rambler358
 
Posts: 63
Joined: Fri Sep 19, 2008 5:40 pm

Re: 2 GTX295's installed, UNSTABLE_MACHINE issues

Post by CPUacreage » Sun Mar 01, 2009 3:12 am

Atlas Folder wrote:Vista x32, Vista x64 and Windows 7 Beta currently have a bug in their CUDA implementation that prevents them from (reliably) detecting the second GPU in each card. A very few people have gotten it to work with 3 or 4 GPUs, but most have not. nVidia is aware of the problem and will have a fix out one day. Knowing that, the decision is in your hands.


After some careful thought I'll be putting the two 295s in separate folding boxes. I'll combine my 2 9800 GTX+ into one box and RMA an overheating Radeon 4850 to make room for the 295s. Part of my reasoning is the power supply issues; part is OS related.

For now I will have:
700W XP Q6600 with 2xHD3870s
750W XP64 E6600 with GTX295
750W Vista 32 Q6600 with 2x9800 GTX+
650W Vista 32 Xeon 3060 [=E6600] with GTX295
650W XP E6600 with HD3870
650W XP64 E6600 7600GT [no GPU folding]

Motley, eh? I have retired all GPU1 graphics cards (3x19xx) and one HD2900. :?

I still have room for growth within the confines of the six boxes with PCIe slots. No need nor money for Jaak's or Jason's setups, but I may get bit by the power supplies yet, but that would be a minor cost by comparison.

Denny, folding with the misnomer, CPUacreage. I don't have a farm, but rather a small acreage of GPUs.
CPUacreage
 
Posts: 71
Joined: Sun Dec 02, 2007 8:58 pm
