returning to folding, problems with 680?

bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: returning to folding, problems with 680?

Post by bruce »

The text string containing the GF114 information is strictly for human convenience. FAH only uses the numbers in GPUs.txt to the left of that information, so that, by itself, doesn't cause a folding problem.

Obviously, it's possible that those numbers are incorrect, that there's a bug in the drivers (even with WHQL certification), that a GROMACS bug is only exposed by certain projects, that specific projects stress the GPU more than others and lead to overclock instability, etc. Isolating the actual cause is virtually impossible until somebody reports back what they did to fix it, with enough system details for others with similar systems to try the same thing and confirm that the same fix worked for them.

We have had a rash of drivers from both ATI and NV that were less than perfect yet were WHQL, and the most common fix that has worked is to revert to older drivers (after running CCleaner to remove the vestiges of the newer driver). YMMV.

The FAH lab doesn't test all possible drivers on all possible GPUs on all possible versions of Windows (or WINE). The beta testing process exposes each project to a wider variety of combinations, but there will still be untested combinations, especially with new drivers. Advanced/advmethods testing exposes the projects to an even wider range of combinations, but there still may be inadequate testing on systems which EXACTLY match yours. That's a very good reason to avoid updating to the latest drivers unless you really need to, and a very good reason to avoid setting client-type to anything other than the default value.
Dark_n_Beyond
Posts: 9
Joined: Wed Nov 21, 2012 1:46 am

Re: returning to folding, problems with 680?

Post by Dark_n_Beyond »

Thanks for your quick reply, Bruce. That's good to know about the naming convention, so I can rule that out. As it is now, I've done a clean windows install, clean driver install with the drivers that were working for me up until a few weeks ago, no overclock, and I'm just plain out of ideas. I really think I'm just going to have to shut it down until I can find someone on any forum with a similar problem and a solution. Thing is, I feel really bad about what's happened with returning bad units. I'm willing to keep trying, but I don't know of any other way of testing different things without hurting the project.
mmonnin
Posts: 324
Joined: Wed Dec 05, 2007 1:27 am

Re: returning to folding, problems with 680?

Post by mmonnin »

FAH seems to push hardware more than benchmarks, so what may seem stable in a benchmark may fail a Work Unit. The unstable machine error is an indicator of an overclock gone too far or some other hardware problem.
codysluder
Posts: 1024
Joined: Sun Dec 02, 2007 12:43 pm

Re: returning to folding, problems with 680?

Post by codysluder »

mmonnin wrote:FAH seems to push hardware more than benchmarks, so what may seem stable in a benchmark may fail a Work Unit. The unstable machine error is an indicator of an overclock gone too far or some other hardware problem.
Absolutely. Do not assume that you can run F@H with an overclock that has been established based on some other program calling it stable. In fact, if F@H is just barely stable with one project, that doesn't guarantee it will be stable with another project.
ford316
Posts: 26
Joined: Sun Dec 30, 2012 7:44 am
Hardware configuration: Main Board: MSI 970A-G46 Processor: AMD - FX 8350 eight core OC 4.51GHz Ram: 16GB PC3-10700
GPU: Nvidia Geforce GTX 650 Ti 2GB GPU clock 1033MHz Memory clock 1350MHz Windows 7 ultimate x64 SP1 Build 7601 client v7.2.9 "will overclock more later on"
Location: Earth

Re: returning to folding, problems with 680?

Post by ford316 »

codysluder wrote:
mmonnin wrote:FAH seems to push hardware more than benchmarks, so what may seem stable in a benchmark may fail a Work Unit. The unstable machine error is an indicator of an overclock gone too far or some other hardware problem.
Absolutely. Do not assume that you can run F@H with an overclock that has been established based on some other program calling it stable. In fact, if F@H is just barely stable with one project, that doesn't guarantee it will be stable with another project.

This program IS my benchmark tool, precisely because it pushes far beyond what other programs do. LOL :D Also, I had a problem using auto volts in the BIOS: Windows kept freezing up after 8 hours or so, so I changed the settings and upped the volts. Everything has been perfect for the last 20 or so hours, with no freezing. By far this would be the best benchmark tool, and the best way to check whether a computer is stable.
Dark_n_Beyond
Posts: 9
Joined: Wed Nov 21, 2012 1:46 am

Re: returning to folding, problems with 680?

Post by Dark_n_Beyond »

mmonnin wrote:FAH seems to push hardware more than benchmarks, so what may seem stable in a benchmark may fail a Work Unit. The unstable machine error is an indicator of an overclock gone too far or some other hardware problem.
I agree with you that FAH probably pushes the hardware more than benchmarks. If you go back about 4 posts, you will see that I stated my folding problems are with NO overclock. I certainly wouldn't use a benchmark as an indication of stability, and unless I am actually benchmarking the card, I have no need for any overclock and don't.

Some other hardware problem is a possibility. I had updated the motherboard BIOS about the time this problem started; perhaps something changed, and not for the better? It should be easy enough to reflash if I can figure out what the original BIOS version was. The power supply is another possibility, and I'll be testing that tonight. Memory passes memtest fine, and I even tried a different set. Prime95 runs 24 hours without issue. A corrupted BIOS in the card is possible, but there are 3 separate ones, and none work. It could be that the card is actually bad, but no other test I can find has any kind of issue. I was getting nvlddmkm errors in the event logs with the newest driver, but since going back to 306.97 they have stopped. The card is watercooled, and maxes out at about 45C. I still get the following:

02:47:46:WU02:FS01:0x15:Run: exception thrown in GuardedRun -- cannot continue further.
02:47:46:WU02:FS01:0x15:Going to send back what have done -- stepsTotalG=40000000
02:47:46:WU02:FS01:0x15:Work fraction=0.2828 steps=40000000.
02:47:50:WU02:FS01:0x15:logfile size=19168 infoLength=19168 edr=0 trr=23
02:47:50:WU02:FS01:0x15:+ Opened results file
02:47:50:WU02:FS01:0x15:- Writing 19704 bytes of core data to disk...
02:47:50:WU02:FS01:0x15:Done: 19192 -> 5380 (compressed to 28.0 percent)
02:47:50:WU02:FS01:0x15: ... Done.
02:47:50:WU02:FS01:0x15:DeleteFrameFiles: successfully deleted file=02/wudata_01.ckp
02:47:50:WU02:FS01:0x15:
02:47:50:WU02:FS01:0x15:Folding@home Core Shutdown: UNSTABLE_MACHINE
02:47:51:WARNING:WU02:FS01:FahCore returned: UNSTABLE_MACHINE (122 = 0x7a)
02:47:51:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:7623 run:417 clone:4 gen:17 core:0x15 unit:0x00000013664f2dd14fe4facc3cec5a6d
02:47:51:WU02:FS01:Uploading 5.75KiB to 171.64.65.105
02:47:51:WU02:FS01:Connecting to 171.64.65.105:8080
02:47:51:WU02:FS01:Upload complete
02:47:51:WU02:FS01:Server responded WORK_ACK (400)
02:47:51:WU02:FS01:Cleaning up

I really didn't mean to hijack this thread, but if someone knows of some test I could run that would be comparable to folding, please let me know. Until I can figure this out, I've taken the GPU offline.
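When comparing failures across projects or driver versions, it can help to pull the relevant fields out of the log rather than reading it by eye. The following is only a rough sketch: the regular expressions are inferred from the three sample lines quoted from the log above and are not an official description of the client's log format, and the function name `summarize_results` is made up for illustration.

```python
import re

# Sample lines copied from the log excerpt above.
LOG = """\
02:47:50:WU02:FS01:0x15:Folding@home Core Shutdown: UNSTABLE_MACHINE
02:47:51:WARNING:WU02:FS01:FahCore returned: UNSTABLE_MACHINE (122 = 0x7a)
02:47:51:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:7623 run:417 clone:4 gen:17 core:0x15 unit:0x00000013664f2dd14fe4facc3cec5a6d
"""

# Patterns are assumptions inferred from the sample lines only.
RETURNED = re.compile(
    r"(WU\d+):(FS\d+):FahCore returned: (\w+) \((\d+) = 0x[0-9a-fA-F]+\)"
)
SENDING = re.compile(
    r"(WU\d+):(FS\d+):Sending unit results:.*?error:(\w+) "
    r"project:(\d+) run:(\d+) clone:(\d+) gen:(\d+)"
)


def summarize_results(log_text):
    """Return one dict per 'Sending unit results' line in log_text."""
    core_status = {}  # (work unit, slot) -> (status name, numeric code)
    results = []
    for line in log_text.splitlines():
        match = RETURNED.search(line)
        if match:
            wu, slot, name, code = match.groups()
            core_status[(wu, slot)] = (name, int(code))
            continue
        match = SENDING.search(line)
        if match:
            wu, slot, error, project, run, clone, gen = match.groups()
            name, code = core_status.get((wu, slot), (None, None))
            results.append({
                "slot": slot,
                "error": error,
                "core_status": name,
                "core_code": code,
                "project": int(project),
                "run": int(run),
                "clone": int(clone),
                "gen": int(gen),
            })
    return results


for entry in summarize_results(LOG):
    print(entry)
```

Running the same function over a full log file would give a list of project/run/clone/gen values with their status, which makes it easier to see whether failures cluster on particular projects.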
bollix47
Posts: 2941
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: returning to folding, problems with 680?

Post by bollix47 »

For the CPU there's StressCPU v2. Also listed here.
ford316
Posts: 26
Joined: Sun Dec 30, 2012 7:44 am
Hardware configuration: Main Board: MSI 970A-G46 Processor: AMD - FX 8350 eight core OC 4.51GHz Ram: 16GB PC3-10700
GPU: Nvidia Geforce GTX 650 Ti 2GB GPU clock 1033MHz Memory clock 1350MHz Windows 7 ultimate x64 SP1 Build 7601 client v7.2.9 "will overclock more later on"
Location: Earth

Re: returning to folding, problems with 680?

Post by ford316 »

Dark_n_Beyond wrote:
mmonnin wrote:FAH seems to push hardware more than benchmarks, so what may seem stable in a benchmark may fail a Work Unit. The unstable machine error is an indicator of an overclock gone too far or some other hardware problem.
Card is watercooled, and maxes out at about 45C
GPU card watercooled and temp is 45C... I don't have a watercooled GPU and my GPU temp is 45C too. My processor is watercooled and it runs at about 38.8C on the low fan setting. Dark, you didn't make a backup of the BIOS before updating? That is the first thing to do before updating a BIOS. Look at the BIOS screen when the computer starts and see what version you are running now, then check the motherboard's home page and see if they have one that is older than what you have now. If not, do a Google search for it. Good luck.
Dark_n_Beyond
Posts: 9
Joined: Wed Nov 21, 2012 1:46 am

Re: returning to folding, problems with 680?

Post by Dark_n_Beyond »

Yes, 45C. I can run it cooler, but I keep the fans turned down because I don't like to listen to them. Got the BIOS flashed back to what the motherboard shipped with. For whatever reason, I started getting the feeling it may be the PCIe slot, so I moved the card to another. Doing some searching, that's probably unlikely, but possible. The PSU checks out with a multimeter, which I figured it would. Ran MemtestG80 a few times with no errors, although I haven't figured out how to run more than 50 iterations. I'd like to run many more. I'll load up the CPU stress test and let that run.
ford316
Posts: 26
Joined: Sun Dec 30, 2012 7:44 am
Hardware configuration: Main Board: MSI 970A-G46 Processor: AMD - FX 8350 eight core OC 4.51GHz Ram: 16GB PC3-10700
GPU: Nvidia Geforce GTX 650 Ti 2GB GPU clock 1033MHz Memory clock 1350MHz Windows 7 ultimate x64 SP1 Build 7601 client v7.2.9 "will overclock more later on"
Location: Earth

Re: returning to folding, problems with 680?

Post by ford316 »

Nvidia just released 310.90 today for the 680 cards... it might be worth checking out, or maybe not; who knows until someone tries it and posts about it. Sent ya a PM, Dark; try what I listed and post back here. I will check back later on today or in the next few days. I am still working on the problem for ya, so don't give up... Also, Dark, which version of Windows are you running?
Dark_n_Beyond
Posts: 9
Joined: Wed Nov 21, 2012 1:46 am

Re: returning to folding, problems with 680?

Post by Dark_n_Beyond »

ford316 wrote:Nvidia just released 310.90 today for the 680 cards... it might be worth checking out, or maybe not; who knows until someone tries it and posts about it. Sent ya a PM, Dark; try what I listed and post back here. I will check back later on today or in the next few days. I am still working on the problem for ya, so don't give up... Also, Dark, which version of Windows are you running?
I don't think new drivers are a good idea at this point; I'll stick with what worked for a couple of months without issue. Running Windows 7 Ultimate 64.

After "downgrading" the bios and moving the card to another slot, spent a few hours and ran all the tests I could think of, and found no issue. Crossed my fingers and prayed, added the gpu slot, and made it thru the first work unit (7626 109,0,53). That's further than I've got in 3 weeks, and project 7626 was one that I had problems with, so hopefully this is progress. Now running 7624 78,4,4 and so far so good.

Only other thing of note, and whether it's really significant or not I'm not sure (being based on limited projects), is that the gpu is now folding at 42279 ppd, versus right around 43.5k when I was having issues. I'm also not experiencing any of the video lag and choppiness I was before (and has been talked about at length in other places) while doing other tasks.
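To put the PPD difference in proportion, here is a quick calculation. It is only a sketch: the earlier figure is taken as exactly 43,500 even though the post says "right around 43.5k", so the result is approximate.

```python
ppd_before = 43_500  # assumption: "right around 43.5k" treated as exact
ppd_after = 42_279   # figure quoted in the post

drop = ppd_before - ppd_after
drop_percent = 100 * drop / ppd_before

print(f"PPD drop: {drop} ({drop_percent:.1f}%)")  # PPD drop: 1221 (2.8%)
```

A difference of under 3 percent, measured over only a couple of projects, could plausibly come from normal variation between work units rather than from the hardware changes.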
ford316
Posts: 26
Joined: Sun Dec 30, 2012 7:44 am
Hardware configuration: Main Board: MSI 970A-G46 Processor: AMD - FX 8350 eight core OC 4.51GHz Ram: 16GB PC3-10700
GPU: Nvidia Geforce GTX 650 Ti 2GB GPU clock 1033MHz Memory clock 1350MHz Windows 7 ultimate x64 SP1 Build 7601 client v7.2.9 "will overclock more later on"
Location: Earth

Re: returning to folding, problems with 680?

Post by ford316 »

Dark_n_Beyond wrote:After "downgrading" the bios and moving the card to another slot, spent a few hours and ran all the tests I could think of, and found no issue. Crossed my fingers and prayed, added the gpu slot, and made it thru the first work unit (7626 109,0,53). That's further than I've got in 3 weeks, and project 7626 was one that I had problems with, so hopefully this is progress. Now running 7624 78,4,4 and so far so good.

Only other thing of note, and whether it's really significant or not I'm not sure (being based on limited projects), is that the gpu is now folding at 42279 ppd, versus right around 43.5k when I was having issues. I'm also not experiencing any of the video lag and choppiness I was before (and has been talked about at length in other places) while doing other tasks.

That reminds me: a few weeks back, when I upgraded my BIOS, I had to go in and reset all the settings because everything went to default, so I lost my overclock, my boot configuration, and many other settings. So it could be that the PCI slot went bad, or the BIOS upgrade changed things, or maybe some other program ya picked up along the way messed it up in one way or another. At least you got it working now. I am still researching it; since I work on and build computers, it's one of those little problems that isn't fully solved yet, and it's a challenge. :lol:
Dark_n_Beyond
Posts: 9
Joined: Wed Nov 21, 2012 1:46 am

Re: returning to folding, problems with 680?

Post by Dark_n_Beyond »

I've made it through 4 work units as of now, without a single error:
7626 109,0,53
7624 78,4,4
7623 594,6,13
7623 48,6,16
The last 2 ran with the most current BIOS. My suspicion is that it has something to do with the PCIE slot; whether the slot is bad, has something in it, or something else will take more investigation than I have time for at the moment. I never would have thought of that if it wasn't for finding some posts from 2-3 years ago about changing PCIE slots (I think the reasoning was different then, but I didn't have anything to lose). I'll obviously keep a close eye on it, but unless things get all flaky again, I'll consider this solved.
Thanks bollix47 for pointing out the stress tests, and ford316 for your help!