Experimental core to reduce overheating for large protein WU

Postby friedrim » Tue Jun 16, 2009 9:26 pm

A new, experimental core, FahCore_11.exe, has been posted at http://www.stanford.edu/~friedrim. This core lets you specify the percentage of time the program will idle by setting the environment variable 'FAH_GPU_IDLE'. For example, if FAH_GPU_IDLE=10, the program will idle for approximately 10% of the time. Idle times greater than 0 should reduce the overheating problems some boards have with work units that model larger proteins. On the downside, large idle percentages will reduce PPD.
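
To make the idle percentage concrete, here is a minimal sketch of duty-cycle throttling in Python. It assumes the core simply sleeps between chunks of work so that sleeping makes up the requested share of total time. That is only an assumption for illustration, not the actual FahCore_11 implementation, and the function name `idle_seconds` is hypothetical.

```python
def idle_seconds(busy_seconds: float, idle_percent: float) -> float:
    """Sleep time needed after `busy_seconds` of work so that idling
    makes up `idle_percent` of the total (busy + idle) time."""
    if not 0 <= idle_percent < 100:
        raise ValueError("idle_percent must be in the range [0, 100)")
    idle_fraction = idle_percent / 100.0
    # idle / (busy + idle) = f   =>   idle = busy * f / (1 - f)
    return busy_seconds * idle_fraction / (1.0 - idle_fraction)


if __name__ == "__main__":
    # Illustrative values only; 9 seconds of work per chunk is made up.
    for pct in (0, 10, 20, 50):
        print(f"FAH_GPU_IDLE={pct}: sleep {idle_seconds(9.0, pct):.2f}s per 9.00s of work")
```

Under this simple model, a setting of 10 means roughly 1 second of idling for every 9 seconds of work.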

If you wish to test this core, stop your GPU client, rename your existing FahCore_11.exe to something else (as a backup), and download the new core into the same directory before restarting the client. This version also includes a fix that changes the way checkpoints are done, so if you swap cores in the middle of a WU, expect that WU to restart from the beginning. To avoid losing work, install the new core right AFTER a WU finishes. (The same is true if you decide to switch back to the standard core.)
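
The backup-and-replace step can be sketched in Python as follows. This is only an illustration of the manual procedure: the function name `swap_core`, the `.bak` suffix and the example paths are assumptions, and it must only be run while the client is stopped.

```python
import shutil
from pathlib import Path


def swap_core(client_dir: Path, new_core: Path, core_name: str = "FahCore_11.exe") -> Path:
    """Rename the existing core in `client_dir` to a backup, then copy
    `new_core` into its place. Returns the backup path."""
    current = client_dir / core_name
    backup = client_dir / (core_name + ".bak")  # hypothetical backup name
    if backup.exists():
        # Refuse to overwrite an earlier backup of the standard core.
        raise FileExistsError(f"backup already exists: {backup}")
    if current.exists():
        current.rename(backup)
    shutil.copy2(new_core, current)
    return backup


# Example call (hypothetical paths, not from the post):
#   swap_core(Path(r"C:\FAH\GPU"), Path(r"C:\Downloads\FahCore_11.exe"))
```

To switch back to the standard core, stop the client after a WU finishes, delete the experimental core and rename the backup to FahCore_11.exe again.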

You can set environment variables via Control Panel -> System Properties -> Advanced -> Environment Variables, or
right-click 'My Computer' and choose Properties -> Advanced -> Environment Variables.

Environment variables are case-sensitive, so be sure to specify FAH_GPU_IDLE=10; fah_gpu_idle=10 will be ignored.
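
As an alternative to the Control Panel, the variable could be set only for the client process by a small launcher script. The Python sketch below builds such an environment. The helper name `env_with_gpu_idle`, the 0 to 100 range check and the client path in the comment are assumptions for illustration, not documented behavior of the core.

```python
import os


def env_with_gpu_idle(idle_percent: int, base_env=None) -> dict:
    """Return a copy of the environment with FAH_GPU_IDLE set.
    The name is written in upper case, as the post requires."""
    if not 0 <= idle_percent <= 100:  # assumed range for a percentage
        raise ValueError("idle_percent must be between 0 and 100")
    env = dict(os.environ if base_env is None else base_env)
    env["FAH_GPU_IDLE"] = str(idle_percent)
    return env


# Example launch (hypothetical client path, not from the post):
#   import subprocess
#   subprocess.Popen([r"C:\FAH\GPU\client.exe"], env=env_with_gpu_idle(10))
```

Setting the variable this way affects only the process started by the script, so other programs are left untouched.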
friedrim
Pande Group Member
 
Posts: 93
Joined: Wed Apr 02, 2008 5:25 pm

Re: Experimental core to reduce overheating for large protein WU

Postby MichaelB » Tue Jun 16, 2009 11:02 pm

Great news: no third-party software needed to control the heat generated in Core 11 WUs. Thanks a bunch.
MichaelB
 
Posts: 64
Joined: Sun Dec 02, 2007 6:08 am

Re: Experimental core to reduce overheating for large protein WU

Postby shatteredsilicon » Tue Jun 16, 2009 11:12 pm

Great - so now we can over-idle over-clocked hardware to stop it from overheating. Empirical proof that enough whining can achieve anything?
1x Q6600 @ 3.2GHz, 4GB DDR3-1333
1x Phenom X4 9950 @ 2.6GHz, 4GB DDR2-1066
3x GeForce 9800GX2
1x GeForce 8800GT
CentOS 5 x86-64, WINE 1.x with CUDA wrappers
shatteredsilicon
 
Posts: 699
Joined: Tue Jul 08, 2008 2:27 pm

Re: Experimental core to reduce overheating for large protein WU

Postby ZBoater » Wed Jun 17, 2009 1:49 am

Hi, I just tried the new experimental core. Forgive the noobness, but the instructions should read "If you wish to test this core, stop your GPU client after it's done with its current assignment, rename FahCore...". If not, you get this:

[01:29:28] Completed 95%
[01:31:40] Completed 96%
[01:33:53] Completed 97% :roll:

Folding@Home Client Shutdown.

(after restart)

[01:35:03] fcCheckPointResume: file hashes different -- aborting.
[01:35:03] mdrun_gpu returned
[01:35:03] Checkpoint failure. :eo
[01:35:03] DeleteFrameFiles: successfully deleted file=work/wudata_08.ckp
[01:35:03] DynamicWrapper: Frame files deleted for proj=work/wudata_08 -- requesting restart from client

And from what I can tell, it is running hot, as usual... :mrgreen:
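
For anyone who wants to check their own log for the same problem, here is a minimal Python sketch that picks out the checkpoint-failure lines. The two marker strings are copied from the log excerpt above; the function name is hypothetical, and other core versions may word these messages differently.

```python
FAILURE_MARKERS = (
    "fcCheckPointResume: file hashes different",
    "Checkpoint failure",
)


def find_checkpoint_failures(log_text: str) -> list:
    """Return the log lines that contain one of the failure markers."""
    return [
        line
        for line in log_text.splitlines()
        if any(marker in line for marker in FAILURE_MARKERS)
    ]
```

An empty result means neither message appears in the text that was passed in.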
ZBoater
 
Posts: 4
Joined: Wed Jun 17, 2009 1:36 am

Re: Experimental core to reduce overheating for large protein WU

Postby shdbcamping » Wed Jun 17, 2009 4:29 am

@friedrim,
Thanks for the fast work. I will play with it in the AM and PM you with any issues and/or feedback. I'm at work on 3rd shift or I'd be on it now :wink: .
Sean
shdbcamping
 
Posts: 587
Joined: Mon Nov 10, 2008 7:57 am

Re: Experimental core to reduce overheating for large protein WU

Postby Leonardo » Wed Jun 17, 2009 6:55 am

Alright, I must now also chime in.
Try folding multiple 9800GX2s in a single enclosure! I am very experienced in modifying computer cases and advanced air cooling, including custom duct work, but still, those darn GPUs get HOTT with 57XX projects, especially now that summer is here. I run a total of 12 GPUs in Folding, some of which are highly heat sensitive and some of which are not. (Not all GPUs, even of the same specification, are created equal.) Some of the GPUs complete projects perfectly at overclocked settings, but some must be underclocked to reliably process 57XX units.

In my opinion, the "environment variable" option was an excellent solution to what many have been clamoring for.
Leonardo
 
Posts: 600
Joined: Tue Dec 04, 2007 5:09 am
Location: Eagle River, Alaska

Re: Experimental core to reduce overheating for large protein WU

Postby shatteredsilicon » Wed Jun 17, 2009 7:03 am

Leonardo wrote:Try folding multiple 9800GX2s in a single enclosure!


I did for about 6 months. Two 9800GX2s was just fine, closed case. Three gets too hot, but gets back under control with the case side removed. Three wouldn't be a problem at all with one of those cases that have a 220mm fan in the side, but that's not what I have.
shatteredsilicon
 
Posts: 699
Joined: Tue Jul 08, 2008 2:27 pm

Re: Experimental core to reduce overheating for large protein WU

Postby 7im » Wed Jun 17, 2009 7:08 am

ZBoater wrote:...
And from what I can tell, it is running hot, as usual... :mrgreen:



Agreed. PG should edit the first post to correct that mistake. You almost never want to change fahcores in the middle of a work unit.

Also, environment variables are case sensitive. Try in all caps. And then, if still hot, increase the idle to 20%. Let us know what happens.
Please do not mistake my brevity as dispassion or condescension. I recognize the time you spend reading the forum is time you could use elsewhere, so my short responses save you time. Please do not hesitate to ask for clarification if I was too terse.
7im
 
Posts: 13323
Joined: Thu Nov 29, 2007 4:30 pm
Location: Arizona

Experimental core to reduce overheating for large protein WU

Postby shdbcamping » Wed Jun 17, 2009 7:48 am

shatteredsilicon wrote:
Leonardo wrote:Try folding multiple 9800GX2s in a single enclosure!


I did for about 6 months. Two 9800GX2s was just fine, closed case. Three gets too hot, but gets back under control with the case side removed. Three wouldn't be a problem at all with one of those cases that have a 220mm fan in the side, but that's not what I have.

As I have posted before, I have a HAF932 case with 3x GX2s (and the Big Side Fan) in it, and the WUs are still too hot.
shdbcamping
 
Posts: 587
Joined: Mon Nov 10, 2008 7:57 am

Re: Experimental core to reduce overheating for large protein WU

Postby bollix47 » Wed Jun 17, 2009 8:47 am

FYI

My 8800GT @ stock running via wine in Ubuntu was getting temperatures in the high 90s (97 before trying this, but I have seen it over 100). I tried setting the variable to 10, but the temperature only dropped about 1 degree, so I increased it to 50. Temperatures are now in the low 90s. This resulted in an expected drop in PPD of around 400 on a 511-point WU (from ~3000 to ~2600). I haven't seen the effect on other core 11 WUs, but I suspect the drop will be a bit more, as all other WUs have been running 1000-2000 more PPD than the 511 WUs.

Also, I too had the WU reset its progress to 0% after restarting with the new core. Fortunately, the progress was low when I made the change so not too much time was wasted. I have since restarted the client for other reasons and the WU carried on normally from where it left off before the restart. :wink:
bollix47
Site Moderator
 
Posts: 2819
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: Experimental core to reduce overheating for large protein WU

Postby susato » Wed Jun 17, 2009 12:27 pm

Interesting that a 50% setback is dropping the PPD by only 15%. Thanks for reporting that this works with GPU-on-WINE.
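
For reference, the arithmetic behind that comparison can be worked out in a few lines of Python, using bollix47's rounded figures (~3000 PPD before, ~2600 PPD after, with the idle setting at 50). The helper name is made up for illustration.

```python
def ppd_drop_percent(before: float, after: float) -> float:
    """Relative drop in PPD, as a percentage of the original value."""
    return (before - after) / before * 100.0


drop = ppd_drop_percent(3000, 2600)
print(f"PPD drop: {drop:.1f}%")  # prints "PPD drop: 13.3%"
```

Since the reported PPD values are approximate, the result is only a rough figure, but it is far below the 50% idle setting.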
susato
Site Moderator
 
Posts: 950
Joined: Fri Nov 30, 2007 4:57 am
Location: Team MacOSX

Re: Experimental core to reduce overheating for large protein WU

Postby shdbcamping » Wed Jun 17, 2009 1:47 pm

bollix47 wrote:FYI

My 8800GT @ stock running via wine in Ubuntu was getting temperatures in the high 90s (97 before trying this, but I have seen it over 100). I tried setting the variable to 10, but the temperature only dropped about 1 degree, so I increased it to 50. Temperatures are now in the low 90s. This resulted in an expected drop in PPD of around 400 on a 511-point WU (from ~3000 to ~2600). I haven't seen the effect on other core 11 WUs, but I suspect the drop will be a bit more, as all other WUs have been running 1000-2000 more PPD than the 511 WUs.

Also, I too had the WU reset its progress to 0% after restarting with the new core. Fortunately, the progress was low when I made the change so not too much time was wasted. I have since restarted the client for other reasons and the WU carried on normally from where it left off before the restart. :wink:

Getting geared up for the Beta now myself. If your results are typical with the 511-point WUs, it'll be better than I hoped. And better than any other alternative that I've been able to find :D . Keep up the testing and PM fried with any feedback.
shdbcamping
 
Posts: 587
Joined: Mon Nov 10, 2008 7:57 am

Re: Experimental core to reduce overheating for large protein WU

Postby 7im » Wed Jun 17, 2009 1:49 pm

bollix47 wrote:FYI

Also, I too had the WU reset its progress to 0% after restarting with the new core. Fortunately, the progress was low when I made the change so not too much time was wasted. I have since restarted the client for other reasons and the WU carried on normally from where it left off before the restart. :wink:


I asked the staff to fix the opening post.
7im
 
Posts: 13323
Joined: Thu Nov 29, 2007 4:30 pm
Location: Arizona

Re: Experimental core to reduce overheating for large protein WU

Postby planetclown » Wed Jun 17, 2009 2:33 pm

Say you use this experimental core with the setting FAH_GPU_IDLE = 10.

Isn't this the same as setting the "CPU Usage Percent" slider to 90% in the GPU Systray client, or "CPU usage requested" to 90 in the console version?

Or in other words, in the GPU clients won't the CPU usage variable affect the GPU instead of the CPU?

I remember trying this out a while back and it seemed to work that way.
planetclown
 
Posts: 51
Joined: Wed Feb 25, 2009 11:54 am

Re: Experimental core to reduce overheating for large protein WU

Postby bruce » Wed Jun 17, 2009 2:47 pm

planetclown wrote:Isn't this the same as setting the "CPU Usage Percent" slider to 90% in the GPU Systray client, or "CPU usage requested" to 90 in the console version?


They are similar, but not the same. A 10% reduction in CPU utilization will not result in a 10% reduction in GPU utilization. The relationship is NOT linear and there are good reasons to believe that reducing CPU utilization may result in different values of GPU utilization, depending on the size of the protein.

This has not been thoroughly tested. I encourage you to try both methods or a combination of the two and report your findings. I'm sure that the donors will find the best ways to meet the needs of their individual systems. Testing of the experimental core is expected to provide a much better understanding of such differences than can be determined in the lab.
bruce
Site Admin
 
Posts: 16867
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.
