Core 17 and temp control

Moderators: Site Moderators, FAHC Science Team

ChristianVirtual
Posts: 1596
Joined: Tue May 28, 2013 12:14 pm
Location: Tokyo

Core 17 and temp control

Post by ChristianVirtual »

First of all: congratulations to the team on releasing Core 17, a truly remarkable milestone, as it enables more computing power for the project. Very well done.

(http://folding.stanford.edu/home/change ... -full-fah/)


Now, in the release notes I read about temp control, and I wonder if there is also a way to actually read the current temperature of the GPU via the remote interface. I would really love to have that as a data point in slot info and be able to present it in all kinds of front ends. As the core reads the value and has some cool-down control built in, I think the effort to add the actual temp would be marginal, but it would offer valuable information to the end user.
Please contribute your logs to http://ppd.fahmm.net
7im
Posts: 10189
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona

Re: Core 17 and temp control

Post by 7im »

I see it as progress, but some might say just the opposite. With cores 11, 15, and 16 going end of life, there are lots of pre-Fermi GPUs that will eventually go idle, or move on to other projects. A lot of older AMDs already have. Not sure how many of those are still around, but it's a counterweight to balance against the Core 17 announcement.

Start with: nvidia-smi.exe

But all the OC tools for GPUs already display the temp in their tool tray, and have built-in alerts, fan ramping, etc. So for FAH, it's a duplication of effort, and a bell/whistle not really needed. But if you're into building such things, check out NV's tools. Note the announcement about temps is for NV only. Not sure how AMD does it. You'd have to add code for each.
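For anyone who does want to build on nvidia-smi, here is a minimal Python sketch of reading per-GPU temperatures. It assumes nvidia-smi is on the PATH and that the installed driver supports the `--query-gpu` and `--format` options; the function names `parse_temps` and `read_gpu_temps` are made up for this illustration.

```python
import subprocess

# Ask nvidia-smi for "index, temperature" pairs as bare CSV.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,temperature.gpu",
    "--format=csv,noheader,nounits",
]


def parse_temps(csv_text):
    """Turn lines such as '0, 61' into {0: 61}; unreadable lines are skipped."""
    temps = {}
    for line in csv_text.splitlines():
        try:
            index, temp = (field.strip() for field in line.split(",", 1))
            temps[int(index)] = int(temp)
        except ValueError:
            # Blank line, or a value that is not a number (sensor not reported).
            continue
    return temps


def read_gpu_temps():
    """Run nvidia-smi once and return {gpu_index: temperature_in_C}."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_temps(result.stdout)
```

Calling `read_gpu_temps()` on a machine with NVIDIA GPUs should give one entry per card. Keeping the parsing separate from the subprocess call makes it easy to test without a GPU.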

IMO, the temp code in FAH for NV is a last line of defense, not a front-line tool. I want my fans ramping up way before I consider shutting down the GPU via FAH. I also want downclocking before idling the GPU for a minimum of 15 minutes. And it only works in single-GPU systems at this time, so we dual-GPU users, who have more heat concerns, can't use it.
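The ordering described here (fans first, then downclocking, and pausing the GPU only as a last resort) can be sketched as a simple escalation policy. The thresholds below are hypothetical numbers picked for illustration, not values taken from FAH or from any vendor tool; only the 15-minute pause comes from the post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # All three limits are hypothetical examples, in degrees Celsius.
    fan_ramp_c: int = 65
    downclock_c: int = 80
    pause_c: int = 90


# The 15-minute minimum idle period mentioned in the post.
PAUSE_SECONDS = 15 * 60


def choose_action(temp_c, limits=Thresholds()):
    """Return the mildest action that still covers the given temperature."""
    if temp_c >= limits.pause_c:
        return "pause"       # last line of defense
    if temp_c >= limits.downclock_c:
        return "downclock"   # second line
    if temp_c >= limits.fan_ramp_c:
        return "ramp_fans"   # front line
    return "ok"
```

With these example limits, `choose_action(70)` asks for more fan speed long before the pause level is reached, which is the behavior argued for above.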

It's one more wrench in an already well stocked tool bucket. ;) Some may find it useful, others not.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
ChristianVirtual

Re: Core 17 and temp control

Post by ChristianVirtual »

Sure, I use nvidia-smi on the console with a refresh loop every other second, and also via my Zabbix monitoring tool for graphing over the day. Works OK.
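A refresh loop like that, feeding a graphing tool, might look roughly like the following Python sketch. It assumes some callable `read_temps` that returns `{gpu_index: temp_c}` (for example a wrapper around nvidia-smi); that callable, the function names, and the CSV layout are all hypothetical choices for this example.

```python
import time


def sample_temps(read_temps, count, interval_s=2.0,
                 clock=time.time, sleep=time.sleep):
    """Collect `count` rounds of samples as (timestamp, gpu_index, temp_c) rows."""
    rows = []
    for i in range(count):
        now = clock()
        for gpu_index, temp_c in sorted(read_temps().items()):
            rows.append((now, gpu_index, temp_c))
        if i < count - 1:
            sleep(interval_s)  # wait between rounds, but not after the last one
    return rows


def to_csv(rows):
    """Render the rows as CSV text that a graphing tool could import."""
    lines = ["timestamp,gpu,temp_c"]
    lines += [f"{ts:.0f},{gpu},{temp}" for ts, gpu, temp in rows]
    return "\n".join(lines)
```

The `clock` and `sleep` parameters exist only so the loop can be exercised without actually waiting. For plain console use, `nvidia-smi -l 2` gives a similar two-second refresh without any script.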

But as the core has the value anyway, exposing it would streamline generalized monitoring tools, allowing a quick glance at temps whenever checking on folding progress. The effort of putting it into the slot info should be fairly small, and adding a temp graph over the TPF bars on my iPad would be interesting.
I would not go so far as to control the fan via the remote interface; that could cause trouble or damage if used wrongly, so keep that to the OC tools for the specialists/geeks.
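To make the request concrete, here is a small Python sketch of what a front end could do today: merge slot records with temperatures gathered separately. The slot dictionaries are a simplified stand-in for slot info, the `temperature` key is hypothetical (slot info does not provide it, which is the point of the request), and the slot-to-GPU mapping is assumed to be supplied by the user.

```python
def attach_temps(slots, temps, slot_to_gpu):
    """Return copies of the slot records with a 'temperature' entry added.

    slots:       list of dicts, each with at least an "id" key
    temps:       {gpu_index: temp_c}, gathered outside the client
    slot_to_gpu: {slot_id: gpu_index}, a user-supplied mapping (assumption)
    """
    merged = []
    for slot in slots:
        record = dict(slot)  # copy, so the caller's data stays untouched
        gpu_index = slot_to_gpu.get(slot["id"])
        # Hypothetical field; None when the slot has no known GPU or reading.
        record["temperature"] = temps.get(gpu_index)
        merged.append(record)
    return merged
```

If the core exported the value itself, the manual mapping step would go away, which is most of the benefit being asked for.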

I didn't catch the single-GPU part of the news! Too bad. I'm even a triple-GPU folder; talk about heat. But it's getting cold outside, so happy folding :mrgreen:
7im

Re: Core 17 and temp control

Post by 7im »

Even a small effort is a waste when it's a duplicated effort. Nice to have, not must have. And we have a lot of must have bugs to fix first.

It also has a very limited application value.
Only 1 GPU.
Must be NV GPU.
Doesn't work in all 3 OS types.

Tools they put in to the GUI should work for all types of GPUs, for multiple GPUs, and for all OS types. Until they can do that, it's a waste of time anyway. Someone should add this as a feature request ticket, but it's a long list.
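One way to picture the "all GPU types, multiple GPUs" requirement is a registry with one reader per vendor, each returning its own `{gpu_index: temp_c}` mapping. This Python sketch is purely illustrative: the class and method names are invented, and no actual AMD or NVIDIA reader is implemented here, since each would need its own vendor-specific code.

```python
class TemperatureRegistry:
    """Collects readings from per-vendor reader callables."""

    def __init__(self):
        self._readers = {}

    def register(self, vendor, reader):
        """Add a callable that returns {gpu_index: temp_c} for one vendor."""
        self._readers[vendor] = reader

    def read_all(self):
        """Return {(vendor, gpu_index): temp_c} across every registered vendor."""
        results = {}
        for vendor, reader in self._readers.items():
            for gpu_index, temp_c in reader().items():
                results[(vendor, gpu_index)] = temp_c
        return results
```

Keying results by vendor and index keeps mixed or multi-GPU systems unambiguous, but the hard part remains writing and maintaining a reader per vendor and per OS.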
art_l_j_PlanetAMD64
Posts: 472
Joined: Sun May 30, 2010 2:28 pm

Re: Core 17 and temp control

Post by art_l_j_PlanetAMD64 »

7im wrote:Even a small effort is a waste when it's a duplicated effort. Nice to have, not must have. And we have a lot of must have bugs to fix first.

It also has a very limited application value.
Only 1 GPU.
Must be NV GPU.
Doesn't work in all 3 OS types.

Tools they put in to the GUI should work for all types of GPUs, for multiple GPUs, and for all OS types. Until they can do that, it's a waste of time anyway. Someone should add this as a feature request ticket, but it's a long list.
Very well said!

I don't really understand the 'want' (it's not a 'need') to have so much data being monitored remotely, especially for the GPU core temperatures.

For the 'temperature control' of the 46 GPUs in "The Farm", I rely on 3 things:
  1. Each GPU's own built-in (by NVIDIA and/or the GPU maker) 'clock frequency control', which does the job for me. You can see "The Farm" at this link.
  2. EVGA Precision X (which works with all makes of NVIDIA GPUs), where I set each GPU's 'Fan Speed' control to 'Manual', and then set the Fan Speed to get the GPU temps I want (65C maximum). This usually ends up with a Manual Fan Speed of anywhere from 80% to 95%. In my experience, over many years with NVIDIA GPU types from 9500 GTs to GTX Titans, I have found that the 'Auto' fan speed control is universally poor, regardless of the 'make' of NVIDIA GPU.
  3. Good 'internal' and 'external' airflow control is essential, as is described at this link.
art_l_j_PlanetAMD64
Over 1.04 Billion Total Points
Over 185,000 Work Units
Over 3,800,000 PPD
Overall rank (if points are combined) 20 of 1721690
In memory of my Mother May 12th 1923 - February 10th 2012
ChristianVirtual

Re: Core 17 and temp control

Post by ChristianVirtual »

On the other hand, this temp control function already made it into the core ... working only for NV cards, and even then only for single-GPU setups. :?: The work is 90% done. I'm just asking to expose this collected value to the outside world.

And as for monitoring of temps ... there are also less professional donors and setups out there. I would be one of those.
I use manual fan control on Linux and try to keep my three GPUs below 70C, but I have little control over ambient temps during the day. So it would still be great (yes, a "want") to get access to a data point that is already collected.