Linux install guide

FAH provides a V7 client installer for Debian / Mint / Ubuntu / RedHat / CentOS / Fedora. Installation on other distros may or may not be easy, but if you can offer help to others, they would appreciate it.


ipkh
Posts: 175
Joined: Thu Jul 16, 2015 2:03 pm

Re: Linux install guide

Post by ipkh »

I run 2 GPUs per system.
I have no knowledge of multi-GPU overclocking on Linux other than messing with Coolbits. GreenWithEnvy doesn't handle more than 1 GPU.
Ultimately, overclocking and fan control on GPUs are not time- or cost-effective. Overclocking too much can cause WU failures, and increasing clocks dramatically enough to make a quantifiable difference in PPD risks WU failures and increased power draw out of proportion to the benefit.
MeeLee
Posts: 1375
Joined: Tue Feb 19, 2019 10:16 pm

Re: Linux install guide

Post by MeeLee »

ipkh wrote:I run 2 GPUs per system.
I have no knowledge of multi-GPU overclocking on Linux other than messing with Coolbits. GreenWithEnvy doesn't handle more than 1 GPU.
Ultimately, overclocking and fan control on GPUs are not time- or cost-effective. Overclocking too much can cause WU failures, and increasing clocks dramatically enough to make a quantifiable difference in PPD risks WU failures and increased power draw out of proportion to the benefit.
That's definitely not the case with Nvidia GPUs on Linux.
1- Overclocking doesn't affect power consumption.
2- Coolbits enables you to run more power-efficiently, dropping several tens of watts
3- Setting the fan curve to max often boosts output by several hundred thousand PPD vs running the stock fan curve.

There are ONLY benefits from correctly using Coolbits.
The only way you can break WUs is by using too high of an overclock.
All my RTX GPUs do +100 MHz on the GPU (most do +120 MHz) and +1400 MHz on the RAM.
And I haven't had a single bad WU in well over 6 months of running (as a result of a bad overclock).
Well, when I do push the curve, or the temps outside are past 90 °F, I occasionally hit a bad WU, but I'm already familiar with what is possible and apply a much milder OC nowadays. (+115 MHz only triggers bad WUs on my system when the weather is unusually hot outside, like over 100 °F.)
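
For anyone wanting to reproduce this kind of setup, the offsets and fan control described above are normally applied through nvidia-settings once Coolbits is enabled (e.g. `sudo nvidia-xconfig --cool-bits=28` followed by an X restart). Below is a minimal sketch, assuming GPU index 0, performance level 3 and a running X session; the exact attribute indices and safe offset values vary by driver version and card, so treat the numbers as placeholders rather than recommendations.

```python
#!/usr/bin/env python3
"""Apply clock offsets and a fixed fan speed through nvidia-settings.

Assumes Coolbits is already enabled in xorg.conf and an X session is running.
The GPU/fan index and the performance-level index in the attribute names
vary by driver and card.
"""
import subprocess

GPU = 0              # GPU index as listed by nvidia-settings / nvidia-smi (assumption)
PERF_LEVEL = 3       # highest performance level on most recent GeForce cards (assumption)
CORE_OFFSET = 100    # MHz core offset, the conservative value mentioned above
MEM_OFFSET = 1400    # MHz memory transfer-rate offset mentioned above
FAN_SPEED = 80       # percent, an illustrative fixed fan speed (assumption)

def set_attr(attr: str, value) -> None:
    """Set one nvidia-settings attribute, raising if the call fails."""
    subprocess.run(["nvidia-settings", "-a", f"{attr}={value}"], check=True)

set_attr(f"[gpu:{GPU}]/GPUGraphicsClockOffset[{PERF_LEVEL}]", CORE_OFFSET)
set_attr(f"[gpu:{GPU}]/GPUMemoryTransferRateOffset[{PERF_LEVEL}]", MEM_OFFSET)
set_attr(f"[gpu:{GPU}]/GPUFanControlState", 1)         # take manual fan control
set_attr(f"[fan:{GPU}]/GPUTargetFanSpeed", FAN_SPEED)  # fixed fan speed
```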
Hopfgeist
Posts: 71
Joined: Thu Jul 09, 2020 12:07 pm
Hardware configuration: Dell T420, 2x Xeon E5-2470 v2, NetBSD 10, SunFire X2270 M2, 2x Xeon X5675, NetBSD 9; various other Linux/NetBSD PCs, Macs and virtual servers.
Location: Germany

Re: Linux install guide

Post by Hopfgeist »

MeeLee wrote: 1- Overclocking doesn't affect power consumption.
Yes, it does. It's simple physics. It's also why it is more energy-efficient on half-loaded systems to have them run full-time at half clock speed rather than half the time at full clock speed and stopped the other half. It's the raison d'être for SpeedStep and PowerNow and all their descendants. Since the basic physics is the same for GPUs, all of that is true for them as well; see for example this paper.
There are ONLY benefits from correctly using Coolbits.
[...]
And I haven't had a single bad WU in well over 6 months of running (as a result of a bad overclock).
I see: all your bad work units are the result of a good overclock.
I occasionally hit a bad WU, but I'm already familiar with what is possible, and apply a much milder OC nowadays. (+115 MHz only triggers bad WUs on my system when the weather is unusually hot outside, like over 100 °F).
So you are basically saying "I don't get bad work units, unless I do."

Cheers,
HG.
Dell PowerEdge T420: 2x Xeon E5-2470 v2
MeeLee
Posts: 1375
Joined: Tue Feb 19, 2019 10:16 pm

Re: Linux install guide

Post by MeeLee »

Hopfgeist wrote:
MeeLee wrote: 1- Overclocking doesn't affect power consumption.
Yes, it does. It's simple physics. It's also why it is more energy-efficient on half-loaded systems to have them run full-time at half clock speed rather than half the time at full clock speed and stopped the other half. It's the raison d'être for SpeedStep and PowerNow and all their descendants. Since the basic physics is the same for GPUs, all of that is true for them as well; see for example this paper.
There are ONLY benefits from correctly using Coolbits.
[...]
And I haven't had a single bad WU in well over 6 months of running (as a result of a bad overclock).
I see: all your bad work units are the result of a good overclock.
I occasionally hit a bad WU, but I'm already familiar with what is possible, and apply a much milder OC nowadays. (+115 MHz only triggers bad WUs on my system when the weather is unusually hot outside, like over 100 °F).
So you are basically saying "I don't get bad work units, unless I do."

Cheers,
HG.
I don't think you know what you're talking about.
Nvidia GPUs don't add a single watt when overclocking.
The 'overclocking procedure' is still limited by the power target set by the manufacturer, or by the user. Only if you move the power target will the GPU use more power.
As for the rest of your post, I'm not sure what to make of it...
Hopfgeist
Posts: 71
Joined: Thu Jul 09, 2020 12:07 pm
Hardware configuration: Dell T420, 2x Xeon E5-2470 v2, NetBSD 10, SunFire X2270 M2, 2x Xeon X5675, NetBSD 9; various other Linux/NetBSD PCs, Macs and virtual servers.
Location: Germany

Re: Linux install guide

Post by Hopfgeist »

MeeLee wrote: Nvidia GPUs don't add a single watt when overclocking.
Yes, they do. Otherwise Nvidia would break the laws of physics.

What consumes power in modern semiconductor circuits are state transitions, so the more transitions you have per second, the more power you consume. In addition, to drive higher clock rates reliably you need higher voltages, which increases the power required for each transition. When you are already running at the highest stable clock rate for a given voltage, all other things being equal, and you want to increase it further, the transition-based part of power consumption rises roughly with the third power of frequency: once for the number of transitions per second, once for the voltage, and once for the current, which rises roughly linearly with voltage.
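
(In symbols, the standard CMOS dynamic-power approximation behind this argument reads as follows; the quantities are generic textbook symbols, not figures for any particular card.)

```latex
% Dynamic (switching) power of a CMOS circuit:
%   \alpha = activity factor, C = switched capacitance,
%   V = supply voltage, f = clock frequency
P_{\mathrm{dyn}} \approx \alpha\, C\, V^{2} f

% To stay stable at a higher f, V must rise roughly in proportion
% (V \propto f), so the switching power scales as
P_{\mathrm{dyn}} \propto f \cdot f^{2} = f^{3}
```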

There are hundreds of research papers (at least dozens of them freely available) explaining all this to anyone with a basic grasp of physics.

Now, when NVidia specifies the design power consumption higher than what the card needs in "normal" operation, so that it does not need to be up-rated for mild overclocking, that might give the impression that the card does not need more power when running at a higher frequency. But that is mostly about having a sufficient safety margin in operation, and about occasionally increasing the clock frequency even in non-overclocked operation, when temperature and power margins allow it.

TL;DR: when go faster, need more watts!
As for the rest of your post, I'm not sure what to make of it...
Well, can't help you there.


Cheers,
HG
Dell PowerEdge T420: 2x Xeon E5-2470 v2
HaloJones
Posts: 920
Joined: Thu Jul 24, 2008 10:16 am

Re: Linux install guide

Post by HaloJones »

If overclocking doesn't increase power usage, does underclocking also not reduce power usage? GPU Boost adds voltage automatically as it boosts up to the limit specified in the BIOS, and this, along with a faster frequency, makes a huge difference to power consumption. Add an overclock setting and that frequency is increased, which draws more power. I genuinely don't see how this is in question...

Coolbits-induced overclocks increase your WU throughput, and so long as it is a truly stable overclock it is of benefit, but at the cost of more power consumption. One of the problems with FAH work is that it is not static: different work units induce different stress levels. What was stable for one core and one work unit may not work for all cores and all work units. FWIW, I overclock too, with water-cooled cards. My failure rate is a fraction of 1%, and of course it's near impossible to determine whether those failures were down to my clocks, my cards, or alien interference.
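
(If anyone wants to settle the power question on their own hardware, a minimal logging sketch along these lines can be run with and without the clock offset and the two logs compared. It assumes an Nvidia card with the proprietary driver and nvidia-smi on the PATH.)

```python
#!/usr/bin/env python3
"""Log reported power draw, SM clock and temperature once per second."""
import subprocess
import time

FIELDS = "power.draw,clocks.sm,temperature.gpu"

while True:
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={FIELDS}",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    # One line per GPU, e.g. "212.43, 1905, 64"
    print(time.strftime("%H:%M:%S"), out)
    time.sleep(1)
```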
single 1070

Ichbin3
Posts: 96
Joined: Thu May 28, 2020 8:06 am
Hardware configuration: MSI H81M, G3240, RTX 2080Ti_Rev-A@220W, Ubuntu 18.04
Location: Germany

Re: Linux install guide

Post by Ichbin3 »

Practice beats theory ;- )

I have a power limit of 220 W on my 2080 Ti.
Now I tried an offset of +60 MHz for several WUs.
The power consumption did not change, because of the limit.
The frequency went higher.
But the output in points per WU also did not change; I compared the same projects, 17420 and 17421, before and after.
The only difference I can make is by increasing the power limit.
Mysterious ...
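
(For anyone wanting to repeat this comparison: the limit itself is normally queried and changed with nvidia-smi, within the minimum/maximum bounds the driver reports. The sketch below is a rough illustration that assumes GPU index 0 and root privileges for the write; the 220 W value is just the figure mentioned above.)

```python
#!/usr/bin/env python3
"""Query the allowed power-limit range, then set a new cap via nvidia-smi."""
import subprocess

def query(fields: str) -> str:
    """Return the comma-separated values nvidia-smi reports for GPU 0."""
    return subprocess.run(
        ["nvidia-smi", "-i", "0", f"--query-gpu={fields}",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()

min_w, max_w = (float(v) for v in query("power.min_limit,power.max_limit").split(","))
print(f"Allowed power-limit range: {min_w:.0f}-{max_w:.0f} W")

# Set a 220 W cap (must lie within the range printed above; needs root).
subprocess.run(["sudo", "nvidia-smi", "-i", "0", "-pl", "220"], check=True)
```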
Last edited by Ichbin3 on Thu Nov 05, 2020 1:16 pm, edited 1 time in total.
MSI H81M, G3240, RTX 2080Ti_Rev-A@220W, Ubuntu 18.04
Hopfgeist
Posts: 71
Joined: Thu Jul 09, 2020 12:07 pm
Hardware configuration: Dell T420, 2x Xeon E5-2470 v2, NetBSD 10, SunFire X2270 M2, 2x Xeon X5675, NetBSD 9; various other Linux/NetBSD PCs, Macs and virtual servers.
Location: Germany

Re: Linux install guide

Post by Hopfgeist »

Ichbin3 wrote:Practice beats theory ;- )

I have a power limit of 220 W on my 2080 Ti.
Now I tried an offset of +60 MHz for several WUs.
The power consumption did not change, because of the limit.
The frequency went higher.
But the output in points per WU also did not change; I compared the same projects, 17420 and 17421, before and after.
Mysterious ...
Not at all mysterious. All modern GPUs have internal boost and throttling algorithms. If you set a power limit, the card will throttle back to stay within limits, even if the nominal frequency is higher. Conversely, if "boost" (or whatever it is called) is enabled, it will occasionally increase the frequency for short periods of time if more power is available and temperatures are within limits. Most modern CPUs also do this.

If you get almost exactly the same power and the same performance even if nominally "overclocked", it just means its power and boost management is pretty smart. Which goes back to my original point: if you don't give it more power (watts), a higher nominal clock rate won't give you more performance (PPD), and might even give you less, depending on workload and power management algorithms.

HG.
Dell PowerEdge T420: 2x Xeon E5-2470 v2
Ichbin3
Posts: 96
Joined: Thu May 28, 2020 8:06 am
Hardware configuration: MSI H81M, G3240, RTX 2080Ti_Rev-A@220W, Ubuntu 18.04
Location: Germany

Re: Linux install guide

Post by Ichbin3 »

Hopfgeist wrote:If you get almost exactly the same power and the same performance even if nominally "overclocked", it just means its power and boost management is pretty smart. Which goes back to my original point: if you don't give it more power (watts), a higher nominal clock rate won't give you more performance (PPD), and might even give you less, depending on workload and power management algorithms.
For sure it supports your point.
Ichbin3 wrote:Mysterious ...
I had an ironic smile on my face, which you couldn't see, when I wrote this.
MSI H81M, G3240, RTX 2080Ti_Rev-A@220W, Ubuntu 18.04
Hopfgeist
Posts: 71
Joined: Thu Jul 09, 2020 12:07 pm
Hardware configuration: Dell T420, 2x Xeon E5-2470 v2, NetBSD 10, SunFire X2270 M2, 2x Xeon X5675, NetBSD 9; various other Linux/NetBSD PCs, Macs and virtual servers.
Location: Germany

Re: Linux install guide

Post by Hopfgeist »

Ichbin3 wrote: I had an ironic smile on my face, which you couldn't see, when I wrote this.
Sorry for missing that. I suspected it, but emotion is hard to catch in the written word.

:wink: <- This would have helped to make it more obvious.
Dell PowerEdge T420: 2x Xeon E5-2470 v2
MeeLee
Posts: 1375
Joined: Tue Feb 19, 2019 10:16 pm

Re: Linux install guide

Post by MeeLee »

If you overclock, you basically use the tolerance Nvidia has built in.
It's the same as running at stock frequency but undervolted: running the same speed at a lower wattage.

If you cap the power on GPUs, the GPU frequency automatically lowers.
If you lower the GPU max frequency, it will also lower the power usage.
There's a tolerance Nvidia built in of about 100 MHz (some GPUs have more, some less) where the GPU is just wasting watts, and you can either overclock with those watts or limit the power for the same frequency.
Nvidia GPUs aren't like Intel or AMD CPUs.

Practice proves all those theories indeed incorrect.
Hopfgeist
Posts: 71
Joined: Thu Jul 09, 2020 12:07 pm
Hardware configuration: Dell T420, 2x Xeon E5-2470 v2, NetBSD 10, SunFire X2270 M2, 2x Xeon X5675, NetBSD 9; various other Linux/NetBSD PCs, Macs and virtual servers.
Location: Germany

Re: Linux install guide

Post by Hopfgeist »

MeeLee wrote: There's a tolerance Nvidia built in of about 100 MHz (some GPUs have more, some less) where the GPU is just wasting watts, and you can either overclock with those watts or limit the power for the same frequency.
I suggest taking some lessons in physics and semiconductor circuit operation before embarrassing yourself some more.

The GPU is not wasting power. It is simply not using all the margin available. If you can increase the frequency without running into the limit, then it was operating below the limit before the change.
Nvidia GPUs aren't like Intel or AMD CPUs.
:roll: So the laws of physics don't apply to NVidia?
Practice proves all those theories indeed incorrect.
Have you measured the power consumption? Then explain your technique and show the results. Otherwise this whole point is moot.

I think you don't understand the difference between a power cap (what you adjust in the settings) and the power actually consumed.
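
(That distinction is easy to see directly: the driver reports the configured cap and the instantaneous draw as separate fields. A small sketch, assuming nvidia-smi from the proprietary driver is available:)

```python
#!/usr/bin/env python3
"""Print the configured power cap next to the power actually being drawn."""
import subprocess

out = subprocess.run(
    ["nvidia-smi",
     "--query-gpu=power.limit,enforced.power.limit,power.draw",
     "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
).stdout.strip()

for idx, line in enumerate(out.splitlines()):
    limit, enforced, draw = (field.strip() for field in line.split(","))
    print(f"GPU {idx}: cap {limit} (enforced {enforced}), currently drawing {draw}")
```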

And they're not "all those theories"; they are the laws of nature, and the science that is the basis for engineering these marvellous devices.

And then there's the marketing department, implying you can get more performance without using more power. You can't, unless you change the architecture or run at undervoltage, risking data corruption.


HG.
Dell PowerEdge T420: 2x Xeon E5-2470 v2