Plotting Overall Folding System Efficiency

gordonbb
Posts: 510
Joined: Mon May 21, 2018 4:12 pm
Hardware configuration: Ubuntu 22.04.2 LTS; NVidia 525.60.11; 2 x 4070ti; 4070; 4060ti; 3x 3080; 3070ti; 3070
Location: Great White North

Re: Plotting Overall Folding System Efficiency

Post by gordonbb »

rwh202 wrote:
ProDigit wrote:This is only true when they're running from a PCIe x16 slot.
If they're running from a PCIe x1/x4 slot, the power reading is incorrect, as the card gets additional power from a riser.
In my case, both 1060s are using between 100 and 110 W, deduced from a Kill A Watt meter; yet GPU-Z reports 107 W for the one plugged into the PCIe x16 slot, and 63 W for the one plugged into the PCIe x1 slot.
I'd be surprised if risers affected anything (other than slowing the cards down and reducing power consumption)
The power sensors are on the card and don't care whether the power comes from the slot, the riser or 6/8-pin power connectors. However, I'm not sure if they are pre- or post-VRM and whether the losses there are included.
The difference between the Kill A Watt and GPU-Z is likely down to PSU efficiency.
107 + 63 = 170 W total for the GPUs. Assuming 85% PSU efficiency, that is 200 W from the wall, as shown by the Kill A Watt.
The power draw on the card includes the VRMs and the fans. I suspect that this is what the 0.005 Ohm shunt resistors that overclockers like to bridge are used for. Note that overclockers typically only modify the shunt resistors for the external power connections and NOT the PCIe power leads, as that would likely lead to melted traces and/or melted wires on the 24-pin motherboard connector and, in some cases, has been known to cause the main motherboard power connector to catch fire.

Shunt mods are not recommended in general and are certainly not worth it for folding: the gains would likely be minimal, and they risk killing the card or motherboard or greatly shortening its useful life.
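
For anyone who wants to redo the wall-versus-sensor arithmetic quoted above with their own readings, here is a minimal sketch. It assumes a single flat PSU efficiency figure (the 85% from the quote) and ignores the rest of the system's draw; the function name is made up for illustration.

```python
def wall_power(dc_watts, psu_efficiency):
    """Estimate AC power at the wall for a given DC load on the PSU."""
    if not 0 < psu_efficiency <= 1:
        raise ValueError("psu_efficiency must be in (0, 1]")
    return dc_watts / psu_efficiency

# GPU-Z readings from the quoted post: 107 W and 63 W.
gpu_total = 107 + 63                     # sum of the card-sensor readings
estimate = wall_power(gpu_total, 0.85)   # assumed flat 85% PSU efficiency
print(f"{gpu_total} W DC -> about {estimate:.0f} W at the wall")
```

Real PSU efficiency varies with load, so treat the result as a rough cross-check rather than a measurement.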
toTOW
Site Moderator
Posts: 6309
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France
Contact:

Re: Plotting Overall Folding System Efficiency

Post by toTOW »

I wonder why there are spikes in your curves ... maybe the power measurement happened just when a sanity check was done, which results in no GPU load, but also no change in PPD ...

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Plotting Overall Folding System Efficiency

Post by bruce »

toTOW wrote:I wonder why there are spikes in your curves ... maybe the power measurement happened just when a sanity check was done, which results in no GPU load, but also no change in PPD ...
In most cases, the sanity check uses both GPU resources and CPU resources, comparing the results. The power spike is probably when the CPU kicks into high gear.
toTOW
Site Moderator
Posts: 6309
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France
Contact:

Re: Plotting Overall Folding System Efficiency

Post by toTOW »

Current sanity checks are not like they used to be: they no longer use all CPU threads for a few seconds while the GPU sleeps. They just leave the GPU idle for a few seconds with no additional CPU load.

Here's an example I captured:
Image
And I can clearly hear the laptop fans slow down and spin faster again ...

If gordonbb captured data at this time, he would see normal PPD but very low power draw ... hence a spike in efficiency.
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Plotting Overall Folding System Efficiency

Post by bruce »

Then how does the "reference platform" complete its calculations? Presumably it replaces spin-wait CPU cycles with CPU double-precision calculations, which might or might not be observable as a change in CPU heating while the GPU is idle. It also might look like the writing of a checkpoint.
gordonbb
Posts: 510
Joined: Mon May 21, 2018 4:12 pm
Hardware configuration: Ubuntu 22.04.2 LTS; NVidia 525.60.11; 2 x 4070ti; 4070; 4060ti; 3x 3080; 3070ti; 3070
Location: Great White North

Re: Plotting Overall Folding System Efficiency

Post by gordonbb »

toTOW wrote:I wonder why there are spikes in your curves ... maybe the power measurement happened just when a sanity check was done, which results in no GPU load, but also no change in PPD ...
I’m using SNMP to grab the output power from the UPS and a call to FAHClient to read the PPD. Though the interval is the same (1 minute), the two processes will invariably sample at different points within the window and, as Bruce noted, the GPU will drop in power if it’s being fed by the CPU.

I’ve tried to smooth things out by using a 3-minute sample average, but there are still spikes. For this metric, though, I think looking at the calculated average over at least a couple of days, to smooth out variations in WUs, will give the best indication of efficiency.

Once I have a better baseline I want to start lowering the power limits to see how the efficiency changes. I suspect that, due to the Quick Return Bonus, once you’re under the top-end knee it may be fairly linear.
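
To illustrate why a 3-sample average tames a spike but does not remove it, here is a rough sketch. The numbers are invented, and it assumes the averaging is applied to the PPD and power series before dividing, which may not match the actual setup described above.

```python
from collections import deque

def rolling_efficiency(ppd_samples, watt_samples, window=3):
    """Return PPD per watt computed from rolling means of both series."""
    ppd_win = deque(maxlen=window)
    watt_win = deque(maxlen=window)
    out = []
    for ppd, watts in zip(ppd_samples, watt_samples):
        ppd_win.append(ppd)
        watt_win.append(watts)
        mean_ppd = sum(ppd_win) / len(ppd_win)
        mean_watts = sum(watt_win) / len(watt_win)
        out.append(mean_ppd / mean_watts)
    return out

# Hypothetical one-minute samples: steady PPD, with one power reading
# that happened to land while the GPU was briefly idle.
ppd = [1_000_000] * 5
watts = [400, 400, 200, 400, 400]

raw = [p / w for p, w in zip(ppd, watts)]
smoothed = rolling_efficiency(ppd, watts, window=3)
print("raw:     ", [round(x) for x in raw])
print("smoothed:", [round(x) for x in smoothed])
```

With these made-up values the single low power reading doubles the raw ratio for one sample, while the windowed version shows a smaller bump spread over three samples. A longer window, or the multi-day average mentioned above, flattens it further.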
gordonbb
Posts: 510
Joined: Mon May 21, 2018 4:12 pm
Hardware configuration: Ubuntu 22.04.2 LTS; NVidia 525.60.11; 2 x 4070ti; 4070; 4060ti; 3x 3080; 3070ti; 3070
Location: Great White North

Re: Plotting Overall Folding System Efficiency

Post by gordonbb »

toTOW wrote:Current sanity checks are not like they used to be: they no longer use all CPU threads for a few seconds while the GPU sleeps. They just leave the GPU idle for a few seconds with no additional CPU load.

Here's an example I captured:
Image
And I can clearly hear the laptop fans slow down and spin faster again ...

If gordonbb captured data at this time, he would see normal PPD but very low power draw ... hence a spike in efficiency.
Interesting. I’m usually running nvidia-smi -q in a terminal window for each GPU, and I’ve noticed these pauses where the GPU clocks drop every few minutes, but I assumed these were the result of a new frame being transferred.
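
If watching nvidia-smi -q by eye gets tedious, the same readings can be pulled in a script. This is only a sketch: the helper names and the sample line in the comment are made up, and it assumes the driver reports power.draw for the card (some cards return "[N/A]", which this parser would reject).

```python
import subprocess

# Fields understood by nvidia-smi's --query-gpu option.
QUERY_FIELDS = "index,power.draw,clocks.sm"

def parse_gpu_line(line):
    """Parse one CSV line such as '0, 151.20, 1905' into a dict."""
    index, power, clock = [field.strip() for field in line.split(",")]
    return {
        "index": int(index),
        "power_w": float(power),      # watts, since nounits strips the suffix
        "sm_clock_mhz": int(clock),   # MHz
    }

def query_gpus():
    """Run nvidia-smi once and return one dict per GPU."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return [parse_gpu_line(line)
            for line in result.stdout.splitlines() if line.strip()]
```

Calling query_gpus() in a loop with a short sleep and logging the results should make the periodic clock drops visible as dips in sm_clock_mhz and power_w.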
ProDigit
Posts: 242
Joined: Sun Dec 09, 2018 10:23 pm

Re: Plotting Overall Folding System Efficiency

Post by ProDigit »

gordonbb wrote:
rwh202 wrote:
ProDigit wrote:This is only true when they're running from a PCIe x16 slot.
If they're running from a PCIe x1/x4 slot, the power reading is incorrect, as the card gets additional power from a riser.
In my case, both 1060s are using between 100 and 110 W, deduced from a Kill A Watt meter; yet GPU-Z reports 107 W for the one plugged into the PCIe x16 slot, and 63 W for the one plugged into the PCIe x1 slot.
I'd be surprised if risers affected anything (other than slowing the cards down and reducing power consumption)
The power sensors are on the card and don't care whether the power comes from the slot, the riser or 6/8-pin power connectors. However, I'm not sure if they are pre- or post-VRM and whether the losses there are included.
The difference between the Kill A Watt and GPU-Z is likely down to PSU efficiency.
107 + 63 = 170 W total for the GPUs. Assuming 85% PSU efficiency, that is 200 W from the wall, as shown by the Kill A Watt.
The power draw on the card includes the VRMs and the fans. I suspect that this is what the 0.005 Ohm shunt resistors that overclockers like to bridge are used for. Note that overclockers typically only modify the shunt resistors for the external power connections and NOT the PCIe power leads, as that would likely lead to melted traces and/or melted wires on the 24-pin motherboard connector and, in some cases, has been known to cause the main motherboard power connector to catch fire.

Shunt mods are not recommended in general and are certainly not worth it for folding: the gains would likely be minimal, and they risk killing the card or motherboard or greatly shortening its useful life.
I found out that the PCIe riser cards only provide 35 W to the GPU, while the remaining power comes from the connector on the top.
On one of the cards, the readout was low because the riser was plugged into the PCIe voltage rail and the card was tapped off a SATA port.

I realized that when I switched the connectors, the power consumption went up to 80 W, and performance increased.
ProDigit
Posts: 242
Joined: Sun Dec 09, 2018 10:23 pm

Re: Plotting Overall Folding System Efficiency

Post by ProDigit »

The dips in power draw are less than 1 second on my system, and usually decrease in time (width) with lower RAM overclock.
I doubt they're responsible for the efficiency spikes.
I think PPD in FAHControl updates every so many fractions of a percent. The spikes might indicate times when FAH updates the estimated PPD, because it just finished processing 1% of a WU, or something...
I mean, it's another possible explanation for the spikes.
toTOW
Site Moderator
Posts: 6309
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France
Contact:

Re: Plotting Overall Folding System Efficiency

Post by toTOW »

bruce wrote:Then how does the "reference platform" complete its calculations? Presumably it replaces spin-wait CPU cycles with CPU double-precision calculations, which might or might not be observable as a change in CPU heating while the GPU is idle. It also might look like the writing of a checkpoint.
Good question ... I think that since we moved to mixed-precision calculations, the portion of the sanity check done on the CPU has been reduced: either it is partly done on the GPU using double precision, or the number of checks needed has been reduced.

I don't know the OpenMM code well enough to say.
foldy
Posts: 2061
Joined: Sat Dec 01, 2012 3:43 pm
Hardware configuration: Folding@Home Client 7.6.13 (1 GPU slots)
Windows 7 64bit
Intel Core i5 2500k@4Ghz
Nvidia gtx 1080ti driver 441

Re: Plotting Overall Folding System Efficiency

Post by foldy »

@ProDigit: SATA only delivers 54 watts; PCIe or Molex can do over 100 watts. Some say it is riskier to power a riser from SATA than from Molex because the SATA connector burns more easily. So they use a Molex to 6-pin adapter, where the 6-pin goes into the riser and the Molex connects to the power supply's Molex lead.
https://i.imgur.com/Xg2wvF1.png
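
For reference, the 54 W figure falls out of the connector ratings. This sketch assumes the commonly cited numbers for a SATA power plug: three pins on the 12 V rail at 1.5 A each.

```python
# Assumed SATA power connector ratings (commonly cited, not measured here).
PINS_12V = 3         # pins carrying the 12 V rail
AMPS_PER_PIN = 1.5   # rated current per pin
VOLTS = 12

sata_12v_watts = PINS_12V * AMPS_PER_PIN * VOLTS
print(f"SATA 12 V budget: {sata_12v_watts:.0f} W")
```

That is a ceiling for the connector as a whole, so a riser feeding a card that pulls anywhere near it leaves very little margin.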