Too big units, comparing to just finished ones

Moderators: Site Moderators, FAHC Science Team

Joe_H
Site Admin
Posts: 7856
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Too big units, comparing to just finished ones

Post by Joe_H »

Trokari wrote:How about being able to dynamically distribute the required simulations into smaller packages? I don't know at all if it's possible to implement in FAH's case, though. It would be nice if you could choose the WU size you prefer from the client's UI, and the work server would then be able to split the calculations into the requested size and go from there.

There could be certain fixed work unit sizes that folders would be able to choose from. This wouldn't cause a problem with the WU time sensitivity either; it would actually probably improve on getting calculations finished in time, as fewer work units would go to waste due to expiration, at least for intermittent folders like myself.
It is not currently possible to dynamically distribute the WUs into smaller sizes. Each WU is from a specific trajectory - the Project, Run and Clone numbers - run for so many steps from the starting point of that trajectory. The Gen 0 WU starts from the beginning, and when it is returned the next Gen 1 WU is created using the final state of Gen 0 as its starting point. The same atoms will be in the simulation, so the size can't be changed there, and the final results for each PRC trajectory depend on piecing together Gen 0 through Gen n. That is easily done with a fixed number of steps; generating varying numbers of steps would require major changes to the software and to the analysis code used on the results.

Something similar was tried some years ago. This was referred to as a "streaming core". It would assign work; when folding was stopped, the results so far would be sent back and the WU would be reassigned to someone else to continue from that point. They ran into a number of problems and eventually abandoned this approach.

There have been some other things done in the past to provide smaller assignments such as the NaCl one for Google Chrome or the one developed by Sony in collaboration with F@h for their Android phones. Both of those were active for a few years, but both ended when Google and Sony ended their support for the necessary technical items.

Going forward, F@h has limitations on just how much software can be developed. For years they had just a single full-time developer; a second was recently hired to work on developing the next version of the client.

Now, you did not provide any details on what you are running F@h on in either of your posts. With more detail we could make suggestions that might get you a more appropriate mix of WUs, with more of them likely to finish in the time you are providing.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
appepi
Posts: 25
Joined: Wed Mar 18, 2020 2:55 pm
Hardware configuration: HP Z600 (5) HP Z800 (3) HP Z440 (3)
ASUS Turbo GTX 1060, 1070, 1080, RTX 2060 (3)
Dell GTX 1080
Location: Sydney Australia

Re: Too big units, comparing to just finished ones

Post by appepi »

A related topic, which will become increasingly important because of rising electricity costs, is the inability to manage costs by scheduling folding on/off by time of day. In 2020 I racked up 115 M points from my collection of 10 HP Z-series workstations, but 90% were generated by a single GTX 1060 (6 GB) in a Z600, so I shut everything else down. After running the GPU 24/7 for most of the year I realised it was going to be a very expensive process to replace a GPU amidst the crypto craze, so I shut it down too. Lately (May 2022) the crypto crash has brought prices down, so I have picked up an RTX 2060, a GTX 1070, and three GTX 1080s on eBay, and I am using the big old boxes as well-ventilated platforms with big power supplies to run GPUs only. They are de-tuned to a target temperature of 70 degrees and collectively generate about 4.5M PPD when run 24/7, using about 100 W each (over and above the basic cost of powering the box they are in).

But the monthly power bill has already gone up a fair bit, and I am not looking forward to the next one. My power plan charges very different rates for peak/shoulder/off-peak usage. Obviously I can walk around pausing and shutting them down, but an app to let me set up "off peak" running would help me keep folding, and I assume others would benefit from this also.
PaulTV
Posts: 179
Joined: Mon Jan 25, 2021 4:53 pm
Location: Netherlands

Re: Too big units, comparing to just finished ones

Post by PaulTV »

You could schedule fahclient --send-pause, fahclient --send-finish and fahclient --send-unpause from the system scheduler.
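For anyone who wants a concrete starting point, here is a minimal sketch of two tiny batch files wrapping those commands. The install path is an assumption (the default on 64-bit Windows) and the file names are just examples; adjust both to your own setup.

rem fold-start.bat - resume folding at the start of the off-peak window
cd /d "C:\Program Files (x86)\FAHClient"
FAHClient --send-unpause

rem fold-finish.bat - let the current WU finish, then stop fetching new ones
cd /d "C:\Program Files (x86)\FAHClient"
FAHClient --send-finish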

Ryzen 5800X / RTX 4090 / Windows 11
Ryzen 5600X / RTX 3070 Ti / Ubuntu 20.04
Ryzen 5600 / RTX 3060 Ti / Windows 11
appepi
Posts: 25
Joined: Wed Mar 18, 2020 2:55 pm
Hardware configuration: HP Z600 (5) HP Z800 (3) HP Z440 (3)
ASUS Turbo GTX 1060, 1070, 1080, RTX 2060 (3)
Dell GTX 1080
Location: Sydney Australia

Re: Too big units, comparing to just finished ones

Post by appepi »

Argh! This is the third attempt at this reply. The first one must have taken too long to write, since it seems I was logged out. After I finished the second, the mouse fell off the table with the same result. So this will be very quick.

Thanks PaulTV, but I have never encountered the system scheduler in my travels, so I'd need more information. However, I need to do more than pause FAHClient to save power charges. Research today with a power meter for total device consumption and HWInfo for the GPU consumption (when folding with my "stay cool" settings) shows that my old Z600s and Z800s consume about 180 W when Windows 10 is sleeping with FAHClient paused, and waking everything up with only the GPUs folding adds just another 100-120 W. So I have to pause and shut down.

Thus I can choose to donate 6.2M PPD for 11 AUD per day if I run 24/7, or 2.3M PPD for 2.0 AUD per day if I only run 9 hours a day off peak. These are devices that spend most of their lives in retirement doing nothing, so I will switch them on at off-peak times to do a WU now and then, and abandon my unaffordable aspirations for team appepi to get above its latest position of #218 for June. As for the CPUs, even though they can run seriously large STATA16 jobs in less than an hour, as against 25 minutes for the latest devices, CPU folding costs me 33 AUD per million points, so that is OUT.
aetch
Posts: 447
Joined: Thu Jun 25, 2020 3:04 pm
Location: Between chair and keyboard

Re: Too big units, comparing to just finished ones

Post by aetch »

It's called the "Task Scheduler" and it's part of Windows.
Start -> Windows Administrative Tools -> Task Scheduler

I'm not sure if you can run commands directly from the scheduler, but I do know you can run batch files and PowerShell scripts from it.
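For example, something along these lines, run once from an administrator command prompt, would register the two batch files sketched earlier in the thread as daily tasks. The task names, times and the C:\Scripts path are placeholders; match them to your own tariff windows and file locations.

rem Resume folding when off-peak starts, and finish up well before the expensive window
schtasks /create /tn "FAH start off-peak" /sc daily /st 22:00 /tr "C:\Scripts\fold-start.bat"
schtasks /create /tn "FAH finish before peak" /sc daily /st 06:30 /tr "C:\Scripts\fold-finish.bat"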
Folding Rigs - None (25-Jun-2022)

appepi
Posts: 25
Joined: Wed Mar 18, 2020 2:55 pm
Hardware configuration: HP Z600 (5) HP Z800 (3) HP Z440 (3)
ASUS Turbo GTX 1060, 1070, 1080, RTX 2060 (3)
Dell GTX 1080
Location: Sydney Australia

Re: Too big units, comparing to just finished ones

Post by appepi »

Thanks aetch. I think the last time I wrote a .bat file was in 1992 for a 286. Unless you count Autocoder on an IBM 1401 around 1970? These days, unless it's clickable or Googleable or there's a YouTube video, it's above my pay grade. As far as I can tell, the Task Scheduler seems to be OK for scheduling things within a box that's running W10, but it looks like I'm going to have to go with the "one intelligent digit" solution for actually shutting down the power. It will work out OK. The "intelligent" bit deals with the tricky programming decisions, and the digit addresses the switching problem. The penalty function for burning watts outside the "off peak" zone is wide at the "switching off" end (0700-1400) and relatively gentle (at 1.5x the off-peak tariff). It's really only the 3x rate during the peak from 1400-2000 that I have to avoid, and I've set an alarm to tell me that switch-on time is 2200 hours.

For example, it's 4 am now, so there's 3 hours of "off peak" time to go. So now I check out the production line. In this room I have Z602/ASUS Turbo 1070, Z803/MSI Aero (Dell version) 1080 and Z805/ASUS Turbo 1080. They are helping the gas heater to hold the room around 20 deg C during the current burst of weather straight from Antarctica. The decision in all cases is to let them run, for review when I get up: they are either running jobs that will end before 0700, or will extend not too far beyond it.

Now I have to go upstairs to the frigid regions that are heated only by GPUs. Z603/1060 is quietly burning 280 W and only offering about 0.55 M PPD for it. This ASUS Turbo 1060 (6 GB) ran 24/7 through most of 2020 doing folding, so maybe it's a bit tired, or maybe it is just that this particular job is underpaid: it usually does about 0.8 M PPD.

Z441 has a Dell 1080 and is turning out 1.3M PPD, a bit higher than Z803 at the same settings (possibly PCIe 3 vs PCIe 2, DDR4 versus DDR3, Xeon E5-1650 v3 versus Xeon 5620), and has a job that will finish in half an hour. Z442 has the RTX 2060, does 1.9M PPD, and has 4 hours to run. The decision is the same.

In summary, thanks to the buffer zone provided by the "shoulder" tariffs, my switch-on/switch-off decisions are fairly simple and non-critical. It's really only a problem if one of them picks up a job just before I should switch it off, and then the best solution is to dump it and let someone else have it, rather than delay it. And now I can check if I got that signature thing right.
gunnarre
Posts: 567
Joined: Sun May 24, 2020 7:23 pm
Location: Norway

Re: Too big units, comparing to just finished ones

Post by gunnarre »

Due to the time-critical nature of Folding@Home (PPD drops the longer you pause a WU, and it eventually times out and expires) and the unpredictable processing times (each project has a different running time on your hardware, which might even vary a bit between WUs), Folding@Home doesn't lend itself so well to temperature-dependent regulation. Folding@Home should be more like the baseload heating, and you'd use regular thermostat-controlled heating to shave the peaks and valleys in temperature needs. Preferably, you shouldn't send "send-pause" commands but "send-finish" commands to the client to regulate time of folding.

The newest LAR Systems dark mode Chrome extension for Folding@Home has a rudimentary time-of-day setting which you can use from the browser, instead of setting up scripts in the OS. This time-of-day setting issues a "start" and "finish" command at the specified time, but if the WU is long-running, then it might never actually pause; it's more tuned to finish shorter-running WUs.
Online: GTX 1660 Super, GTX 1080, GTX 1050 Ti 4G OC, RX580 + occasional CPU folding in the cold.
Offline: Radeon HD 7770, GTX 960, GTX 950
appepi
Posts: 25
Joined: Wed Mar 18, 2020 2:55 pm
Hardware configuration: HP Z600 (5) HP Z800 (3) HP Z440 (3)
ASUS Turbo GTX 1060, 1070, 1080, RTX 2060 (3)
Dell GTX 1080
Location: Sydney Australia

Re: Too big units, comparing to just finished ones

Post by appepi »

Thanks gunnarre. Since I'm retired, and the current "team" of 5 workstations whose GPUs do the folding have no other routine demand on them, it is no big deal to switch on 5 boxes at 10pm when "off peak" starts, and see where they've each gotten to at 7am (when "off peak" turns into "shoulder" rates) or whenever I get up. Because they are all running only GPU jobs, it is rare for a WU to have an ETA more than a few hours later. "Peak" doesn't start on my electricity plan until 2pm, so any ETA less than 7 hours at 7am can go to "Finish up then stop" status, and I might add a sticker or set an alarm with the hour I need to check and shut down the box to avoid idling charges. This means they will run at "shoulder" rates that are about 50% more expensive, but only for a while, and will (a) not delay dependent processing, (b) use minwatts and (c) collect maxpoints.

The cute thing about this has been that I also switch off screens and lights and so on, and the net effect of the folding-related attention to detail is that most days the savings relative to past careless consumption outweigh the cost of folding! Thus over the last fortnight the 38 Mpoints have cost AUD 15 cents per M (USD or EUR 10 cents), plus some extra wear and tear on switches maybe, set against the exercise benefits of climbing stairs.

More generally, since FAHControl is calculating an ETA, freestanding boxes not subject to other concurrent use and with only a single slot have predictable finish times of day for the current job. This could easily be combined with time-of-day rate schedules to issue "Finish up and Stop" requests in FAHControl at a suitable time (say 30 mins earlier, to prevent another WU download). This at least avoids being grit in the wheels of progress. But I guess my "use case" is a small minority and my workaround helps to pass the time.
Foxter
Posts: 26
Joined: Sat Apr 18, 2020 2:45 pm

Re: Too big units, comparing to just finished ones

Post by Foxter »

@appepi

I am part of a team that has more than 10 Nvidia folding graphics cards.

What I noticed is that older generation GTX 10xx and RTX 20x0 cards (but not the RTX 2080 Ti) can fold at the minimum power limit that can be set in MSI Afterburner.

For a 1660 Ti I got around 100,000,000 points at around 75 W (43% MSI Afterburner power limit).

For the RTX 2060/2060 Super/2070/2070 Super and 2080/2080 Super I set the power limit to minimum and got around 125 W power consumption.

However, for the RTX 2080 Ti I needed to leave it at a power limit of at least 70%, otherwise the PPD dropped a lot. As a general rule for all generations, the top cards (x080/x090) need at least a 70-75% power limit; the rest can in some cases work at the minimum power limit, or around 50-60%.

You should try to play a bit with MSI Afterburner; you might get similar results for your graphics cards. Besides the lower power bill, you will also see lower heat and extended graphics card life. In more than two years of almost 24/7 folding, my team has only lost one graphics card (however, we lost 2 power supplies, an EVGA G3 and a BitFenix Whisper).
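If you prefer the command line to MSI Afterburner, NVIDIA's nvidia-smi utility can apply a similar power cap. This is a different tool from the one discussed above, so treat it as a rough sketch only: run it from an administrator prompt, and the 125 W figure is purely an example to adjust for your own card.

rem Show the supported power limit range for GPU 0
nvidia-smi -i 0 -q -d POWER

rem Cap GPU 0 at 125 W (does not persist across reboots unless reapplied, e.g. via a scheduled task)
nvidia-smi -i 0 -pl 125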
appepi
Posts: 25
Joined: Wed Mar 18, 2020 2:55 pm
Hardware configuration: HP Z600 (5) HP Z800 (3) HP Z440 (3)
ASUS Turbo GTX 1060, 1070, 1080, RTX 2060 (3)
Dell GTX 1080
Location: Sydney Australia

Re: Too big units, comparing to just finished ones

Post by appepi »

Thanks Foxter, you have backed up and formalised some vague observations I made as I was tuning my cards down to run around 70 deg C. I use ASUS GPU Tweak II, since all but one (an MSI Aero 1080 made for Dell) are ASUS Turbos, chosen to fit the HP workstation cases. I just did a more formal test on that 1080. My base setting is to set the temperature target to 70 deg C and link it to the power setting, observe, then de-link power and adjust it separately if the temperature exceeds 70. This happened to produce a 62% power setting on the box I tested, and it was running around 120 W according to HWInfo. I tried the standard GPU Tweak "O/C" setting (temp 83, power 110%), which sent the temperature up to 83 and watts to around 145, and yielded only a 3% improvement in time per frame (TPF). Conversely, minimum power (around 60%) still produced a temperature of 68, watts around 90, and TPF only 4% worse. Clearly TPF and PPD on a 1080 aren't strongly linked to power above 60%.

Since my Z800s and the Z440s have only 2 x 75 W 6-pin PCIe connectors, while the Z600s have only a single 6-pin PCIe connector (but 650 W power supplies, and you can steal another 75 W from the SATA power), I am limited to cards needing at most an 8-pin PCIe connector, so the RTX 2060 is top of the line for me anyway. With the current temperature settings and manual management to keep most of the folding during "off peak" times, my last 80 Mpoints only cost a total of AUD $2.40 over previous careless usage. As an incidental observation, my smart meter has told my electricity supplier that 31% of my consumption over the last month of folding was because I was running a pool pump, and they helpfully suggested that I could de-tune it a bit. AI is a wonderful thing.
prcowley
Posts: 23
Joined: Thu Jan 03, 2019 11:03 pm
Hardware configuration: Op Sys: Linux Ubuntu Studio 21.04 LTS
Kernel: 5.11.0-37-lowlatency
Proc: AMD Ryzen 7 1700 - 8-core
Mem: 32 GB
GPU: Nvidia GeForce GTX 1080Ti
Storage: 2 TB
Location: Gisborne, New Zealand

Re: Too big units, comparing to just finished ones

Post by prcowley »

Hopefully I am not off base with this ...

There is a setting that can be added to the client under the Expert tab:
Name: max-packet-size
Value: small (or normal or big)

Although this option appears to be more about the size of the file downloaded, if you set it to small it might also mean the work unit is fairly small too, which could help with your completion times.

Worth a try and I hope it works for you.
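If you would rather edit the configuration file than use the Expert tab, the same setting presumably ends up as one line in FAHClient's config.xml; the fragment below is a guess at the exact syntax, so stop the client and back the file up before trying it.

<!-- inside the <config> ... </config> block of FAHClient's config.xml -->
<max-packet-size v='small'/>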
Pete Cowley, Gisborne, New Zealand. The first city to see the light of the new day. :D
appepi
Posts: 25
Joined: Wed Mar 18, 2020 2:55 pm
Hardware configuration: HP Z600 (5) HP Z800 (3) HP Z440 (3)
ASUS Turbo GTX 1060, 1070, 1080, RTX 2060 (3)
Dell GTX 1080
Location: Sydney Australia

Re: Too big units, comparing to just finished ones

Post by appepi »

Thanks prcowley. If anyone is off-base it is me, since the base of this thread was the problem you address: the "next cab off the rank" allocation of WUs leading to picking up giant jobs that are going to be a nuisance for one's hardware and manner of using it, and/or for the project that wants the WU done ASAP. Your suggestion directly addresses that, more or less, depending on the relationship between download size and run time, whereas my own input was about a subset of that problem: such jobs are not a problem for my hardware so long as I limit them to the GPUs when the devices are not needed at all, but as a self-funded retiree they are a problem for my wallet if the WU extends into peak periods where the electricity may cost 4x the off-peak rate per kWh. This is compounded if one is running CPUs as well as GPUs, because the same size of job will generally take much longer on the CPUs than the GPUs and the chances of them both ending at a convenient time are negligible; your suggestion might also help with that by using different settings for CPU and GPU slots.

In my own use case / digital ecosystem, with a mothball flotilla of elderly but respectable workstations living in my Residential Aged Computer Facility with only a limited number of tasks to do these days, I can give them some cheap healthy activity by limiting folding to 1 (power-limited) GPU per device, switching on at the start of "off peak" time at 10 PM, and having a slightly relaxed policy of letting jobs continue to completion as long as this is within the 1.5x rate of "shoulder" time ending at 2 PM the next day. With this scheme I have only had to dump one WU and delay another during July, and I don't have to get up in the small hours to check. This has increased my electricity bill by a mere AUD $0.06 per million points, or AUD $8 for the last 128 million points, relative to my previous careless use of electricity in the first 2 weeks of May before I resumed folding. The three Z800s also run LSI 9260-8i cards with BBUs for various RAID arrays of HDDs or SSDs, and this helps to keep the batteries in the BBUs charged and lets MegaRAID Storage Manager do its patrol reads on the 6 x 4TB HDDs in RAID 10 in Z805 (my backup device) without me having to think about it.
appepi
Posts: 25
Joined: Wed Mar 18, 2020 2:55 pm
Hardware configuration: HP Z600 (5) HP Z800 (3) HP Z440 (3)
ASUS Turbo GTX 1060, 1070, 1080, RTX 2060 (3)
Dell GTX 1080
Location: Sydney Australia

Re: Too big units, comparing to just finished ones

Post by appepi »

Team "appepi" annual report.

I have now been following the recipe in previous posts for 52 weeks, aiming to fold in "off peak" electricity rate times (9 hrs/day, from 2200 to 0700 the next day), while also finishing in "shoulder" times at about 1.5x the "off peak" rate if need be. Also, once team "appepi" reached the top 1K at the end of March I cut back to 3x GTX 10xx GPUs most of the time, occasionally topped up with one or more of the three RTX 2060s.

The overall result is 1,212.8 million points for the year, which added 19% in electricity cost to my pre-folding (careless) consumption, or $0.38 (AUD) per million points on average. As of July, we are being hit with a 20% increase in electricity charges, so the next 12 months will be much quieter.
Lazvon
Posts: 105
Joined: Wed Jan 05, 2022 1:06 am
Hardware configuration: 4080 / 12700F, 3090Ti/12900KS, 3090/12900K, 3090/10940X, 3080Ti/12700K, 3080Ti/9900X, 3080Ti/9900X

Re: Too big units, comparing to just finished ones

Post by Lazvon »

My power bill went from around $800-900/month (USD) to $1400 this last month (usage only up about 15% as summer starts). We had large rate increases and new special fees for the next decade or so to cover capital improvements. We also opt for 100% solar generation which we get to pay a 10% “privilege” for. Ha! Love my state’s politics.

Folding probably accounts for <10% of total power during summer months, and <15% during cooler months. Heat pump hardly runs in basement though. And gas furnace probably a bit less on first floor even. :)
Folding since Feb 2021. 1) 4090/12900KS, 2) 4080/12700F, 3) 4070Ti/9900X, 4) 3090/12900K, 5) 3090/10940X, 6) 3080Ti/12700K, 7) 3080Ti/9900X

appepi
Posts: 25
Joined: Wed Mar 18, 2020 2:55 pm
Hardware configuration: HP Z600 (5) HP Z800 (3) HP Z440 (3)
ASUS Turbo GTX 1060, 1070, 1080, RTX 2060 (3)
Dell GTX 1080
Location: Sydney Australia

Re: Too big units, comparing to just finished ones

Post by appepi »

Lazvon wrote: Sun Jun 18, 2023 11:32 am
Folding probably accounts for <10% of total power during summer months, and <15% during cooler months. Heat pump hardly runs in basement though. And gas furnace probably a bit less on first floor even. :)
As luck would have it, my electricity supplier recently decided to shut down the beautiful 4 x 500 MVA coal-fired power station I helped to build 50 years ago, which alongside other closures changed the cost structure vs time-of-day dynamics quite a lot. Back in the days when we produced artisanal electricity in the traditional way, by destroying the planet and making hundreds of tons of hot metal spin very fast, it was cost-effective to maximise the appallingly low efficiency of the overall process by keeping the beasts running steadily and managing the peaks by drawing on hydro, etc., which led to a cheap "off peak" rate overnight when other demands were low. But nowadays, when electricity production in Australia is turning into a cottage-roof industry, the relation to daylight is quite different, and this was reflected in my recent tariff changes. While the "peak" rate only increased by the widely announced 20 per cent or so [from roughly 50 to 60 cents/kWh], the "off peak" rates jumped by 63% to 24 cents/kWh and the "shoulder" rates by 43% to 31 cents/kWh. This new tech is better for the planet of course, but the collateral damage is that Team appepi has had to economise, and learn to tolerate having its fingers trodden on by teams sprinting past us up the ladder to which we cling desperately at the rate of 1-2 million PPD.

Incidentally, how can your supplier provide "100% solar" electricity? As far as Wikipedia can inform me, the non-Alaska continental US extends only from the West Quoddy Head lighthouse in Maine at 66 degrees 57 minutes west to Cape Alava in Washington at 124 degrees 44 minutes west. Thus, unlike the former British Empire, the sun DOES actually set on the US for a significant period each day. And, while these new-fangled batteries can no doubt help at these times, how can the supplier guarantee that all the electrons in them were responsibly sourced from only solar sources, without contributions to the grid from fossils, wind, water, or (shudder) TMI2-generation nuclear? Which reminds me that the report of the President's Commission on the TMI2 reactor incident declared that the root cause of the problem was failing to include the cost of adequate safety in the rate-setting process for electricity charges. So really, we are slowly learning, I guess.