Algorithm for distributing Work Units, statistical model

Moderators: Site Moderators, FAHC Science Team

wesley_olis
Posts: 1
Joined: Sat May 02, 2020 9:07 am

Algorithm for distributing Work Units, statistical model

Post by wesley_olis »

Hi,

I haven't found much published information on how work units are distributed, but I have a few ideas and possible improvements.

I would imagine that work units are dished out across all projects that have available work.
Could we decide the priority ourselves, and whenever there is nothing left in our preferred categories,
pick the next best work unit from any other category? Maybe this could be an optional checkbox feature.

Wasting the computational resources at your disposal obviously slows progress, especially if there are work units in other major projects that one is not currently supporting; it would be nice to determine the specific ordering of the projects one supports.

Imagine, for instance, having COVID-19 at the top of the list.
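
As a rough sketch of what I mean (the project categories and the available_work structure here are purely hypothetical, not the real F@H assignment server):

Code: Select all

# Minimal sketch of ranked project preferences with fallback.
# Categories and the available_work structure are hypothetical,
# not part of the real F@H assignment server.

PREFERENCES = ["COVID-19", "Cancer", "Alzheimer's", "Any"]

def pick_work_unit(available_work):
    """available_work: dict mapping category -> list of pending WUs."""
    for category in PREFERENCES:
        if category == "Any":
            # Fall back to the next best WU from any other category.
            for wus in available_work.values():
                if wus:
                    return wus.pop(0)
        elif available_work.get(category):
            return available_work[category].pop(0)
    return None  # nothing available anywhere

# Preferred categories are empty, so the "Any" fallback picks from another category.
work = {"COVID-19": [], "Parkinson's": ["WU-1842"]}
print(pick_work_unit(work))  # -> "WU-1842"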

Could someone also explain the progress of the projects on the project stats page?

Then I guess the other thing, hopefully already done in work unit assignment (I don't know the details), is that machines with six-sigma reliability of returning completed work, at the highest computation rates, are assigned high-priority workloads first, followed by those lower down. Ideally, you are looking to minimize the finishing time of all the workloads. You don't want one client to hold a workload that takes an additional 8 days to complete: if faster clients can statistically work off a set of work units in an expected 4 days by processing 8 work units, then there is no point giving a work unit to someone that would take 8 days; it just means you are wasting computation and slowing down the rate at which the project is completed.

So to be clear, the WU rate is a WU / time-in-seconds metric, not clock speed or anything else.
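
A rough sketch of that eligibility check, assuming the server tracks a per-client WU rate (all names and figures here are made up):

Code: Select all

# Sketch: only assign a WU to a client whose expected completion time
# beats the project's target turnaround. All figures are hypothetical.

def expected_completion_seconds(wu_cost_units, client_wu_rate):
    """client_wu_rate is WUs completed per second (the WU / time metric)."""
    return wu_cost_units / client_wu_rate

def eligible(client_wu_rate, wu_cost_units, target_turnaround_s):
    return expected_completion_seconds(wu_cost_units, client_wu_rate) <= target_turnaround_s

# A client that would take 8 days on a 4-day target is skipped.
four_days = 4 * 24 * 3600
fast_rate = 1 / (12 * 3600)      # one WU every 12 hours
slow_rate = 1 / (8 * 24 * 3600)  # one WU every 8 days
print(eligible(fast_rate, 1, four_days))  # True
print(eligible(slow_rate, 1, four_days))  # False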

I guess you could have an optimization algorithm running over the permutations of the set of all clients and configurations, which is then simplified into common aggregation buckets of WU rate, sigma (reliability), and work-rate min, max, average and standard deviation. Then, of all the WUs available, assign them to the different work-rate aggregation buckets, or assign each at most to the slowest acceptable work-rate bucket, since anything less than that just wastes resources and creates additional waiting time, as explained above. Every time new projects are added, re-run this permutation algorithm to update the distribution of project units across the work-rate aggregation buckets, ready for when WUs are requested. Basically, match resources to projects based on how reliably and efficiently they can complete the project in the least amount of time.
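
A toy sketch of the bucketing step (the bucket boundaries and client figures are invented for illustration):

Code: Select all

# Toy sketch: group clients into work-rate buckets and route each WU
# to the slowest bucket that can still meet its deadline.
# Bucket boundaries and client figures are invented for illustration.

from collections import defaultdict

# (lower bound on WU rate in WU/s, bucket name), slowest first
BUCKETS = [(1 / (8 * 86400), "slow"), (1 / 86400, "medium"), (1 / 21600, "fast")]

def bucket_for_rate(wu_rate):
    name = None
    for lower_bound, label in BUCKETS:
        if wu_rate >= lower_bound:
            name = label
    return name

def route_wu(deadline_s, wu_cost_units=1):
    """Pick the slowest bucket whose lower-bound rate still meets the deadline."""
    for lower_bound, label in BUCKETS:
        if wu_cost_units / lower_bound <= deadline_s:
            return label
    return None

clients = {"alpha": 1 / 3600, "beta": 1 / 80000, "gamma": 1 / (6 * 86400)}
by_bucket = defaultdict(list)
for name, rate in clients.items():
    by_bucket[bucket_for_rate(rate)].append(name)
print(dict(by_bucket))      # {'fast': ['alpha'], 'medium': ['beta'], 'slow': ['gamma']}
print(route_wu(2 * 86400))  # a 2-day deadline lands in the 'medium' bucket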

Obviously, you would only update client membership in the common aggregation buckets at a slower rate, or incrementally, promoting and demoting clients as required.

To take this further, each client should report the typical maximum, average and minimum power consumption of its chip, averaged across the cores in use, which is then also taken into account in the aggregation buckets. You could then compute two optimization dimensions: maximum horsepower, and the power consumption of the network, or strike a balance between the two, up to where the curve flattens out. The less CO2 emitted, the more you are saving the world.

To improve this further, ideally the client on the machine should be able to run the computations and then insert PWM-style idle cycles to reduce power consumption, measuring the system at different load levels at a granularity of 1%. That profile would also be submitted and maintained by the client, with occasional random re-benchmarks/samples to keep it up to date.
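
A very rough sketch of that sampling loop; the load-setting and power-reading functions are placeholders for whatever the platform actually exposes:

Code: Select all

# Rough sketch: profile power draw at different load levels.
# set_target_load() and read_power_watts() are placeholders; real
# implementations depend on the OS and hardware sensors available.
import random
import time

def set_target_load(percent):
    """Hypothetical: throttle worker threads / insert idle cycles."""
    pass

def read_power_watts():
    """Hypothetical: read package power from a hardware sensor."""
    return random.uniform(30.0, 150.0)

def profile_power(step_percent=1, settle_s=0.01):
    profile = {}
    for load in range(0, 101, step_percent):
        set_target_load(load)
        time.sleep(settle_s)  # let the system settle before sampling
        profile[load] = read_power_watts()
    return profile

def respot_check(profile, samples=5):
    """Randomly re-sample a few load points from time to time."""
    for load in random.sample(list(profile), samples):
        set_target_load(load)
        profile[load] = read_power_watts()
    return profile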

Having this information about the client machines, and the computation WU rate per unit of power consumed, you can then basically find the sweet spot that balances power and performance across the network.
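
For illustration, the sweet spot could be taken as the point where extra watts stop buying much extra WU rate (the numbers below are made up):

Code: Select all

# Sketch: pick the load level where the marginal WU rate per extra watt
# drops below a threshold. All numbers are made up for illustration.

# (load %, WU rate in WU/day, power in watts) from a hypothetical profile
profile = [(25, 2.0, 60), (50, 3.6, 95), (75, 4.4, 140), (100, 4.7, 200)]

def sweet_spot(profile, min_marginal_wu_per_watt=0.01):
    best = profile[0]
    for prev, cur in zip(profile, profile[1:]):
        marginal = (cur[1] - prev[1]) / (cur[2] - prev[2])
        if marginal < min_marginal_wu_per_watt:
            break
        best = cur
    return best

print(sweet_spot(profile))  # -> (75, 4.4, 140): the last step before the curve flattens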

In the long run, what this naturally results in is that old, slow, power-inefficient clients will no longer get work, as they can no longer meet the baseline WU rate per unit of power consumption, which means they would just be doing environmental damage.

If you would like to take this further, you could add the geographic location of each client and statistics, as detailed as you can get, about its power source. As this information, obtained from national power operators, changes (for example with wind availability), you could re-run sections of the WU distribution across the aggregation buckets based on how clean each client's power is.

You could also associate each client with the electricity price in its country, which would in some ways let you know which clients would be the cheapest to run, balancing cheapness against CO2.
You could then integrate this at a high level with OpenADR and PLMA utility demand-response programs. That would partly be the start of everything being run remotely, though under a different set of constraints; later, everything would take into account the topology of the grid network, periodic deadlines, time availability and usage patterns, all run in the cloud.
So these would be possible first attempts, but things are going to have to move at this pace anyway for EVs and the rest of the distributed feedback network, on both the utility and distributed-power sides.
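
As a toy illustration of balancing electricity price against carbon intensity when ranking clients (all figures and weights are invented):

Code: Select all

# Toy sketch: rank clients by a weighted blend of electricity price
# and grid carbon intensity. All figures and weights are invented.

clients = {
    # name: (price in $/kWh, carbon intensity in gCO2/kWh)
    "client_nz": (0.20, 110),
    "client_de": (0.35, 350),
    "client_pl": (0.18, 650),
}

def score(price, carbon, w_price=0.5, w_carbon=0.5):
    # Normalize against rough reference values so the weights are comparable.
    return w_price * (price / 0.40) + w_carbon * (carbon / 800)

ranked = sorted(clients, key=lambda c: score(*clients[c]))
print(ranked)  # cleanest-and-cheapest first, e.g. ['client_nz', ...]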

The other option is for the client to indicate that it is off-grid and solar-powered, in which case you probably don't care, as long as it runs off solar permanently.

Eventually, this turns into a nice little AI problem.
PantherX
Site Moderator
Posts: 7020
Joined: Wed Dec 23, 2009 9:33 am
Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB

Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
Location: Land Of The Long White Cloud

Re: Algorithm for distributing Work Units, statistical model

Post by PantherX »

Welcome to the F@H Forum wesley_olis,
wesley_olis wrote:...Could we decide the priority ourselves, and whenever there is nothing left in our preferred categories, pick the next best work unit from any other category? Maybe this could be an optional checkbox feature.

Wasting the computational resources at your disposal obviously slows progress, especially if there are work units in other major projects that one is not currently supporting; it would be nice to determine the specific ordering of the projects one supports...
With the latest version of the client (7.6.9), you can set your preference for supporting a project, but if no WU is available for it, you will be assigned the next best one based on your hardware, regardless of your preference. I guess your idea is to have a ranked system that the client would go through, selecting from an ordered list. It's a neat idea, but that is up to the F@H Team to decide. Do note that F@H doesn't distribute "busy work"; it strives to ensure that each WU is individual (with some exceptions) and that all the scientific work is important.
wesley_olis wrote:...Could someone also explain the progress of the projects on the project stats page...
I am not sure what you mean by "progress of the projects on the project stats page", as that is a list of all active Projects (https://apps.foldingathome.org/psummary). A project doesn't have a "percentage completed": projects generate statistical data and can run for days, weeks, months, even years, and still produce new data for researchers to analyze. It is the researcher who decides when there is enough data for their next step.
wesley_olis wrote:...Then I guess the other thing, hopefully already done in work unit assignment (I don't know the details), is that machines with six-sigma reliability of returning completed work, at the highest computation rates, are assigned high-priority workloads first, followed by those lower down. Ideally, you are looking to minimize the finishing time of all the workloads. You don't want one client to hold a workload that takes an additional 8 days to complete: if faster clients can statistically work off a set of work units in an expected 4 days by processing 8 work units, then there is no point giving a work unit to someone that would take 8 days; it just means you are wasting computation and slowing down the rate at which the project is completed...
The researchers set the Timeout date and Expiration date. They are the people best placed to decide how long a Project is expected to run for. As long as a WU is successfully folded before the Expiration date, it will be performing important scientific work. If a WU has reached the Timeout date, a copy will be sent out. When a WU has reached the Expiration date, the client will automatically delete it. Generally speaking, there is a huge range of devices that F@H supports, to allow a diverse and inclusive range of Donors.
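
As a simplified illustration of the Timeout/Expiration behaviour described above (not the actual F@H server or client code):

Code: Select all

# Simplified sketch of the Timeout / Expiration behaviour described above.
# This is an illustration only, not the actual F@H server or client logic.
from datetime import datetime, timedelta

def wu_action(assigned_at, timeout, expiration, now, returned):
    """Decide what happens to a WU copy at time `now`."""
    if returned:
        return "credit result"                    # folded before expiration
    if now >= assigned_at + expiration:
        return "client deletes WU"                # past Expiration date
    if now >= assigned_at + timeout:
        return "reissue copy to another client"   # past Timeout date
    return "keep waiting"

assigned = datetime(2020, 5, 1)
print(wu_action(assigned, timedelta(days=3), timedelta(days=7),
                datetime(2020, 5, 5), returned=False))  # reissue copy to another client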
wesley_olis wrote:...I guess you could have an optimization algorithm running over the permutations of the set of all clients and configurations, which is then simplified into common aggregation buckets of WU rate, sigma (reliability), and work-rate min, max, average and standard deviation. Then, of all the WUs available, assign them to the different work-rate aggregation buckets, or assign each at most to the slowest acceptable work-rate bucket, since anything less than that just wastes resources and creates additional waiting time, as explained above. Every time new projects are added, re-run this permutation algorithm to update the distribution of project units across the work-rate aggregation buckets, ready for when WUs are requested. Basically, match resources to projects based on how reliably and efficiently they can complete the project in the least amount of time...
To take this further, each client should report the typical maximum, average and minimum power consumption of its chip, averaged across the cores in use, which is then also taken into account in the aggregation buckets. You could then compute two optimization dimensions: maximum horsepower, and the power consumption of the network, or strike a balance between the two, up to where the curve flattens out. The less CO2 emitted, the more you are saving the world...
Having this information about the client machines, and the computation WU rate per unit of power consumed, you can then basically find the sweet spot that balances power and performance across the network.
In the long run, what this naturally results in is that old, slow, power-inefficient clients will no longer get work, as they can no longer meet the baseline WU rate per unit of power consumption, which means they would just be doing environmental damage...
You could also associate each client with the electricity price in its country, which would in some ways let you know which clients would be the cheapest to run, balancing cheapness against CO2...
While it is a cool idea to make the most efficient use of hardware, there are some limitations, like:
1) The benchmark run will be meaningless if the hardware being used is non-dedicated.
2) How will the benchmark take into account that I leave my system on for X hours a day, where X changes and there's no pattern?
3) Different countries have different electrical systems, and getting that information might not be easy. What happens if you can't get the information, or it is incorrect?
4) Have you considered that I might be using old hardware purposely for personal heating reasons? Why run a heater when I have quad Fermi GPUs in my room folding?
5) Who will fund the extensive R&D for the initial setup and then the maintenance? Remember that F@H has massive resource constraints (developers and costs).
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time

Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues