Thoughts on optimizing the assignment process

ajm · Post by **ajm** » Mon Mar 22, 2021 7:23 am

FAH doesn't need benchmarks of my system. Except at the very beginning, it already has detailed real-world folding results; it just needs to use them when assigning WUs. The results could be computed locally and grouped with the standard "System" values that FAH already uses for assigning WUs, then updated, for example, after each ACK message. A system that works very regularly would thus "enjoy" an optimized assignment rate and selection of WUs, hence more points, which would encourage users to keep their systems very steady.
For users who don't bother, assignments would be more random, with less-than-optimal results (and points).
EDIT: And "steady" doesn't have to be 24/7. If a user announces that his system(s) will be available, say from 9pm to 7am from Monday to Friday and 24/7 on Saturday and Sunday, the assignments would be optimized for those periods, so that the last one ends shortly before the user's deadline, and the new one starts right on time.
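
A minimal sketch of the local bookkeeping proposed in this post, assuming the client keeps one running average of time per frame (TPF) per slot and refreshes it after each acknowledged result. `LocalPerfStats`, its fields, and `summary_for_request` are hypothetical names for illustration; nothing here reflects the actual FAHClient.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LocalPerfStats:
    """Hypothetical per-slot record kept on the donor's machine."""

    alpha: float = 0.2                      # weight given to the newest result
    avg_tpf_seconds: Optional[float] = None  # running average of time per frame
    completed_wus: int = 0

    def record_completed_wu(self, tpf_seconds: float) -> None:
        """Update the running average once a result has been acknowledged."""
        if self.avg_tpf_seconds is None:
            # First result: nothing to blend with yet.
            self.avg_tpf_seconds = tpf_seconds
        else:
            # Exponential moving average: recent WUs count more than old ones.
            self.avg_tpf_seconds = (
                self.alpha * tpf_seconds
                + (1 - self.alpha) * self.avg_tpf_seconds
            )
        self.completed_wus += 1

    def summary_for_request(self) -> dict:
        """Small summary that could accompany the next work request."""
        return {
            "avg_tpf_seconds": self.avg_tpf_seconds,
            "completed_wus": self.completed_wus,
        }
```

Since TPF varies from project to project, a single average per slot is crude; a real design would probably need one figure per project or per core type.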

Power users should have the option to designate swapping spaces, on their system/network or not (some people could offer them for the community), where their checkpoints would be systematically copied (and erased as soon as one or two new ones are in and/or after the corresponding ACK message). In case of crash or emergency, the initiated WUs would be transferred to another adequate machine (locally or not) and the WS or CS would be informed of the change by the involved instances of FAH (but wouldn't have to handle the data itself). This would save quite a bunch of WUs.
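
The retention rule for such a swapping space (erase older checkpoints once one or two newer ones are in) could be sketched as below. The function name, the `(sequence, name)` representation, and the default of keeping two checkpoints are assumptions made for illustration.

```python
from typing import List, Tuple


def checkpoints_to_erase(
    stored: List[Tuple[int, str]], keep_newest: int = 2
) -> List[str]:
    """Return the names of checkpoints that can be erased.

    `stored` holds (sequence_number, name) pairs for one WU; a higher
    sequence number means a more recent checkpoint.
    """
    ordered = sorted(stored)  # oldest first
    # Everything except the `keep_newest` most recent entries is erasable.
    cutoff = max(len(ordered) - max(keep_newest, 0), 0)
    return [name for _, name in ordered[:cutoff]]
```

Once the corresponding ACK arrives, the same function called with `keep_newest=0` lists every remaining checkpoint for deletion.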

For the trajectories, we need something working almost in real time that could say how many "reliable" (steady) systems are available now and, using stats, how many should be available at a defined point in the future (of course, the estimate gets fuzzier the further away that future is). It could be a separate system based on browsers. Of course, not every user would have it enabled, so it would actually give a minimal reliable value, which the researchers could exceed at their own risk.

gunnarre · Post by **gunnarre** » Mon Mar 22, 2021 8:49 am

I think the viable ideas here are that of the client benchmarking the system the first time (and periodically in case the configuration changes), and also learning when the machine is usually active over time. If the client knows that it's for example usually only active 8 hours per day, then it might be assigned either shorter WUs which it can comfortably complete inside 8 hours, or longer WUs with multi-day expiry times.
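
A rough sketch of that check, assuming the machine starts the WU at the beginning of an active session and is active for the same number of hours every day. `CandidateWU`, its fields, and both function names are hypothetical and only serve the illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateWU:
    name: str
    est_hours: float      # estimated compute time on this machine
    deadline_days: float  # wall-clock time until the WU expires


def compute_hours_before_deadline(
    deadline_days: float, active_hours_per_day: float
) -> float:
    """Hours of actual folding available before the deadline.

    Pattern assumed: active for `active_hours_per_day`, idle for the rest
    of the day, repeating, with the WU starting at the top of a session.
    """
    deadline_hours = deadline_days * 24
    full_days = math.floor(deadline_hours / 24)
    remainder = deadline_hours - full_days * 24
    return full_days * active_hours_per_day + min(remainder, active_hours_per_day)


def fits_schedule(wu: CandidateWU, active_hours_per_day: float) -> bool:
    """True if the WU can finish before it expires on this schedule."""
    available = compute_hours_before_deadline(wu.deadline_days, active_hours_per_day)
    return wu.est_hours <= available
```

With 8 active hours per day, a 6-hour WU with a one-day deadline fits, and so does a 20-hour WU with a three-day expiry, while a 10-hour WU with a one-day deadline does not.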

Announcing folding schedules in advance and moving WUs between nodes sounds too complicated for 99% of users. That's more something you do if you sell your computing capacity in a cloud market or are managing a render farm. F@H should be almost set-and-forget, except that you see your points rolling in.

If you're folding in a virtual machine, you can already move the folding instance between hardware seamlessly, so there's no need to add that functionality to F@H, I think.

PS: Should we re-title and/or split the thread?

Edit by Mod: Just what I was planning to do.

ajm · Post by **ajm** » Mon Mar 22, 2021 9:36 am

gunnarre wrote:Announcing folding schedules in advance and moving WUs between nodes sounds too complicated for 99% of users.

Certainly so, but that one percent would surely include the users who produce far more than 1% of the points. It would be interesting for the few hundred people who are producing (tens of) millions of points per day, and the gain in productivity would largely exceed their representation.

EDIT: Besides, moving WUs between nodes could be as simple as a check box (possibly active by default) for those who merely use the function. It would be complicated only for (FAH and) the users who want to set up such swapping space. Standard users won't be bothered at all.

Post by **bruce** » Mon Mar 22, 2021 7:06 pm

When making an assignment, the Assignment server and Work Servers do not use information stored about you. There is no database describing each slot of each user/donor. FAHClient uploads a brief description of your hardware/software when it requests a new assignment (including information like number of CPU slots requesting work or GPUSpecies). The assignment process does not do a search of your stats so that information is not available and getting it would certainly slow down the assignment process significantly.

Tracking individual Donors is something FAH simply does not do. That's consistent with the current randomized assignment process. The 1% do NOT get special treatment, nor are they tracked.

If your system reports a RPi or an Intel iGP, that information can be used to give you an assignment with a long deadline ... but that would be a future development. Personally, I think that would be a good/simple idea, but even that probably can be expected to be unable to get a priority bump in the enhancement queue.

iero · Post by **iero** » Mon Mar 22, 2021 7:12 pm

*Kinda off topic, but I didn't want to make a new thread just to ask; I think I have become an annoyance for long-time members of the forum.* Is there any use for the data that one collects with the HFM.NET app? Is it uploaded automatically online? Can I manually upload it somewhere? Is it even worth it?

gunnarre · Post by **gunnarre** » Mon Mar 22, 2021 7:27 pm

We are folding enthusiasts like you, and you're not annoying. Here's someone asking for HFM.NET data to be uploaded: viewtopic.php?f=38&t=34510

If you have a website, you can use HFM.NET to show your statistics on that website, like this: http://www.miketimbers.com/hfm/summary.html (Howto: viewtopic.php?p=328337#p328337)

Otherwise there's no public database for HFM.NET data that I know of - you can look at https://folding.lar.systems for statistics collected from the Chrome dark mode folding extension.

Post by **bruce** » Mon Mar 22, 2021 9:42 pm

HFM is not part of FAH. The use of its data is subject to the restrictions the author places on his software.

FAH is not going to build its distribution system on the future performance of a 3rd party. FAH's ownership of FAH's data is based on its agreement to place FAH's data in the public domain.

ajm · Post by **ajm** » Tue Mar 23, 2021 7:01 am

bruce wrote:When making an assignment, the Assignment server and Work Servers do not use information stored about you. There is no database describing each slot of each user/donor. FAHClient uploads a brief description of your hardware/software when it requests a new assignment (including information like number of CPU slots requesting work or GPUSpecies). The assignment process does not do a search of your stats so that information is not available and getting it would certainly slow down the assignment process significantly.

Indeed, and it should stay that way. If optimizing the assignment process requires a database, this DB should be built in the client, that is, on the user's system, and only the information necessary for the assignment would be transmitted to FAH, just like today, but more precise, more detailed, and maybe more up-to-date.
FAH would not track that info, just use it. But think of what HFM or LAR_Systems could do with it, on a purely voluntary basis.

bruce wrote:Tracking individual Donors is something FAH simply does not do. That's consistent with the current randomized assignment process. The 1% do NOT get special treatment, nor are they tracked.

I don't think that the 1% would want or accept a "special treatment," but these people have shown that they are ready to go the extra mile (or the extra kW) and would surely appreciate doing even more, if they can and if the system stays in their hands. Right now, the only ways I'm aware of are to participate in this forum or on Discord, and to become a beta tester. Offering swapping space to facilitate the transfer of checkpoints in their region (without congesting the servers) would be another option, more in line, it seems to me, with the general mindset of big producers. This would optimize the functioning of FAH as a whole, not only that of their own kits. Such ideas would appeal mainly to the 1%, but would benefit the whole FAH project.

In the same vein, people controlling whole farms would surely appreciate add-ons that help them manage FAH on many kits at once. It could become a contest, with interested people developing a solution that FAH would evaluate, then accompany to completion, and maybe, eventually, officially approve or even integrate into its standard releases.

bruce wrote:If your system reports a RPi or an Intel iGP, that information can be used to give you an assignment with a long deadline ... but that would be a future development. Personally, I think that would be a good/simple idea, but even that probably can be expected to be unable to get a priority bump in the enhancement queue.

This is too slow and random. We should brainstorm and try to delineate ways to speed up this process, to find partners and backers. This should start, I think, by better communicating the needs and the goals, the vision of FAH, going forward. There should be some sort of public roadmap, something people can hold on to, that they know is a present concern, around which their input and help would be appreciated. And there should be real follow-ups on this. After a while, some larger projects would emerge that would interest journalists and vloggers, and the boat would start to get some sails.

HaloJones · Post by **HaloJones** » Tue Mar 23, 2021 10:10 am

To the OP's point, I hope that FAH is never pre-installed on a private computer or any such software. While the "project" might benefit, I will always apply 80/20 to such things. 80% of the benefit of FAH likely comes from the 20% of hardcore folders who run optimised kit 24/7 that therefore return results reliably and quickly.

I have some faith that FAH is worth the energy expenditure. But it's a constant worry that we're all trying to cure diseases to protect a species that is in desperate need of some kind of population constraint.

ajm · Post by **ajm** » Tue Mar 23, 2021 10:24 am

HaloJones wrote:I have some faith that FAH is worth the energy expenditure. But it's a constant worry that we're all trying to cure diseases to protect a species that is in desperate need of some kind of population constraint.

Not really, the curve is clearly flattening: https://population.un.org/wpp/Graphs/Pr ... OP/TOT/900 and the population is falling in some 55 countries (by at least 10% in 26 countries, and even 15% in a few). And curing diseases is not just about keeping people alive; it also raises the quality, and thus the value, of life.

EDIT: The world has changed a lot since the 1970s, but many of us didn't notice, not even in the field of education: https://www.amazon.com/dp/B07BFDCWZP?pl ... vxx_0_6_im

ComputerUser · Post by **ComputerUser** » Tue Mar 23, 2021 12:27 pm

I just wonder if there is any announcement of the minimum system requirements for CPU folding? In recent years I always thought FAH was meant to make use of (nearly) any hardware you already have and did not require the latest and greatest high-end Threadripper, so I was confident that running an i5-4590 24/7 on all 4 threads was adequate (doing between 30k and 45k PPD). An hour ago this box got a P16815 WU with a timeout of only 0.23 days or 5.5 hours, but given the TPF of 4:01 min it will not finish before the timeout. I'll let it run anyway, but some kind of performance index used during assignment would make sense. There are probably some high-end CPUs out there where a CPU:4 slot would be fast enough, but what are they expecting?
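
The arithmetic behind that estimate can be checked quickly, assuming TPF means time per frame and the WU is reported in 100 frames of 1% each (a common convention, but not stated in the post). The function name is made up for the example.

```python
def projected_hours(tpf_seconds: float, frames: int = 100) -> float:
    """Total run time in hours if every frame takes `tpf_seconds`.

    The default of 100 frames per WU is an assumption, not a value
    taken from the project description.
    """
    return tpf_seconds * frames / 3600


tpf_seconds = 4 * 60 + 1       # TPF of 4:01 min
timeout_hours = 0.23 * 24      # timeout of 0.23 days

needed_hours = projected_hours(tpf_seconds)
will_finish_in_time = needed_hours <= timeout_hours

# Slowest TPF (in seconds) that would still meet the timeout.
max_tpf_seconds = timeout_hours * 3600 / 100
```

Under that assumption the WU needs about 6.7 hours against a timeout of about 5.5 hours, so the TPF would have to drop to roughly 3:19 min to finish in time.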

By the way, I found the thread of this project in the beta forum and they are talking about a 'dynamic timeout' - does anybody know some details on how that timeout is calculated? It seems to change constantly - now psummary shows 0.24.

Cheers,
Computeruser

Post by **bruce** » Wed Mar 24, 2021 12:00 am

FAH's minimum requirement is CPUs and modern GPUs on Windows/Linux/MacOS with FP64 (Double Precision) hardware support.

FAH can support x86 CPU folding with any combination of x86 threads that support SSE2 or better. You don't need a threadripper-- but if you do have one, it should be given more challenging WUs than are assigned to weaker devices. The project owner can manage the number of threads that his project can be assigned to. If, say, he decides on CPU Threads <=12 (or some other number), you might not be assigned his project if you offer a 30-thread device. Most of the time some thread counts are excluded because GROMACS has a high failure rate for certain numbers, and FAHCore_a* doesn't give you a way to change that number if it crashes after the WU starts.
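
As an illustration only, the thread-count constraint described here could look like the check below. The function name, its parameters, and the excluded values used in the example are hypothetical; the real assignment logic and the actual list of problematic thread counts are not given in this thread.

```python
from typing import AbstractSet


def project_accepts(
    offered_threads: int,
    max_threads: int,
    excluded_counts: AbstractSet[int] = frozenset(),
) -> bool:
    """True if a slot offering `offered_threads` may receive the project.

    `max_threads` stands for the owner's limit (e.g. CPU Threads <= 12);
    `excluded_counts` stands for thread counts the project avoids.
    """
    if offered_threads < 1 or offered_threads > max_threads:
        return False
    return offered_threads not in excluded_counts
```

With a limit of 12 threads, a 4-thread slot qualifies and a 30-thread slot does not; a count listed in `excluded_counts` is rejected even when it is under the limit.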

Supported GPUs are those that have been validated, restricted to ones with certain GPU_Species as listed in GPUs.txt. (That number is validated by the server so don't try editing that file.) Those species assignments are old and a rather crude method of control. The plan is to optimize those values based on benchmarks instead of hardware generations. Intel iGPs have been added but I wouldn't call them supported yet. Plans for them are still being developed.

ARM devices are not supported (yet?), though some have tried them with varying degrees of success. FAH has not announced code for ARM devices so they tend to work in emulation mode.
