Multiple runs of same WU

Posted: Sun Jan 17, 2021 8:55 am
by SilvioMartin
I noticed that several WUs are run multiple times. This one, for example, was run 5 times, and all runs are reported as OK:

https://apps.foldingathome.org/wu?p=169 ... ne=9&gen=0

At first I thought this might be done to make sure that WUs are processed in time (the F@H folks may have statistics on how many attempts fail), but here the last run was assigned almost 5 days after the first one finished, so that is probably not the reason. Another idea is that the results of a single run may not be reliable and need to be confirmed by more runs. But there are also WUs that are run only once, e.g. this one from October:

https://apps.foldingathome.org/wu?p=168 ... 752&gen=46

So my question is: Why are some WUs run multiple times and some are not?

Re: Multiple runs of same WU

Posted: Sun Jan 17, 2021 10:23 am
by PantherX
Generally speaking, a WU is initially assigned only once. It is re-assigned one more time if the server doesn't get it back before the Timeout.
If a WU returns an error, it is reassigned twice. This continues until the WU has either received 3 failures or been assigned 5 times, at which point it is automatically stopped from future assignment.
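
The rules above can be sketched roughly in Python. This is only one reading of the description, not the actual Work Server code: the names (WorkUnit, copies_to_reassign, the "ok"/"timeout"/"error" labels) are made up for illustration, and treating "reassigned twice" as two new assignments per error is an assumption.

```python
from dataclasses import dataclass

# Limits taken from the post above.
MAX_FAILURES = 3      # stop after 3 failures
MAX_ASSIGNMENTS = 5   # stop after 5 assignments


@dataclass
class WorkUnit:
    """Hypothetical bookkeeping record for one WU (not a real F@H type)."""
    assignments: int = 1   # a WU starts out assigned once
    failures: int = 0
    stopped: bool = False  # True once no further assignment is allowed


def copies_to_reassign(wu: WorkUnit, outcome: str) -> int:
    """Return how many new assignments to hand out after one outcome.

    outcome is "ok", "timeout" or "error" (illustrative labels only).
    """
    if wu.stopped or outcome == "ok":
        return 0
    if outcome == "error":
        wu.failures += 1
        wanted = 2  # assumption: "reassigned twice" means two new assignments
    elif outcome == "timeout":
        wanted = 1  # one more assignment after a missed Timeout
    else:
        raise ValueError(f"unknown outcome: {outcome!r}")

    if wu.failures >= MAX_FAILURES:
        wu.stopped = True
        return 0

    # Never exceed the overall assignment cap.
    granted = min(wanted, MAX_ASSIGNMENTS - wu.assignments)
    wu.assignments += granted
    if wu.assignments >= MAX_ASSIGNMENTS:
        wu.stopped = True
    return granted
```

Under this reading, a WU that times out once and then errors twice ends up at the cap of 5 assignments and is stopped, which would look like the "run 5 times" case in the first post if the late copies all came back OK.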

Occasionally, a researcher may have to "restart" a Project but that generally happens in the initial phase (internal/beta) and doesn't normally happen once it is in full release.
Every now and then, a completed WU is returned to the CS (Collection Server) instead of the WS (Work Server). Due to technical issues, the CS doesn't inform the WS, so the WS thinks the WU has been lost and reassigns it.

I think the Server hosting Project Series 16900 had some technical issues recently, which could explain why some of the WUs were re-assigned.

Re: Multiple runs of same WU

Posted: Sun Jan 17, 2021 3:46 pm
by SilvioMartin
Makes sense. Thanks for the detailed answer.

Re: Multiple runs of same WU

Posted: Sun Jan 17, 2021 9:45 pm
by bruce
SilvioMartin wrote: I noted that several WUs are run multiple times. This one e.g. was run 5 times and all runs are reported to be Ok:
In addition to what PantherX has already said, there's a new assignment concept being evaluated/developed where the timeout adjusts itself to improve the rate of progress of the trajectory. (It will be officially announced if and when it can be tweaked to work as desired.) The fact that several of those runs received only the baseline points (zero bonus) indicates that some of the hardware was especially slow, while other hardware completed the same WU much faster.

There is a point at which slow hardware can delay the scientific progress rather than contributing to the overall science. Finding that point and managing that effect is the project's goal. Inasmuch as there is an overall shortage of WUs right now, it's a good time for experimenting with new ideas.