
Options for running one job across multiple clustered Pi's?

Posted: Fri Sep 10, 2021 6:53 pm
by tvmwd6
I'm working on a Pi cluster project and trying to figure out the most effective way to use it to run f@h.

I have completed and submitted a WU from a single Pi 3B+ in around 23 hrs. Now I'd like to try distributing a WU across multiple Pi's to reduce the time to completion. I haven't found a conclusive answer on whether f@h is MPI-aware; if it is, that would be simple enough. Otherwise, what options are available to me? Has anyone done this with non-ARM f@h?

Any insight is appreciated.

Re: Options for running one job across multiple clustered Pi

Posted: Fri Sep 10, 2021 7:31 pm
by MeeLee
Fah doesn't work like this.
If you spread a WU across multiple units, data from one thread has to move over an immensely slow network connection to another unit.
Currently this data is shared among CPU cores through very fast cache and fast RAM.
You'd be slowing the WU down by something like a factor of 100.
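
To make the point above concrete, here is a rough back-of-the-envelope sketch in Python. It is only a toy model, not how FAH measures anything: it assumes each simulation step costs some local compute time plus a number of data exchanges that each wait on a fixed latency. The function names and every number in the example call are made-up placeholders, not measurements of a Pi or of any network.

```python
def step_time(compute_s, exchanges_per_step, latency_s):
    """Time for one step: local compute plus time spent waiting on data exchanges."""
    return compute_s + exchanges_per_step * latency_s


def slowdown(compute_s, exchanges_per_step, local_latency_s, network_latency_s):
    """Ratio of step time over a network to step time within one machine."""
    local = step_time(compute_s, exchanges_per_step, local_latency_s)
    networked = step_time(compute_s, exchanges_per_step, network_latency_s)
    return networked / local


# Placeholder values for illustration only (assumptions, not benchmarks).
ratio = slowdown(
    compute_s=1e-3,          # assumed compute time per step
    exchanges_per_step=10,   # assumed number of exchanges per step
    local_latency_s=1e-7,    # assumed latency through shared cache/RAM
    network_latency_s=1e-4,  # assumed latency over the cluster network
)
print(f"networked step is about {ratio:.1f}x slower in this toy model")
```

The actual ratio depends entirely on how much data has to be exchanged per step and how slow the link is; the more exchanges per step, the more the network latency dominates.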

The best option is to just load more WUs.
That way each WU will still process in 23 hours, but you now have multiple units each finishing one in that time frame.
The best way to do that is to have an OS on each unit and install FAH on each one with your username and password.
And let them rip!

I use a cluster in that manner myself (2 towers of 20 units). Just x86 boards, not ARM boards, although I do have a 20-unit ARM cluster as well.

Re: Options for running one job across multiple clustered Pi's?

Posted: Fri Feb 03, 2023 5:18 pm
by ETA_2025
It's more than a year since it became possible to do F@H on ARM hardware, and yet there's no ability to process a WU on a cluster of ARM hardware, to enable more projects to be completed on ARM hardware.

Is there a fundamental technical limitation to processing one WU in parallel across multiple Pi's, such as a WU having to be processed serially so that it can't be broken into smaller chunks? Or is it a case of new software needing to be created to allow it?

Re: Options for running one job across multiple clustered Pi's?

Posted: Fri Feb 03, 2023 10:54 pm
by JimboPalmer
No, just the dreadfully slow data transfer.

Re: Options for running one job across multiple clustered Pi's?

Posted: Mon Feb 06, 2023 4:01 am
by Joe_H
JimboPalmer is correct. No new software would be needed to split up processing over multiple networked systems, but the rate of data transfer over that network would be much too slow.

For each step in the processing of a WU, the forces between all the atoms need to be calculated. With a single thread this is done in one pass through the system. As threads are added, the system is decomposed into separate slices, and each thread calculates the forces between the atoms in its own slice. Then the forces between the atoms in each slice and those in adjacent slices need to be calculated. Determining those inter-slice forces is where a large amount of data transfer occurs. It is also part of the reason behind the adoption of the QRB years ago.

Those additional inter-slice computations add overhead. Thus a single thread will take a certain amount of time to complete operations, two threads will take a bit more than half that time, and so on. Early on, some found they could make more points doing two or more separate WUs than a single WU over multiple processors. But that meant all those single-thread WUs took longer overall and delayed creation and processing of the next generation WU for that project's Run and Clone.
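
As a minimal sketch of the slicing described above (a toy, not the actual FAH core code), the Python below scatters atoms along a one-dimensional box, cuts the box into equal slices, and counts how many interacting pairs sit inside one slice versus how many span two slices. The function name `count_pairs`, the box size, the cutoff distance, and the atom count are all assumptions made up for illustration; a real simulation is three-dimensional and far more involved.

```python
import random


def count_pairs(positions, n_slices, box, cutoff):
    """Return (intra, inter): interacting pairs within one slice vs. spanning slices."""
    width = box / n_slices
    # Assign each atom to a slice; clamp so an atom exactly at the box edge stays in range.
    slice_of = [min(int(x / width), n_slices - 1) for x in positions]
    intra = inter = 0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            # Only atoms closer than the cutoff are treated as interacting.
            if abs(positions[i] - positions[j]) <= cutoff:
                if slice_of[i] == slice_of[j]:
                    intra += 1  # both atoms belong to the same thread
                else:
                    inter += 1  # needs data from a neighbouring slice
    return intra, inter


random.seed(0)
box, cutoff = 100.0, 2.0  # hypothetical box length and interaction cutoff
atoms = [random.uniform(0, box) for _ in range(500)]

for n in (1, 2, 4, 8):
    intra, inter = count_pairs(atoms, n, box, cutoff)
    print(f"{n} slice(s): {intra} pairs inside slices, {inter} pairs across slices")
```

The total number of interacting pairs stays the same however the box is cut; what changes is how many of them cross a slice boundary, and each of those is a pair whose data has to travel between threads. Within one machine that travel goes through cache and RAM; across a cluster it would go over the network.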