
Folding@home cluster?

Posted: Thu Apr 26, 2012 6:42 am
by jupiter
Has anyone considered writing software to cluster two or more home PCs to work on one project at a time? Pros and cons?

Older PCs clustered could gain a new lease on life.
Newer PCs clustered could get the bigger work units done more quickly.

Thoughts?

Re: Folding@home cluster?

Posted: Thu Apr 26, 2012 6:52 am
by iceman1992
Good idea. I'm not sure that's feasible, though. It would be awesome if it happened; I have several PCs at home that are too slow by themselves.

Re: Folding@home cluster?

Posted: Thu Apr 26, 2012 6:56 am
by Jesse_V
From what I've read, I get the impression that F@h will do best if it runs on each individual machine.

Doing a forum search for "cluster", I found this related thread wherein the admin Bruce provided some useful answers: viewtopic.php?f=61&t=20917

Re: Folding@home cluster?

Posted: Thu Apr 26, 2012 7:14 am
by bruce
I've provided comments about clusters in several different topics.

As I said there, FAH only runs on individual machines. Any of the machines that you might harness together in a cluster is probably fast enough to run the Uniprocessor client, but each machine will be working on a different WU. My laptop is an old dual-core machine and it's just fast enough to meet the dual-core deadline for SMP, but there's no way to cluster it with other machines. If software were available to do that, it would spend most of its time waiting for data to move across whatever connection I would have between the machines. FAH moves massive amounts of data between the threads and you wouldn't be able to keep multiple nodes busy. The SMP client REQUIRES memory-to-memory data transfers, not something slow like a network to interconnect the nodes. It's only efficient if the machines are working on separate WUs.

Re: Folding@home cluster?

Posted: Thu Apr 26, 2012 7:50 am
by iceman1992
bruce wrote:My laptop is an old dual-core machine and it's just fast enough to meet the dual-core deadline for SMP, but there's no way to cluster it with other machines.
Dual-core deadline? Are the deadlines different for different numbers of cores?

bruce wrote:If software were available to do that, it would spend most of its time waiting for data to move across whatever connection I would have between the machines. FAH moves massive amounts of data between the threads and you wouldn't be able to keep multiple nodes busy.
Very good point. It's the same core problem supercomputers face: the interconnect between nodes needs to be very fast.

Re: Folding@home cluster?

Posted: Fri Apr 27, 2012 1:01 am
by jupiter
Thank you Bruce. That puts that issue to rest in my mind.

Re: Folding@home cluster?

Posted: Fri Apr 27, 2012 1:27 am
by bruce
iceman1992 wrote:
bruce wrote:My laptop is an old dual-core machine and it's just fast enough to meet the dual-core deadline for SMP, but there's no way to cluster it with other machines.
Dual-core deadline? Are the deadlines different for different numbers of cores?


Not for the same project, but there are projects designed for slower and faster machines. The options that you can choose are uniprocessor (longest deadlines for the slowest computers), SMP (shorter deadlines for everyday computers), and bigadv/advanced (shortest deadlines for the top several percent of server-class hardware). Points vary based on those classifications, too.

The client doesn't actually measure the GHz of the computer but it does report the number of cores. The assignment process can't actually take the overall speed of your hardware into consideration when making assignments, but it may use the core count as an approximation of overall speed. There used to be SMP projects that were assigned to machines with 2 cores and others which required more (e.g., 4 or 6 or 8).

On the serverstat page, http://fah-web.stanford.edu/serverstat.html, you'll find some data in the columns called smp cores and min smp. The idea is that when the assignment process has a choice of projects, it should give me the best fit for my hardware by excluding any projects that my dual-core machine wouldn't be able to complete.
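
A minimal sketch of that filtering idea, in Python. The project numbers and the `min_smp` field are hypothetical stand-ins for the "min smp" column on the serverstat page; the real assignment logic is not described in this thread and may well differ.

```python
# Hypothetical project records; the numbers are made up for illustration.
PROJECTS = [
    {"project": 1001, "min_smp": 2},
    {"project": 1002, "min_smp": 4},
    {"project": 1003, "min_smp": 8},
]


def eligible_projects(projects, client_cores):
    """Keep only projects whose minimum core count the client can meet."""
    return [p for p in projects if p["min_smp"] <= client_cores]


if __name__ == "__main__":
    # A dual-core machine is only offered the projects it could plausibly finish.
    for p in eligible_projects(PROJECTS, client_cores=2):
        print(p["project"])
```

Under this assumption, core count acts purely as a gate: it excludes projects rather than ranking the ones that remain.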

Re: Folding@home cluster?

Posted: Fri Apr 27, 2012 4:32 am
by RLHay
Bruce, when a work unit is returned, the elapsed time is obviously returned as well to calculate bonus points. Could TPF (or some other suitable parameter) also be returned, to assist in fine-tuning projects that are pushed out?

Re: Folding@home cluster?

Posted: Fri Apr 27, 2012 5:10 am
by bruce
I'm not sure what data you're looking for. If a machine runs 24x7, the TPF is known from the time it took to complete the WU. Knowing both the TPF and the time from download to upload, the server could calculate the percentage of time the machine was shut down or the time FAH was down while gaming or the time that the WU tried to upload but failed. I don't see that the WU should be tuned to account for those things.

Elapsed_time = Upload_time - Download_time.

Elapsed_time / 100 = TPF.
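
The two formulas above can be sketched in Python as follows. This is only an illustration: the function names are made up, times are plain seconds, and the 100-frame count is taken from the formula above. The second function shows the calculation mentioned earlier in this post, estimating how much of the download-to-upload window the machine was not folding, given a TPF measured while it was actually running.

```python
def implied_tpf(download_time, upload_time, frames=100):
    """TPF implied by wall-clock time; only the true TPF if the machine ran 24x7."""
    elapsed = upload_time - download_time
    return elapsed / frames


def idle_fraction(download_time, upload_time, running_tpf, frames=100):
    """Fraction of the download-to-upload window not spent folding."""
    elapsed = upload_time - download_time
    busy = running_tpf * frames
    # Clamp at zero in case the measured TPF slightly overshoots.
    return max(0.0, 1.0 - busy / elapsed)


if __name__ == "__main__":
    # A WU held for 20 hours that folds at 6 minutes per frame when running.
    print(implied_tpf(0, 72000))
    print(idle_fraction(0, 72000, running_tpf=360.0))
```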

Re: Folding@home cluster?

Posted: Thu May 03, 2012 2:32 am
by iceman1992
Hey bruce, I just saw this on the SMP FAQ, http://folding.stanford.edu/English/FAQ-SMP#ntoc19
Does that mean that FAH could support multi-box clusters?

That's on our mind, but we want to try to get SMP working smoothly before going to far in that direction.
Then a FAH cluster is still possible?

And btw I think that's a typo, should be "going too far"

Re: Folding@home cluster?

Posted: Thu May 03, 2012 2:45 am
by Zagen30
I'm pretty sure that entry is from several years ago, when they were first introducing SMP; SMP has been working pretty smoothly for some time from everything I've seen. That doesn't mean that cluster support will never ever happen, but I don't believe it's anywhere near the top of the list of priorities.

Re: Folding@home cluster?

Posted: Thu May 03, 2012 2:56 am
by bruce
The performance of FAH's SMP client seems to synchronize with the slowest thread. That, and the terribly slow data interchange between your nodes, would probably kill any productive use of real clusters. I don't think anybody really appreciated that information when that topic was written -- but I could be wrong about both.

It's my understanding that the stand-alone version of Gromacs can run on clusters but that has never been packaged in a way that it could work as a part of FAH. It has always been my understanding that if it were available to FAH donors, it would work very poorly due to the limitations in interconnect speed. (I've said so many times, myself.) FAH would find very, very few real clusters in the home environment, so it would be a support headache, and would probably never pay for the investment.
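
A toy model of those two effects, in Python, purely as an illustration. It assumes each step waits for the slowest thread and then pays a transfer cost equal to bytes exchanged divided by link bandwidth. The function name and all numbers are hypothetical; nothing here is measured from FAH or Gromacs.

```python
def step_time(thread_compute_seconds, bytes_exchanged, link_bytes_per_s):
    """Time for one synchronized step: slowest thread plus data exchange."""
    compute = max(thread_compute_seconds)          # everyone waits for the slowest
    transfer = bytes_exchanged / link_bytes_per_s  # ignores latency entirely
    return compute + transfer


if __name__ == "__main__":
    threads = [1.0, 2.0, 4.0]   # hypothetical per-thread compute times, seconds
    payload = 250e6             # hypothetical bytes exchanged per step
    gigabit_lan = 125e6         # 1 Gbit/s expressed in bytes per second
    print(step_time(threads, payload, gigabit_lan))
```

In this model the fast threads gain nothing from finishing early, and the transfer term grows as the link gets slower, which is the argument made above.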

If you feel clustering might be a real improvement to FAH, feel free to suggest it in this topic: Discussion: what is holding F@h back?. They're coming up with some creative ideas and it might be a good chance to flesh out the ideas.

Re: Folding@home cluster?

Posted: Thu May 03, 2012 3:04 am
by iceman1992
bruce wrote:If you feel clustering might be a real improvement to FAH, feel free to suggest it in this topic: Discussion: what is holding F@h back?. They're coming up with some creative ideas and it might be a good chance to flesh out the ideas.
Hmm, personally I agree with you, the interconnect speed will be a major problem. I'm just thinking, for a guy with a folding farm (say, 6 i7 PCs), wouldn't it be better if they all worked on one unit, rather than 6 different ones? If that were the case, then 6 i7s would meet the bigadv requirement without needing server-class hardware. (Again, it's just a thought, without considering the interconnect problem.)

Re: Folding@home cluster?

Posted: Thu May 03, 2012 4:36 am
by Punchy
The old MPI-based SMP core scaled very nicely on a single-operating-system-image cluster using a high-speed interconnect. The new ones based on shared memory scale very poorly. PPD/$ is a loser in either case due to the high cost of the interconnect.

Re: Folding@home cluster?

Posted: Sat Jun 01, 2013 1:37 pm
by blub
Would it be possible to use a PCI-Express slot for the interconnect? I don't know PCI-E well enough to tell if PC-to-PC connections are possible, but PCI-E 3.0 x16 would provide up to 15 gigabytes/second, about 120x as fast as Gigabit LAN. On the other hand, why bother with CPU folding anymore when it is possible to fold anything on a GPU?
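
The 120x figure can be checked with a few lines of Python. This is a sketch using only the numbers in the post (15 gigabytes/second for PCI-E 3.0 x16) plus the nominal 1 Gbit/s rate of Gigabit LAN; the constant and function names are made up, and it compares raw bandwidth only, ignoring latency and protocol overhead.

```python
GIGABIT_LAN_BYTES_PER_S = 1e9 / 8   # 1 Gbit/s = 125 MB/s
PCIE3_X16_BYTES_PER_S = 15e9        # the "up to 15 gigabytes/second" figure


def bandwidth_ratio(fast_bytes_per_s, slow_bytes_per_s):
    """How many times more raw bandwidth the faster link offers."""
    return fast_bytes_per_s / slow_bytes_per_s


def transfer_seconds(num_bytes, link_bytes_per_s):
    """Time to move num_bytes at the given rate, ignoring latency."""
    return num_bytes / link_bytes_per_s


if __name__ == "__main__":
    print(bandwidth_ratio(PCIE3_X16_BYTES_PER_S, GIGABIT_LAN_BYTES_PER_S))
    # Moving 1.5 GB: seconds over Gigabit LAN vs. the PCI-E figure.
    print(transfer_seconds(1.5e9, GIGABIT_LAN_BYTES_PER_S))
    print(transfer_seconds(1.5e9, PCIE3_X16_BYTES_PER_S))
```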