NaCL-support for GPU in future?


Re: NaCL-support for GPU in future?

Postby bruce » Tue Feb 14, 2017 9:41 pm

foldinghomealone wrote:I reason that one 100% fast GPU + one 10% GPU is faster than a single 100% GPU.


Incorrect reasoning ... due to the serial nature of segments of a trajectory.

Think of it as a relay race. Each runner completes a segment of the total race. Only one person can run at a time, whether they're running at 100% or at 10%. Nobody can run leg 2 until the runner on leg 1 can pass the baton to the next runner.

Adding a slow runner to the team SLOWS DOWN the ability of the team to win the race. The slow runner's effort is useless until he can pass the baton to the next runner.
bruce
Site Admin
 
Posts: 20631
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: NaCL-support for GPU in future?

Postby foldinghomealone » Tue Feb 14, 2017 10:26 pm

I understand what you mean.

However, your statement would imply that if even one donor stopped processing a WU, the whole system would stall until the deadline passes and the WU is reassigned. But that is not the case.
It would also mean that I would have to wait for WUs before I can start processing. That's also not the case: WUs wait to be downloaded (at least it seems so), and waiting time is wasted time that could be used by slow GPUs.
Therefore there has to be some kind of parallelism involved.

And you speak of a race with several teams. Several teams means parallelism.
I understand FAH more like a charity run where every mile run by any runner is donated to the charity.
Maybe the fastest team is slowed down a bit, but you earn many more miles for all teams together when you add one or several teams of slow runners.
foldinghomealone
 
Posts: 24
Joined: Wed Feb 01, 2017 7:07 pm

Re: NaCL-support for GPU in future?

Postby Joe_H » Tue Feb 14, 2017 10:53 pm

There are many parallel "relay races" in progress at any time. Each Project consists of multiple Runs, and there are multiple Clones of each Run. It is the processing of the individual Generations that is serial: Gen 1 has to be completed before Gen 2, and so on. The initial conditions for each Gen are set by the state at the end of the previous Gen, so the WU cannot be created in advance. So at any time there are WUs from the same project or another one available, and a GPU does not have to wait for an assignment.

The researchers, however, do have to wait for those 100, 200 or more Generations to complete. If there were many slow machines, you would end up with many if not all of those parallel runs slowed down by significant amounts.
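To make that dependency concrete, here is a minimal sketch (Python, with made-up names and numbers; not actual Folding@home server code) of why a Gen cannot exist before the previous one is returned, while separate Run/Clone trajectories are independent and can be folded in parallel:

[code]
# Hypothetical sketch of the Project / Run / Clone / Gen structure described
# above. Not actual Folding@home code; names and numbers are invented.
from dataclasses import dataclass

@dataclass
class WorkUnit:
    project: int
    run: int
    clone: int
    gen: int
    start_state: str   # initial coordinates for this Gen

def simulate(wu: WorkUnit) -> str:
    """Stand-in for the real MD computation done by a donor."""
    return f"end state of P{wu.project} R{wu.run} C{wu.clone} G{wu.gen}"

def fold_trajectory(project: int, run: int, clone: int, n_gens: int) -> str:
    state = "initial structure prepared by the researcher"
    for gen in range(n_gens):
        # Gen N+1 cannot be created in advance: its start_state is the
        # end state of Gen N, which only exists once Gen N is returned.
        wu = WorkUnit(project, run, clone, gen, state)
        state = simulate(wu)
    return state

# Different (Run, Clone) pairs have no such dependency, so they can be
# assigned to many donors at once -- that is the parallelism in this thread.
for run in range(2):
    for clone in range(3):
        fold_trajectory(project=9999, run=run, clone=clone, n_gens=5)
[/code]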

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Joe_H
Super Moderator
 
Posts: 3506
Joined: Tue Apr 21, 2009 4:41 pm
Location: W. MA

Re: NaCL-support for GPU in future?

Postby bruce » Tue Feb 14, 2017 10:58 pm

foldinghomealone wrote:However your statement would mean that if only one donor would not continue to process the WU then the whole system would be stopped until deadline and the WU is reassigned. But this is not the case.


This, in fact, is mostly true.

WUs have two deadlines. The final deadline is the point at which FAH will no longer give you credit for completing the WU. The Preferred Deadline is the point at which FAH assumes that the WU is "lost" and it reassigns the WU to someone else. This setting is supposed to be long enough to avoid unnecessary duplication but short enough to avoid waiting a long time for a WU which will never be returned. (In some instances, a WU is returned after the preferred deadline and before the final deadline, leading to a duplication of work.)
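As a rough sketch of how those two deadlines interact (illustration only; the names are invented and this is not the actual assignment-server logic):

[code]
# Illustration of the two-deadline rule described above.
# Not actual Folding@home server code; all names are invented.
from datetime import datetime, timedelta
from typing import Optional

class Assignment:
    def __init__(self, issued: datetime, preferred: timedelta, final: timedelta):
        self.issued = issued
        self.preferred_deadline = issued + preferred  # server assumes "lost" after this
        self.final_deadline = issued + final          # no credit after this
        self.returned_at: Optional[datetime] = None

def needs_reassignment(a: Assignment, now: datetime) -> bool:
    # Past the preferred deadline with no result: give the WU to someone else.
    return a.returned_at is None and now > a.preferred_deadline

def handle_return(a: Assignment, now: datetime) -> str:
    a.returned_at = now
    if now > a.final_deadline:
        return "no credit: past the final deadline"
    if now > a.preferred_deadline:
        # The WU was already reassigned, so the same work may be done twice --
        # the duplication mentioned above.
        return "credited, but the work was probably duplicated"
    return "credited normally"
[/code]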
bruce
Site Admin
 
Posts: 20631
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: NaCL-support for GPU in future?

Postby bruce » Tue Feb 14, 2017 11:12 pm

Development did explore the concept of uploading a partial WU for CPU-based WUs. That project never made it to fruition. I don't know why. I don't think it was ever tried for GPUs.

It was called a "streaming client" because each client was continually uploading progress in tiny pieces of a trajectory, and when you abandoned that WU, somebody else could pick up the baton from the last point at which you reported progress.

One disadvantage was what happened when the number of active participants varied. Suppose that at a given time there are between 50 and 1000 participants folding. Obviously there have to be at least 1000 WUs running in parallel. Suppose at a particular time your client asks to start work. If you happen to be number 51, the project will assign you the most complete trajectory that's not being worked on (excluding any that may have reached the desired end-point). The project owner establishes a minimum number (less than 1000) of trajectories that must be completed. Say the number is 400. When the 400th trajectory is completed, all of the incomplete (shorter) trajectories can be discarded. The work completed by everybody who worked on the other 600 trajectories is probably discarded.

It's not a very efficient system.
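A back-of-envelope way to see how much donated work such a scheme would throw away, using the numbers above (1000 parallel trajectories, only 400 needed; a deliberate oversimplification, not how the streaming client actually worked):

[code]
# Back-of-envelope illustration of the waste described above: 1000 trajectories
# in flight, the project stops once 400 reach the target length, and everything
# done on the other ~600 partial trajectories is discarded.
# Deliberately oversimplified; not how the streaming client actually worked.
import random

random.seed(0)

N_TRAJECTORIES = 1000
N_NEEDED = 400
TARGET_LENGTH = 100   # reported segments needed to "complete" a trajectory

# The 400 finished trajectories are full length; the rest stopped somewhere short.
completed = [TARGET_LENGTH] * N_NEEDED
discarded = [random.randint(1, TARGET_LENGTH - 1)
             for _ in range(N_TRAJECTORIES - N_NEEDED)]

useful = sum(completed)
wasted = sum(discarded)
print(f"useful segments:    {useful}")
print(f"discarded segments: {wasted}")
print(f"fraction of donated work thrown away: {wasted / (useful + wasted):.0%}")
[/code]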
bruce
Site Admin
 
Posts: 20631
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: NaCL-support for GPU in future?

Postby 7im » Tue Feb 14, 2017 11:41 pm

Yes, there is a good deal of parallelism involved. Let's make it simpler. Consider a river. A deep, narrow river flows very fast; a wide, shallow river tends to just meander along. Faster is better in Folding.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
7im
 
Posts: 15204
Joined: Thu Nov 29, 2007 4:30 pm
Location: Arizona

Re: NaCL-support for GPU in future?

Postby foldinghomealone » Wed Feb 15, 2017 12:21 am

Ok, thanks for all this information. I have to think it through to fully understand it.

But what I don't understand at all is why there can be very small WUs for CPU but not for GPU.
foldinghomealone
 
Posts: 24
Joined: Wed Feb 01, 2017 7:07 pm

Re: NaCL-support for GPU in future?

Postby bruce » Wed Feb 15, 2017 8:03 am

It all boils down to the number of parallel operations that the hardware can handle.

Most CPUs on home computers can process up to about a dozen floating-point operations in parallel (one per CPU thread). Server-based hardware may manage 48 or 64. (Older CPUs might have 4, and rarely 1 or 2, but most of those machines have been retired.)

Modern GPUs can process as many as 3500 floating-point operations in parallel, with older GPUs managing maybe 640 or even fewer. Fundamentally, the speed of processing depends pretty directly on those numbers (and less so on the clock speed). Bigger proteins (with lots of atoms) would take "forever" on a CPU, but they're well suited for GPUs, which can perform as many operations in parallel as possible.

Intermediate proteins that would be a good fit for GPU hardware with, say, 32 or 48 shaders (parallel compute components) could conceivably be assigned to either one, but the CPU is going to be more efficient because the GPU software has extra overhead: it has to constantly move data between main RAM and the GPU's VRAM. Deadlines on GPU projects are generally tighter, and a GPU with fewer than a few hundred shaders will probably miss those deadlines, whereas the deadline for the equivalent CPU project would accommodate the lower capabilities of a CPU.
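As a very rough worked example of why those parallelism numbers dominate (an order-of-magnitude illustration only; it ignores the RAM-to-VRAM transfer overhead and other real-world effects mentioned above, and the clock speeds are just plausible guesses):

[code]
# Idealized throughput estimate: parallel floating-point lanes x clock speed.
# Ignores memory transfers, instruction mix, etc. -- illustration only.
def relative_throughput(parallel_ops: int, clock_ghz: float) -> float:
    return parallel_ops * clock_ghz

cpu       = relative_throughput(parallel_ops=12,   clock_ghz=3.5)  # ~a dozen threads
small_gpu = relative_throughput(parallel_ops=48,   clock_ghz=1.0)  # 48 shaders
big_gpu   = relative_throughput(parallel_ops=3500, clock_ghz=1.5)  # modern GPU

print(f"CPU:        {cpu:7.0f} (arbitrary units)")
print(f"48-shader:  {small_gpu:7.0f}  -> roughly CPU-class, but pays the VRAM overhead")
print(f"modern GPU: {big_gpu:7.0f}  -> ~{big_gpu / cpu:.0f}x the CPU")
[/code]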
bruce
Site Admin
 
Posts: 20631
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.
