Feature Request: Pause at next checkpoint

Moderators: Site Moderators, FAHC Science Team

Re: Feature Request: Pause at next checkpoint

Postby foldinghomealone2 » Sun Apr 12, 2020 8:29 pm

iceman1992 wrote:
foldinghomealone2 wrote:Therefore my opinion is to fold with power efficient hardware only and to fold dedicated. (not to be understood as to fold with dedicated rigs).
With dedicated I mean that you take a WU and you fold it as fast as possible and return it.

Okay, but that's a bit of a paradox. By your definition, folding dedicated is not dedicated rigs, but if someone uses the machine while it's folding, it will slow the progress down, depending on what they're doing it can almost stop the progress completely. So nobody should use the machine while it's folding. That makes it a sort-of dedicated rig.


dedication
/dɛdɪˈkeɪʃ(ə)n/
noun
1.
the quality of being dedicated or committed to a task or purpose.

I have a GPU because I game.
But I won't game and fold at the same time (it slows down both processes) and I finish a WU first, then I start to game.
I don't start folding, then pause, then game, then start over folding.
foldinghomealone2
 
Posts: 148
Joined: Sun Jul 30, 2017 9:40 pm

Re: Feature Request: Pause at next checkpoint

Postby Knish » Sun Apr 12, 2020 9:15 pm

I'm guessing u haven't seen the movie Lucky Number Slevin
Knish
 
Posts: 74
Joined: Tue Mar 17, 2020 6:20 am

Re: Feature Request: Pause at next checkpoint

Postby foldinghomealone2 » Sun Apr 12, 2020 9:31 pm

Kansas City Shuffle.
However I can't follow you.
foldinghomealone2
 
Posts: 148
Joined: Sun Jul 30, 2017 9:40 pm

Re: Feature Request: Pause at next checkpoint

Postby PantherX » Sun Apr 12, 2020 11:31 pm

The current limitation, AFAIK, is the inability to tell fast GPUs from slow GPUs. Currently, the system only knows the GPU architecture and that's it. Maybe in the future, the F@H Servers can identify the exact GPU model and assign a WU that is best suited to its speed. That would be a win-win situation where the fast GPUs get huge proteins to fold while the slow GPUs get smaller proteins to fold.
PantherX
Site Moderator
 
Posts: 6323
Joined: Wed Dec 23, 2009 10:33 am
Location: Land Of The Long White Cloud

Re: Feature Request: Pause at next checkpoint

Postby bruce » Sun Jul 05, 2020 6:02 am

uyaem wrote:
iceman1992 wrote:
foldinghomealone2 wrote:Hence my general recommendation: never ever pause a GPU slot.

Yeah no that's not realistic

A log notification about hitting a savepoint would be cool. :)


A new feature in the most recent versions of FAHCore_22 is a notification at the beginning of the WU saying:
22:04:06:WU00:FS02:0x22:  Checkpoint write interval: xxx00 steps (5%) [20 total]

Although the reported values will change based on what the PI has set for the project, at least I know that checkpoints are written every 5% for this project, so I can estimate the time until the beginning of the next 5% interval.
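
For anyone who wants to automate that estimate, here is a minimal sketch in Python. It assumes the log line keeps the format quoted above; the function names and the minutes-per-percent figure are hypothetical and would have to come from your own observations, not from the client.

```python
import re

# Pattern based only on the log line quoted above; the step count is matched
# loosely because it is shown as "xxx00" in the example.
CHECKPOINT_RE = re.compile(
    r"Checkpoint write interval: \S+ steps \((\d+(?:\.\d+)?)%\) \[(\d+) total\]"
)


def parse_checkpoint_interval(log_line):
    """Return (interval_percent, total_checkpoints), or None if no match."""
    match = CHECKPOINT_RE.search(log_line)
    if match is None:
        return None
    return float(match.group(1)), int(match.group(2))


def minutes_to_next_checkpoint(progress_percent, interval_percent, minutes_per_percent):
    """Estimate minutes until the next checkpoint boundary.

    minutes_per_percent is an assumption the user supplies, e.g. measured
    from the timestamps of two consecutive progress lines in the log.
    """
    done_in_interval = progress_percent % interval_percent
    remaining_percent = interval_percent - done_in_interval
    return remaining_percent * minutes_per_percent


line = "22:04:06:WU00:FS02:0x22:  Checkpoint write interval: xxx00 steps (5%) [20 total]"
interval, total = parse_checkpoint_interval(line)
print(minutes_to_next_checkpoint(12, interval, 1.5))
```

With a WU at 12% and a measured 1.5 minutes per percent, the next checkpoint (at 15%) would be about 4.5 minutes away, so that is roughly how long to wait before pausing.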
bruce
 
Posts: 19649
Joined: Thu Nov 29, 2007 11:13 pm
Location: So. Cal.

Re: Feature Request: Pause at next checkpoint

Postby JohnChodera » Sun Jul 05, 2020 6:38 am

Great idea! We'd need to add something to the client to instruct the cores when to stop.

Could someone submit this to https://github.com/foldingathome/fah-issues/issues so we can get that into the queue for client features?

> A log notification about hitting a savepoint would be cool. :)

That's easy enough to add to core22! I'll see if I can add that in 0.0.12.

~ John Chodera // MSKCC
JohnChodera
Pande Group Member
 
Posts: 306
Joined: Fri Feb 22, 2013 10:59 pm

Re: Feature Request: Pause at next checkpoint

Postby Knish » Wed Jul 08, 2020 11:08 am

already done back on page 2 i believe viewtopic.php?f=16&t=34239&start=15#p325121
Knish
 
Posts: 74
Joined: Tue Mar 17, 2020 6:20 am

Re: Feature Request: Pause at next checkpoint

Postby ajm » Wed Jul 08, 2020 11:23 am

Why not just write a checkpoint whenever a slot or a kit is paused?
ajm
 
Posts: 495
Joined: Sat Mar 21, 2020 6:22 am
Location: Lucerne, Switzerland

Re: Feature Request: Pause at next checkpoint

Postby Frogging101 » Wed Jul 08, 2020 7:25 pm

ajm wrote:Why not just write a checkpoint whenever a slot or a kit is paused?


The CPU cores do write when they are paused or are otherwise gracefully shut down.

The GPU cores, as I understand it, operate differently. The CPU sends work to the GPU in large "chunks" (there's probably a correct term for this, but I don't know it), and reads the output from each chunk when the GPU finishes processing it. And a "chunk" is either processed in full, or it isn't. When a GPU slot is paused and the core is shut down, the current "chunk" that the GPU is working on is abandoned. AFAIK, this is just how GPU compute works; it's most efficient when it can run an algorithm in parallel on a large amount of input at once. It doesn't run piecemeal bits of data back and forth with the CPU.

Essentially, with GPU processing, there's more work "in flight" at a given time, so if it gets cancelled, more work is lost. That's a tradeoff of GPU computing.

Note: This is just my understanding of how GPU computing works. I'm a software engineer, but I haven't done any in-depth work in this area. Please correct me if I got things wrong :)
Frogging101
 
Posts: 66
Joined: Wed Mar 25, 2020 3:39 am
Location: Canada

Re: Feature Request: Pause at next checkpoint

Postby bruce » Wed Jul 08, 2020 10:29 pm

GPU "chunks" = kernels.

FAHCores cannot write a checkpoint at an arbitrary point in the procedure, so they cannot simply write one at whatever moment you decide to pause. Checkpoint frequencies can be set by the project designer.

Both GROMACS (Core_a7) and OpenMM (Core_2x) will back up to the previous suitable break-point.
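
As a rough illustration of what "backing up" costs, here is a small Python sketch. It assumes checkpoints fall on fixed multiples of a step interval, which is a simplification and not a description of how the cores actually work internally; the function names and step counts are made up for the example.

```python
def resume_step(current_step, checkpoint_interval):
    """Step the WU would restart from: the last checkpoint at or before current_step."""
    return (current_step // checkpoint_interval) * checkpoint_interval


def steps_lost(current_step, checkpoint_interval):
    """Steps that have to be recomputed after a pause at current_step."""
    return current_step - resume_step(current_step, checkpoint_interval)


# Hypothetical numbers: a checkpoint every 25,000 steps, paused at step 62,000.
print(resume_step(62_000, 25_000))
print(steps_lost(62_000, 25_000))
```

In this example the WU resumes from step 50,000 and redoes 12,000 steps. Pausing right after a checkpoint loses almost nothing, while pausing just before one loses nearly a full interval.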
bruce
 
Posts: 19649
Joined: Thu Nov 29, 2007 11:13 pm
Location: So. Cal.
