Does selecting Quit from the systray menu save files?

Postby nogginthenog » Wed Jan 11, 2012 7:24 pm

At present I wait until a client just goes over to a new % complete, can I just quit and have the progress saved ?
nogginthenog
 
Posts: 38
Joined: Mon Nov 28, 2011 3:42 pm

Re: Does selecting Quit from the systray menu save files?

Postby 7im » Wed Jan 11, 2012 8:11 pm

The short answer is the client (fahcore) saves its place at regular intervals, so you can quit any time you want.

The long answer is that most fahcores do not write a checkpoint when they exit. But you only lose a few minutes on average, so not a big deal. If you stop and start several times a day, then set the checkpoint at 5 minutes instead of 15. Otherwise just leave it at 15.
7im
 
Posts: 14648
Joined: Thu Nov 29, 2007 4:30 pm
Location: Arizona

Re: Does selecting Quit from the systray menu save files?

Postby bruce » Wed Jan 11, 2012 9:49 pm

The long answer is "it depends."

Some cores write a checkpoint whenever there is a new % complete message (plus, perhaps, every TBD minutes, where TBD is probably 15, if a frame takes longer than that). For such a core, waiting for that event minimizes the amount of processing that would be repeated. Other cores write a checkpoint every TBD minutes from the beginning of the WU, and the % complete messages are purely cosmetic. Still other cores have used other methods, and you'd need to figure them out if your goal is to minimize repeated work.
bruce
 
Posts: 21696
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Does selecting Quit from the systray menu save files?

Postby nogginthenog » Wed Jan 11, 2012 9:54 pm

OK thanks
nogginthenog
 
Posts: 38
Joined: Mon Nov 28, 2011 3:42 pm

Re: Does selecting Quit from the systray menu save files?

Postby Stonecold » Sun Jan 15, 2012 6:24 pm

7im wrote:The long answer is that most fahcores do not write a checkpoint when they exit. But you only lose a few minutes on average, so not a big deal. If you stop and start several times a day, then set the checkpoint at 5 minutes instead of 15. Otherwise just leave it at 15.


Does reducing the time between checkpoints reduce its performance?
Stonecold
 
Posts: 392
Joined: Sun Dec 25, 2011 9:20 pm

Re: Does selecting Quit from the systray menu save files?

Postby 7im » Sun Jan 15, 2012 11:30 pm

If you fold 24/7 and set the checkpoint at 3 minutes versus 30 minutes, then yes, the difference is easily measurable, but very small. Only seconds per day, but that does add up to a lot of time when you have 400,000 running for a few years... ;)

Also note that when I tested this (on the exact same work unit, using the exact same frames, on the exact same otherwise unused PC), it was on regular fahcore_78 CPU work units. The result could be very different for larger work units in SMP. I don't know.

15 minutes is the recommended default for a reason. :)
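
As a rough sketch of how "seconds per day" scales across many clients, the arithmetic looks like this. Only the 400,000 figure comes from the post above; the 5 seconds per day and the 3 years are made-up placeholders, not measured values.

```python
# Rough scaling of a small per-client checkpoint overhead across many clients.
# ASSUMPTIONS (illustrative only): 5 extra seconds per day, 3 years of folding.
SECONDS_PER_DAY = 86_400


def fleet_overhead_cpu_days(extra_seconds_per_day, clients, years):
    """Total CPU-days spent on extra checkpointing across all clients."""
    total_seconds = extra_seconds_per_day * clients * years * 365
    return total_seconds / SECONDS_PER_DAY


days = fleet_overhead_cpu_days(extra_seconds_per_day=5, clients=400_000, years=3)
print(f"{days:,.0f} CPU-days")
```

Swap in your own measured overhead to get a figure that means something for your setup.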
7im
 
Posts: 14648
Joined: Thu Nov 29, 2007 4:30 pm
Location: Arizona

Re: Does selecting Quit from the systray menu save files?

Postby Stonecold » Mon Jan 16, 2012 12:32 am

I set mine to 10 minutes, but I often run very unstable software alongside Folding@home, so it's not uncommon for me to open my laptop and have it greet me with a great big blue screen. I've been thinking about reducing the checkpoint interval to 5 minutes, but I don't know how long it takes for the client to write the checkpoint to disk. Would it be any faster if it wrote the checkpoint file to a solid-state medium?
Stonecold
 
Posts: 392
Joined: Sun Dec 25, 2011 9:20 pm

Re: Does selecting Quit from the systray menu save files?

Postby Nantes » Sat Jan 21, 2012 7:16 pm

7im wrote:If you fold 24/7 and set the checkpoint at 3 minutes versus 30 minutes, then yes, the difference is easily measurable, but very small. Only seconds per day, but that does add up to a lot of time when you have 400,000 running for a few years... ;)


But setting 30 minutes instead of 3 could potentially mean 27 extra minutes of lost work if you shut off right before it was supposed to checkpoint. I believe the sum time of all work lost this way is larger than the sum time you described. Therefore, isn't it better to set a shorter checkpoint interval?
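
The trade-off can be sketched numerically. This assumes a core that checkpoints purely on a timer and a stop that lands at a uniformly random moment within the interval, so the average loss per stop is half the interval. Both assumptions are simplifications, and the function names are only for illustration.

```python
# Work repeated after stopping, for a timer-only checkpoint policy.
# ASSUMPTION: each stop lands uniformly at random within a checkpoint interval.

def expected_loss_minutes(interval_minutes, stops_per_day):
    """Average minutes of folding repeated per day."""
    return stops_per_day * interval_minutes / 2


def worst_case_loss_minutes(interval_minutes, stops_per_day):
    """Minutes repeated per day if every stop lands just before a checkpoint."""
    return stops_per_day * interval_minutes


for interval in (3, 15, 30):
    print(interval,
          expected_loss_minutes(interval, stops_per_day=1),
          worst_case_loss_minutes(interval, stops_per_day=1))
```

With zero stops per day (folding 24/7) the loss is zero at any interval, which is why the answer depends on how often you actually quit.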
Nantes
 
Posts: 66
Joined: Fri Jan 06, 2012 2:56 pm

Re: Does selecting Quit from the systray menu save files?

Postby Stonecold » Sat Jan 21, 2012 7:39 pm

There should be a "stop after next checkpoint" button that would terminate the WU as soon as the checkpoint is passed (kind of like the "finish" button). That way no one would have to worry about shutting it down right before it was going to hit the next checkpoint. It would be even better if it displayed the time until the next checkpoint. Would that be possible to implement? Does the client know when the core's planning to write a checkpoint (I heard that it's the core that writes the checkpoint, not the client)?
Stonecold
 
Posts: 392
Joined: Sun Dec 25, 2011 9:20 pm

Re: Does selecting Quit from the systray menu save files?

Postby bruce » Sat Jan 21, 2012 7:45 pm

If you stop and start folding frequently, shorter checkpoint intervals can save a few minutes of duplicated folding time. If your machine ever crashes or your power ever fails, shorter checkpoint intervals are a bad idea.

When a FahCore writes a checkpoint, FAH finishes turning the data over to the OS and goes back to work, but for a short time after the checkpoint is finished, the data is still in cache (either in RAM or in a cache that may be part of your disk drive). Periodically, the OS flushes the cache, causing the data to actually be written to disk. If your system crashes at any time while there's still data in the cache, you will have created a corrupt checkpoint and the WU will restart FROM THE BEGINNING.

The percentage of time that cache data is vulnerable depends on which OS you run and varies greatly with the filesystem you use. For a FAT filesystem (does anybody still use them?) I would set my checkpoint interval to 15 or even 30 minutes, simply to reduce how often the cache is vulnerable. For a journaled filesystem, the risk is almost non-existent, so I'd probably use 5 or 10. I just can't bring myself to use 3. Nobody restarts frequently enough to need that.
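
For illustration only, here is a generic way a program can shrink that vulnerable window: write to a temporary file, ask the OS to flush it, then rename it over the old checkpoint. Nothing in this thread says any FahCore works this way; the function name and file layout are hypothetical.

```python
import os
import tempfile


def write_checkpoint_atomically(path, data):
    """Write bytes to `path` so a crash leaves either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Temporary file in the same directory, so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".ckpt-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # push Python's buffer to the OS
            os.fsync(tmp.fileno())  # ask the OS to push its cache to the device
        os.replace(tmp_path, path)  # swap the new file in place of the old one
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Even with `os.fsync`, a cache inside the drive itself may still hold the data briefly, so this narrows the window rather than eliminating it.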
bruce
 
Posts: 21696
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Does selecting Quit from the systray menu save files?

Postby Stonecold » Sat Jan 21, 2012 7:52 pm

Oh. I've set my checkpoint lower because my computer crashes more often than it used to (I'm running many resource-intensive applications at once)...

You said the operating system and file system are important variables, so if I use Windows 7 Home Premium SP1 with the NTFS file system, what percentage of the time is the cache data vulnerable if I have to hard-reboot my computer?
Stonecold
 
Posts: 392
Joined: Sun Dec 25, 2011 9:20 pm

Re: Does selecting Quit from the systray menu save files?

Postby bruce » Sat Jan 21, 2012 8:06 pm

Stonecold wrote:There should be a "stop after next checkpoint" button that would terminate the WU as soon as the checkpoint is passed (kind of like the "finish" button). That way no one would have to worry about shutting it down right before it was going to hit the next checkpoint. It would be even better if it displayed the time until the next checkpoint. Would that be possible to implement? Does the client know when the core's planning to write a checkpoint (I heard that it's the core that writes the checkpoint, not the client)?


I like that idea and have thought of recommending it myself.

Yes, the core writes the checkpoint. The client knows nothing about the Core except that A) it's still running or that B) it just ended with a status code of X.

Additional code could be written that monitors the time that the last checkpoint files were written but the various cores are different, so the code would have to either understand all of those variations or be built into the core itself. (They're certainly not going to retrofit the older cores.) Monitoring those files would add some overhead to FAH, and although I'm not qualified to say how much it would add, I guarantee some people would complain "Why is this extra process running and using time that could be spent folding?" Would the programming time be a good investment? Certainly not for Stanford to write ... but if you've got the programming skills, you're welcome to come up with code that meets that need.
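
A minimal sketch of that kind of monitor is below. It simply reports how long ago the newest file in a directory was modified. Which files count as checkpoint files differs per core, so treating every file in the directory as relevant is an assumption, and the function name is hypothetical.

```python
import os
import time


def seconds_since_last_write(work_dir, now=None):
    """Seconds since the newest file in work_dir was modified, or None if empty."""
    newest = None
    with os.scandir(work_dir) as entries:
        for entry in entries:
            if entry.is_file():
                mtime = entry.stat().st_mtime
                if newest is None or mtime > newest:
                    newest = mtime
    if newest is None:
        return None
    current = time.time() if now is None else now
    return current - newest
```

Called once just before quitting, this costs almost nothing; polling it continuously is where the overhead concern would come in.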

There's an alternative, though. In V6, you can set -verbosity 9. In V7 you can set "extra-core-args" to "-verbose" (without the quotes). Most cores (maybe all of them) will issue a message when checkpoints occur, although you'll have to learn the peculiarities of each specific core that you run.

For example, here's the output produced by FahCore_78 when you increase the verbosity.
Code:
18:29:29:Completed 50000 out of 250000 steps  (20%)
18:44:29:Timered checkpoint triggered.
18:59:29:Timered checkpoint triggered.
19:02:47:Writing local files
19:02:47:Completed 52500 out of 250000 steps  (21%)
19:17:47:Timered checkpoint triggered.
19:32:47:Timered checkpoint triggered.
19:36:07:Writing local files
19:36:08:Completed 55000 out of 250000 steps  (22%)
19:51:08:Timered checkpoint triggered.
20:06:08:Timered checkpoint triggered.


FahCore_78 writes a checkpoint for every frame ("Writing local files") and also writes extra checkpoints every 15 minutes after each frame-based checkpoint ("Timered checkpoint triggered."). DO NOT assume that another core behaves the same way.
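
If you want to pull the checkpoint times out of verbose output like the FahCore_78 log above, a small script can do it. This sketch matches only the two message strings shown in that log; other cores word their messages differently, so the marker list is an assumption you would need to adjust.

```python
import re

# Message prefixes taken from the FahCore_78 log above; other cores will differ.
CHECKPOINT_MARKERS = ("Timered checkpoint triggered.", "Writing local files")
LINE_RE = re.compile(r"^(\d{2}):(\d{2}):(\d{2}):(.*)$")


def checkpoint_times(log_text):
    """Return the HH:MM:SS stamps of lines that look like checkpoint messages."""
    times = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group(4).strip().startswith(CHECKPOINT_MARKERS):
            times.append(":".join(match.group(1, 2, 3)))
    return times


def seconds_between(earlier, later):
    """Seconds from one HH:MM:SS stamp to a later one, wrapping past midnight."""
    def to_seconds(stamp):
        hours, minutes, seconds = (int(part) for part in stamp.split(":"))
        return hours * 3600 + minutes * 60 + seconds
    return (to_seconds(later) - to_seconds(earlier)) % 86_400
```

Comparing the last stamp returned by `checkpoint_times` with the current time tells you roughly how much work you would repeat by quitting now.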
bruce
 
Posts: 21696
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Does selecting Quit from the systray menu save files?

Postby Stonecold » Sat Jan 21, 2012 8:24 pm

bruce wrote:Yes, the core writes the checkpoint. The client knows nothing about the Core except that A) it's still running or that B) it just ended with a status code of X.

Then how can you set the checkpoint frequency from the client if it has no control over it? Does the core simply try to write checkpoints at the minimum interval but the client refuses to allow that until the time it has been set to is up?

bruce wrote:There's an alternative, though. In V6, you can set -verbosity 9. In V7 you can set "extra-core-args" to "-verbose" (without the quotes). Most cores (maybe all of them) will issue a message when checkpoints occur, although you'll have to learn the peculiarities of each specific core that you run.

So do I set that in "extra core options" on the Expert tab of the configuration options, or would I do it in "extra client options"?

There should be a way to set the core to write a checkpoint at each frame, that way people would know approximately when the next checkpoint is going to occur. I think that would be a nice feature, too.
Stonecold
 
Posts: 392
Joined: Sun Dec 25, 2011 9:20 pm

Re: Does selecting Quit from the systray menu save files?

Postby bruce » Sat Jan 21, 2012 8:45 pm

Stonecold wrote:Then how can you set the checkpoint frequency from the client if it has no control over it? Does the core simply try to write checkpoints at the minimum interval but the client refuses to allow that until the time it has been set to is up?

The Client is responsible for starting the FahCore. One of the pieces of information that is used as a calling argument is the checkpoint interval. If you change the checkpoint interval in the client, it applies to the next WU unless you stop and restart work on the current WU.

Stonecold wrote:There should be a way to set the core to write a checkpoint at each frame, that way people would know approximately when the next checkpoint is going to occur.

The FahCores contain analysis code that is obtained from various sources. FAH is not the only place that the code is used. If that code does not write a checkpoint at each frame, then you'll have to talk to the group that designed the analysis code. Your idea of what "should" happen doesn't necessarily apply to all the researchers who run the stand-alone molecular dynamics code on their supercomputers.

I think you missed what 7im said:
7im wrote:The short answer is the client (fahcore) saves its place at regular intervals, so you can quit any time you want.

The long answer is that most fahcores do not write a checkpoint when they exit. But you only lose a few minutes on average, so not a big deal. If you stop and start several times a day, then set the checkpoint at 5 minutes instead of 15. Otherwise just leave it at 15.
bruce
 
Posts: 21696
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

