WUs not sent (are they gone?)

The most demanding Projects are only available to a small percentage of very high-end servers.

Moderators: Site Moderators, PandeGroup

Re: WUs not sent (are they gone?)

Postby wuffy68 » Fri Jun 06, 2014 12:58 am

and your spot price doesn't kill your VM before then, it should upload.


Yeah, it's a funny thing ... the zone spot price can go from $0.26/hr to $10/hr for just fifteen minutes, then back down to $0.26 again. Not sure how that happens, but it's definitely got me being more creative with my instance management. AWS claims only 4% average instance loss over 24 hrs, but I've seen more like 50% (even with my bid 20% above the going rate).
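
To make the arithmetic concrete, here is a minimal sketch of why a bid 20% above the going rate does not survive a spike like the one described. It assumes the spot model of the time, where an instance is terminated once the spot price exceeds the bid. The price samples and the function name are hypothetical, made up only to mirror the $0.26 to $10 jump mentioned above.

```python
def first_termination(prices, bid):
    """Return the index of the first price sample above the bid, or None."""
    for i, price in enumerate(prices):
        if price > bid:
            return i
    return None


going_rate = 0.26          # $/hr, the usual zone price from the post
bid = going_rate * 1.20    # bid 20% above the going rate

# Hypothetical samples in $/hr, one per fifteen minutes, with one short spike.
prices = [0.26, 0.26, 10.00, 0.26]

killed_at = first_termination(prices, bid)
print(f"bid=${bid:.3f}/hr, terminated at sample {killed_at}")
```

A 20% margin only covers prices up to about $0.31/hr, so any spike of this size ends the instance no matter how brief it is.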

Thanks everyone for your help, I'll let the forum know how the recovery effort went tomorrow.
1x nVidia 1070, 1x nVidia 1060 3g,
1x nVidia 970, 2x nVidia 960,
1x nVidia 555, 1x AMD R7, 2x AMD 295,
6x i5 CPU-only rigs
wuffy68
 
Posts: 150
Joined: Wed Jun 04, 2014 11:06 pm
Location: Roxborough, Colorado USA

Re: WUs not sent (are they gone?)

Postby bruce » Fri Jun 06, 2014 5:17 am

wuffy68 wrote:I have another idea ...

Before the job expires, I can recover my server when it was 50% complete with this job and let it try to finish (again) ... see if this time around it actually gets sent.


This is only a useful idea if you're certain that the WU was never submitted and/or the WU never encountered an error which was reported to the server. You cannot submit the same WU twice, no matter how or why it happened.
bruce
 
Posts: 22623
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: WUs not sent (are they gone?)

Postby wuffy68 » Fri Jun 06, 2014 5:47 am

bruce wrote:This is only a useful idea if you're certain that the WU was never submitted and/or the WU never encountered an error which was reported to the server. You cannot submit the same WU twice, no matter how or why it happened.


I think now things are good ... (BTW, can the same WU be assigned to the same user twice ... I thought each WU went out to two nodes, so the results are always compared and have to be identical, or is once enough?)

You probably saved me a few pennies :o ... so the job that's currently re-running on my original EC2 (50% complete) instance is actually the one that was already submitted:

05:17:21:WU00:FS00:0xa5:Project: 8103 (Run 0, Clone 17, Gen 384)


and it looks like I got the other job back for re-processing (the one that failed to transmit previously):

Project: 8105 (Run 0, Clone 19, Gen 416)

Full Log:
02:34:37:WU01:FS00:0xa5:Folding@Home Gromacs SMP Core
02:34:37:WU01:FS00:0xa5:Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
02:34:37:WU01:FS00:0xa5:
02:34:37:WU01:FS00:0xa5:Preparing to commence simulation
02:34:37:WU01:FS00:0xa5:- Looking at optimizations...
02:34:37:WU01:FS00:0xa5:- Created dyn
02:34:37:WU01:FS00:0xa5:- Files status OK
02:34:39:WU01:FS00:0xa5:- Expanded 30294185 -> 33130012 (decompressed 109.3 percent)
02:34:39:WU01:FS00:0xa5:Called DecompressByteArray: compressed_data_size=30294185 data_size=33130012, decompressed_data_size=33130012 diff=0
02:34:39:WU01:FS00:0xa5:- Digital signature verified
02:34:39:WU01:FS00:0xa5:
02:34:39:WU01:FS00:0xa5:Project: 8105 (Run 0, Clone 19, Gen 416)
02:34:39:WU01:FS00:0xa5:
02:34:39:WU01:FS00:0xa5:Assembly optimizations on if available.
02:34:39:WU01:FS00:0xa5:Entering M.D.
02:34:46:WU01:FS00:0xa5:Mapping NT from 32 to 32
02:34:48:WU01:FS00:0xa5:Completed 0 out of 250000 steps (0%)
******************************* Date: 2014-06-06 *******************************
02:47:36:WU01:FS00:0xa5:Completed 2500 out of 250000 steps (1%)
03:00:28:WU01:FS00:0xa5:Completed 5000 out of 250000 steps (2%)
03:13:15:WU01:FS00:0xa5:Completed 7500 out of 250000 steps (3%)
03:26:06:WU01:FS00:0xa5:Completed 10000 out of 250000 steps (4%)
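
For anyone who wants to estimate whether a WU will finish before a spot instance is likely to die, the progress lines above can be turned into a rough time-remaining figure. This is only a sketch: the regular expression is written against the log lines shown here, the helper names are hypothetical, and it assumes all samples fall on the same day (no midnight rollover).

```python
import re
from datetime import datetime, timedelta

LOG = """\
02:34:48:WU01:FS00:0xa5:Completed 0 out of 250000 steps (0%)
02:47:36:WU01:FS00:0xa5:Completed 2500 out of 250000 steps (1%)
03:00:28:WU01:FS00:0xa5:Completed 5000 out of 250000 steps (2%)
03:13:15:WU01:FS00:0xa5:Completed 7500 out of 250000 steps (3%)
03:26:06:WU01:FS00:0xa5:Completed 10000 out of 250000 steps (4%)
"""

PROGRESS = re.compile(
    r"^(\d\d:\d\d:\d\d):WU\d+:FS\d+:0x\w+:Completed (\d+) out of (\d+) steps"
)


def parse_progress(text):
    """Return (time, steps_done, steps_total) for each progress line."""
    points = []
    for line in text.splitlines():
        m = PROGRESS.match(line)
        if m:
            stamp = datetime.strptime(m.group(1), "%H:%M:%S")
            points.append((stamp, int(m.group(2)), int(m.group(3))))
    return points


def estimate_remaining(points):
    """Extrapolate the remaining time from the first and last samples."""
    (t0, s0, total), (t1, s1, _) = points[0], points[-1]
    per_step = (t1 - t0) / (s1 - s0)   # timedelta per simulation step
    return per_step * (total - s1)


points = parse_progress(LOG)
print(estimate_remaining(points))
```

On these five samples the WU advances 1% roughly every 12 minutes 50 seconds, which extrapolates to about 20.5 hours of work still to go.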
wuffy68
 
Posts: 150
Joined: Wed Jun 04, 2014 11:06 pm
Location: Roxborough, Colorado USA

Re: WUs not sent (are they gone?)

Postby P5-133XL » Fri Jun 06, 2014 6:53 am

Yes, WUs can be assigned multiple times to the same person/computer though it is not common. As long as it was assigned twice, you will get credit for it the second time too.
P5-133XL
 
Posts: 4034
Joined: Sun Dec 02, 2007 4:36 am
Location: Salem. OR USA

Re: WUs not sent (are they gone?)

Postby wuffy68 » Fri Jun 06, 2014 6:58 am

Thanks for everyone's help ...

I'm concluding that manipulating AWS AMI images, volumes and snapshots for AWS spot instances is very unreliable (or I don't know what I'm doing).

After creating a new image of my current instance, AWS winds up auto-rebooting to a different image than my last one, with no rhyme or reason. I found that a lot of what I stated in the previous post is now just plain "bull honk".

I'm also concluding the system of backup and recovery for EC2 instances I've been attempting is not reliable enough to pursue further.

A better system would be to periodically pause the client and scp the FAH directories/files to a local directory (outside of AWS) ... when the instance dies, rebuild it and copy the latest files back in so work can continue where it left off on the new instance.
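
A rough sketch of that pause, copy, resume cycle, run from the local machine, might look like the following. Everything specific in it is an assumption: the host name, the `/var/lib/fahclient` data directory, and the `--send-pause` / `--send-unpause` options (check them against `FAHClient --help` on your install). The pause may also take a moment to reach a checkpoint, which this sketch does not wait for.

```python
import subprocess


def backup_commands(host, remote_dir, local_dir):
    """Build the three commands: pause remotely, copy down, unpause."""
    return [
        ["ssh", host, "FAHClient", "--send-pause"],        # assumed v7 option
        ["scp", "-r", f"{host}:{remote_dir}", local_dir],  # pull files out of AWS
        ["ssh", host, "FAHClient", "--send-unpause"],      # assumed v7 option
    ]


def run_backup(host, remote_dir, local_dir, runner=subprocess.run):
    """Run one backup cycle; always try to unpause, even if the copy fails."""
    pause, copy, unpause = backup_commands(host, remote_dir, local_dir)
    runner(pause, check=True)
    try:
        runner(copy, check=True)
    finally:
        runner(unpause, check=True)


if __name__ == "__main__":
    # Hypothetical host and paths; replace with your own before running.
    for cmd in backup_commands("ec2-user@example-host",
                               "/var/lib/fahclient", "./fah-backup"):
        print(" ".join(cmd))
```

Calling `run_backup(...)` from a scheduler every so often would give a recent copy to restore onto a rebuilt instance. The `runner` parameter exists only so the sequence can be dry-run without touching a real server.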

Apologies for the whirlwind of questions ... I'm going back to the drawing board on this one for a while :(
wuffy68
 
Posts: 150
Joined: Wed Jun 04, 2014 11:06 pm
Location: Roxborough, Colorado USA

Re: WUs not sent (are they gone?)

Postby 7im » Fri Jun 06, 2014 1:44 pm

P5-133XL wrote:Yes, WUs can be assigned multiple times to the same person/computer though it is not common. As long as it was assigned twice, you will get credit for it the second time too.


The correct answer is NO! A work unit is sent ONCE. If that work unit is completed before the deadline, the WU is not sent a second time for verification. Unlike BOINC projects, which waste at least half of your donated computing by requiring verification, FAH has built-in data error correction, so this is not needed.

There are two exceptions: when a WU errors out, it is assigned to another donor, and when a WU is not returned before the timeout deadline, it is sent to another donor. There may be other exceptions, but this is the standard policy.
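
The policy described above can be written down as a small decision function. This is purely an illustration of the rule as stated in this post, not the actual assignment server logic; the function and parameter names are hypothetical.

```python
from datetime import datetime


def should_reassign(returned_at, errored, deadline, now):
    """Return True if the WU should be sent to another donor.

    returned_at: datetime the result came back, or None if not yet returned.
    errored:     True if the donor's client reported an error for this WU.
    deadline:    the timeout deadline for the WU.
    now:         the current time.
    """
    if errored:
        return True                      # exception 1: the WU errored out
    if returned_at is None and now > deadline:
        return True                      # exception 2: timeout deadline missed
    return False                         # otherwise the WU is sent once only


deadline = datetime(2014, 6, 8, 0, 0)
print(should_reassign(datetime(2014, 6, 7, 12, 0), False, deadline,
                      datetime(2014, 6, 7, 13, 0)))
```

A result that comes back cleanly before the deadline is never sent out again for verification, which is the point being made here.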
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
7im
 
Posts: 14648
Joined: Thu Nov 29, 2007 4:30 pm
Location: Arizona

Re: WUs not sent (are they gone?)

Postby wuffy68 » Fri Jun 06, 2014 3:04 pm

7im wrote:The correct answer is NO! A work unit is sent ONCE.


Interesting - that's good to know (and really speaks to the value of F@H's system).
wuffy68
 
Posts: 150
Joined: Wed Jun 04, 2014 11:06 pm
Location: Roxborough, Colorado USA

Re: WUs not sent (are they gone?)

Postby wuffy68 » Sat Jun 07, 2014 8:49 am

wuffy68 wrote:I'm going back to the drawing board on this one for a while


This is a brief description of what works for me right now (for anyone interested in EC2):

https://foldingforum.org/viewtopic.php?f=55&t=26430#p265768

...
wuffy68
 
Posts: 150
Joined: Wed Jun 04, 2014 11:06 pm
Location: Roxborough, Colorado USA

Re: WUs not sent (are they gone?)

Postby bruce » Sat Jun 07, 2014 8:13 pm

wuffy68 wrote:
7im wrote:The correct answer is NO! A work unit is sent ONCE.


Interesting - that's good to know (and really speaks to the value of F@H's system).

The assumption that every WU has to be processed several times and the results voted on has never been part of FAH's design, although other DC projects are known to use that method (assuming that they will be hacked or results will be falsified or that home PCs are highly likely to be unstable ... or whatever).

The quantity of UN-researched folding work that still needs to be done far exceeds the collective capabilities of the donors' hardware, and Stanford can't afford to waste half of those capabilities if there are better ways to collect reliable results.

If an error is encountered, the WU is assigned to a second client but the chances of that being assigned to the same person are practically nil. In any case, the server knows it's a duplicate and it's credited independently. Sometimes errors are inherent in a particular WU and sometimes it's a donor-based issue (such as overclocking) and there's no easy way to distinguish the cause. WUs which repeatedly fail on different clients are assumed to be due to faulty data and are discarded.
bruce
 
Posts: 22623
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: WUs not sent (are they gone?)

Postby toTOW » Thu Sep 11, 2014 7:47 pm

Is BigAdv still active? If so, what are the current requirements?
Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.

FAH-Addict : latest news, tests and reviews about Folding@Home project.

toTOW
Site Moderator
 
Posts: 8766
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: WUs not sent (are they gone?)

Postby bollix47 » Thu Sep 11, 2014 7:58 pm

Yes, but only until Jan 31, 2015:
Minimum number of cores: 24. If you have only 24 and they are hyper-threaded, anything less than 2.67 GHz might not make the deadlines, and even 2.67 GHz won't make them on older technology like a dual Xeon 5650.

v7 options:
client-type bigadv
max-packet-size big
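
As a quick way to read the requirements above, here is a small sketch that classifies a machine using only the numbers given in this post. The function name and return strings are hypothetical, and it cannot capture the "older technology" caveat, so treat a passing result as a minimum rather than a guarantee.

```python
def bigadv_outlook(cores, hyperthreaded, clock_ghz):
    """Classify a machine against the bigadv minimums quoted in this post."""
    if cores < 24:
        return "not eligible"            # below the 24-core minimum
    if cores == 24 and hyperthreaded and clock_ghz < 2.67:
        return "may miss deadlines"      # only 24, hyper-threaded, and slow
    return "meets stated minimum"        # older CPUs may still miss deadlines


print(bigadv_outlook(24, True, 2.4))
print(bigadv_outlook(32, True, 2.2))
```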
bollix47
 
Posts: 3493
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada
