Folding 4 Lost Data {NO}[SOLVED]

Postby new08 » Mon Feb 04, 2008 12:18 pm

Last week I struggled to complete a slow unit that took 5 days to run.
I lost the taskbar icon on the way and eventually my PC froze, but I was determined to save 4 days' work and let it run in the background, as it had been doing until then.
[19:45:52] Loaded queue successfully.
[19:45:52] Initialization complete
[19:45:52] + Benchmarking ...
[19:45:58]
[19:45:58] + Processing work unit
[19:45:58] Core required: FahCore_78.exe
[19:45:58] Core found.

[19:45:58] Working on Unit 08 [February 3 19:45:58]
[19:45:58] + Working ...
[19:46:15]
[19:46:15] *------------------------------*
[19:46:15] Folding@Home Gromacs Core
[19:46:15] Version 1.90 (March 8, 2006)
[19:46:16]
[19:46:16] Preparing to commence simulation
[19:46:16] - Looking at optimizations...
[19:46:17] - Created dyn
[19:46:17] - Files status OK
[19:46:17]
[19:46:17] Folding@home Core Shutdown: MISSING_WORK_FILES
[19:46:22] CoreStatus = 74 (116)
[19:46:22] The core could not find the work files specified. Removing from queue
[19:46:22] Deleting current work unit & continuing...
[19:46:29] - Preparing to get new work unit...
[19:46:29] + Attempting to get work packet
[19:46:29] - Connecting to assignment server

The results didn't upload, as shown above, but the file:
wuresults_08.dat (627KB)
is still in the work folder, with no other '08' files from that run.

Extract from data showing unit 08 finished OK ...

(Mnbf/s) (MFlops) (ps/NODE hour) (NODE hour/ns)
Performance: 19.670 755.359 70.571 14.170
Finished mdrun on node 0 Sun Feb 03 15:58:53 2008

Is this data file valid, and if so, can it be sent in any way?

I suspect not, and maybe that unit has already been reallocated...
But many people will have similar glitches along the way, and it's good to see some feedback!
new08
 
Posts: 342
Joined: Fri Jan 04, 2008 11:02 pm
Location: England
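
The client log in the post above stops at a core shutdown line. As a minimal sketch, assuming the log format shown there (a `[hh:mm:ss]` prefix followed by the message), the following Python pulls such lines out of a longer log. The function name `find_core_shutdowns` is hypothetical and is not part of the Folding@Home client.

```python
import re

# Matches lines such as:
# [19:46:17] Folding@home Core Shutdown: MISSING_WORK_FILES
SHUTDOWN = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\].*Core Shutdown:\s*(\S+)")


def find_core_shutdowns(log_text):
    """Return (time, reason) pairs for every 'Core Shutdown' line in the log."""
    shutdowns = []
    for line in log_text.splitlines():
        match = SHUTDOWN.match(line)
        if match:
            shutdowns.append((match.group(1), match.group(2)))
    return shutdowns


# Sample lines copied from the log posted in this thread.
sample = """\
[19:46:17] - Files status OK
[19:46:17]
[19:46:17] Folding@home Core Shutdown: MISSING_WORK_FILES
[19:46:22] CoreStatus = 74 (116)
"""

print(find_core_shutdowns(sample))
```

This only reports what the log already says; it does not repair anything.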

Re: Folding 4 Lost Data

Postby bruce » Mon Feb 04, 2008 8:21 pm

new08 wrote: Is this data file valid, and if so, can it be sent in any way?

I suspect not, and maybe that unit has already been reallocated...
But many people will have similar glitches along the way, and it's good to see some feedback!


Even if the WU has already been reassigned, it may be possible to get credit for it by using "qfix" from a DOS window in the folding directory. I don't know if it's valid or not but the Pande Group will decide if you can upload it.
bruce
 
Posts: 21273
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Folding 4 Lost Data

Postby new08 » Sun Feb 10, 2008 1:22 pm

using "qfix" from a DOS window

I tried that, but it flashed by too quickly. The window closed immediately.
I assume it needs to be posted somehow. Email doesn't seem to be available on either site and, as you say, Pande would need to comment first on whether the data is any good.
It seems a lot of file, but I notice archive files, plus others, are in a typical finished session batch, so maybe it's best to forget it... like the display issue!

Re: Folding 4 Lost Data

Postby toTOW » Mon Feb 11, 2008 11:53 am

Run qfix from a Command Line Window ... you'll be able to see what it displays.
Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.

FAH-Addict : latest news, tests and reviews about Folding@Home project.

toTOW
Site Moderator
 
Posts: 8914
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France
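
If the console window closes before the output can be read, another option is to capture the output from a script. The following is a minimal Python sketch, not something the posters used: `run_and_capture` is a hypothetical helper, and the example call uses a stand-in command so it runs on any machine. On the folding machine the command would presumably be the full path to qfix.exe in the client folder.

```python
import subprocess
import sys


def run_and_capture(command, working_dir=None):
    """Run a console program and return (exit code, combined stdout/stderr)."""
    completed = subprocess.run(
        command,
        cwd=working_dir,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold error messages into the same text
        text=True,
    )
    return completed.returncode, completed.stdout


# Stand-in command so the sketch is runnable anywhere. For qfix, an assumed
# equivalent would be:
#   run_and_capture([r"C:\Program Files\Folding@Home\qfix.exe"],
#                   working_dir=r"C:\Program Files\Folding@Home")
code, output = run_and_capture(
    [sys.executable, "-c", "print('entry 8, status 2')"]
)
print(code)
print(output, end="")
```

Passing the full path to the executable avoids relying on how Windows searches for programs, and setting the working directory lets the tool find files it expects next to it.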

Re: Folding 4 Lost Data

Postby new08 » Mon Feb 11, 2008 1:03 pm

OK, that shows: Can't open <queue.dat> file.
Presumably that is because it is now running another unit with different associated files.
The data in the one remaining file seems substantial, but it probably needs those other, missing files to run with qfix.
I think that old, hard-won unit will have to drift into the long grass!

Re: Folding 4 Lost Data

Postby toTOW » Tue Feb 12, 2008 1:14 am

Did you put qfix.exe in the same folder as your folding client?

Re: Folding 4 Lost Data

Postby new08 » Tue Feb 12, 2008 10:41 am

OK toTOW, thanks. Here's the result:
C:\Program Files\Folding@Home>qfix
entry 4, status 0, address 171.64.122.139:8080
entry 5, status 0, address 130.49.240.81:8080
entry 6, status 0, address 171.64.65.58:8080
entry 7, status 0, address 171.64.65.58:8080
entry 8, status 2, address 171.64.65.102:8080
Found results <work\wuresults_08.dat>: proj 3408, run 3, clone 708, gen 2
-- queue entry: proj 3408, run 3, clone 708, gen 2
-- increasing packet limit from 0 to 1642360
entry 9, status 1642360, address 171.64.122.70:8080
entry 0, status 0, address 171.64.122.70:8080
entry 1, status 0, address 130.49.240.81:8080
entry 2, status 0, address 130.49.240.81:8080
entry 3, status 1, address 171.64.122.70:8080
File needed repair. Errors fixed: 1.

I'll leave it like that and see if it goes off with the next unit!!

Re: Folding 4 Lost Data

Postby toTOW » Tue Feb 12, 2008 11:39 am

It recovered your result files. You can force them to be sent by restarting the client, or you can wait for the automatic send to do its job.
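
For anyone who wants to check a long qfix listing for recovered results without reading every entry, here is a minimal Python sketch. It assumes the output format pasted earlier in this thread; the names `recovered_results` and `needed_repair` are hypothetical and not part of qfix.

```python
import re

# Matches lines such as:
# Found results <work\wuresults_08.dat>: proj 3408, run 3, clone 708, gen 2
FOUND = re.compile(
    r"Found results <(?P<path>[^>]+)>: "
    r"proj (?P<proj>\d+), run (?P<run>\d+), "
    r"clone (?P<clone>\d+), gen (?P<gen>\d+)"
)


def recovered_results(qfix_output):
    """List every results file that qfix reports finding."""
    results = []
    for line in qfix_output.splitlines():
        match = FOUND.search(line)
        if match:
            results.append({
                "path": match.group("path"),
                "proj": int(match.group("proj")),
                "run": int(match.group("run")),
                "clone": int(match.group("clone")),
                "gen": int(match.group("gen")),
            })
    return results


def needed_repair(qfix_output):
    """True if qfix reported that it had to repair the queue file."""
    return "File needed repair" in qfix_output


# Sample lines copied from the qfix output posted in this thread.
sample = r"""entry 8, status 2, address 171.64.65.102:8080
Found results <work\wuresults_08.dat>: proj 3408, run 3, clone 708, gen 2
-- queue entry: proj 3408, run 3, clone 708, gen 2
-- increasing packet limit from 0 to 1642360
entry 9, status 1642360, address 171.64.122.70:8080
File needed repair. Errors fixed: 1.
"""

print(recovered_results(sample))
print(needed_repair(sample))
```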

Re: Folding 4 Lost Data

Postby new08 » Tue Feb 12, 2008 3:13 pm

Thanks toTOW, a good result after all, then!
I hope what must be a regular query stays up for others here...
I gather the old Win98 Ver 4 queries were removed in the forum upgrade recently.

Re: Folding 4 Lost Data

Postby new08 » Wed Feb 13, 2008 11:16 am

Final position on this query: 388 points were allocated while the current unit was running, so it was worth the effort!
Cheers !

Re: Folding 4 Lost Data

Postby new08 » Tue May 27, 2008 3:28 pm

Something of a re-run of the earlier problem.
Qfix seems to repair the file, but it subsequently fails to upload.
Unit 5 is still waiting, unit 6 is amended (deletes 2127692) at the time of the re-attempt, and the current unit is now 8.
After a new re-send failure, the queue file seems to need repair again and the sequence repeats itself.
I've tried a number of restarts, but the result is always the same.
Is this corrupted data, a server issue, or both?
Here's the data from qfix and the re-send attempt.

entry 5, status 2, address 171.65.103.157:80
  Found results <work\wuresults_05.dat>: proj 2582, run 4, clone 1, gen 11
   -- queue entry: proj 2582, run 4, clone 1, gen 11
   -- increasing packet limit from 0 to 2127692
entry 6, status 2127692, address 171.64.122.139:8080
entry 7, status 1, address 171.64.122.72:8080
File needed repair.  Errors fixed: 1.

C:\Program Files\Folding@Home>qfix
entry 8, status 0, address 169.230.26.30:8080
entry 9, status 0, address 130.49.240.81:8080
entry 0, status 0, address 130.49.240.81:8080
entry 1, status 0, address 130.49.240.81:8080
entry 2, status 0, address 169.230.26.30:8080
entry 3, status 0, address 169.230.26.30:8080
entry 4, status 0, address 130.49.240.81:8080
entry 5, status 2, address 171.65.103.157:80
  Found results <work\wuresults_05.dat>: proj 2582, run 4, clone 1, gen 11
   -- queue entry: proj 2582, run 4, clone 1, gen 11
   -- already queued for upload
entry 6, status 2127692, address 171.64.122.139:8080
entry 7, status 1, address 171.64.122.72:8080
File is OK


[08:51:29] + Attempting to send results
[08:51:30]
[08:51:30] *------------------------------*
[08:51:30] Folding@Home Gromacs Simulated Tempering Core
[08:51:30] Version 1.10 (Oct 4, 2007)
[08:51:30]
[08:51:30] Preparing to commence simulation
[08:51:30] - Looking at optimizations...
[08:51:30] - Files status OK
[08:51:30] - Expanded 239441 -> 1169207 (decompressed 488.3 percent)
[08:51:30]
[08:51:30] Project: 4401 (Run 1, Clone 7, Gen 196)
[08:51:30]
[08:51:30] Assembly optimizations on if available.
[08:51:30] Entering M.D.
[08:51:33] Couldn't send HTTP request to server (wininet)
[08:51:33] + Could not connect to Work Server (results)
[08:51:33]     (171.65.103.157:80)
[08:51:33] - Error: Could not transmit unit 05 (completed May 27). Keeping unit in queue.
[08:51:33] + Working...
[08:51:50] (Starting from checkpoint)
[08:51:50] Protein: p4401_Seq_41_unf_AMBER
[08:51:50]
[08:51:50] Writing local files
[08:54:08] Extra SSE boost OK.
[08:54:08] Writing local files
[08:54:08] Completed 0 out of 150000 steps  (0)
[08:56:27] Writing local files
[08:56:27] Completed 1500 out of 150000 steps  (1)
[08:58:45] Writing local files
[08:58:45] Completed 3000 out of 150000 steps  (2)

Unit still running ok.
Any ideas as to what's happened - and whether it will self correct or cause further data loss?
Last edited by new08 on Fri Oct 29, 2010 11:48 am, edited 1 time in total.

Re: Folding 4 Lost Data

Postby bruce » Tue May 27, 2008 7:31 pm

If you look at http://fah-web.stanford.edu/serverstat.html you'll see that server 171.65.103.157 is currently rejecting connections so nothing can be uploaded to it until somebody fixes the server. All you can do is wait.

I'll notify the researcher who owns this project and server.
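
To find which address to look up on the server status page, the failed work server can be read straight from the client log. Below is a minimal Python sketch, assuming the log layout posted above, where the address appears in parentheses on the line after "Could not connect to Work Server". The function name `failed_work_servers` is hypothetical.

```python
import re

# Matches an address in parentheses, e.g. "(171.65.103.157:80)".
ADDRESS = re.compile(r"\((\d{1,3}(?:\.\d{1,3}){3}):(\d+)\)")


def failed_work_servers(log_text):
    """Return (ip, port) for each work server the client failed to reach."""
    lines = log_text.splitlines()
    failures = []
    # Stop one line early because the address is on the following line.
    for index, line in enumerate(lines[:-1]):
        if "Could not connect to Work Server" in line:
            match = ADDRESS.search(lines[index + 1])
            if match:
                failures.append((match.group(1), int(match.group(2))))
    return failures


# Sample lines copied from the client log posted in this thread.
sample = """\
[08:51:33] Couldn't send HTTP request to server (wininet)
[08:51:33] + Could not connect to Work Server (results)
[08:51:33]     (171.65.103.157:80)
[08:51:33] - Error: Could not transmit unit 05 (completed May 27). Keeping unit in queue.
"""

print(failed_work_servers(sample))
```

The sketch only extracts the address; whether that server is accepting connections still has to be checked on the status page.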

Re: Folding 4 Lost Data

Postby new08 » Wed May 28, 2008 10:28 am

Thanks Bruce, that was the case, and the data went through as soon as the server involved was back.
It looked like a few days' back-up was there, so it was just as well I flagged it, though now I will check the servers if it happens again!
I was misled by the repeating error on using qfix, but of course that doesn't include 'server fix' in its remit! :)

