Project: 6901 (Run 4, Clone 0, Gen 264)

Moderators: Site Moderators, FAHC Science Team

bollix47
Posts: 2942
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Project: 6901 (Run 4, Clone 0, Gen 264)

Post by bollix47 »

FYI

Another 512-byte WU.

Code:

[13:11:55] + Processing work unit
[13:11:55] Core required: FahCore_a5.exe
[13:11:55] Core found.
[13:11:55] Working on queue slot 04 [April 30 13:11:55 UTC]
[13:11:55] + Working ...
[13:11:55] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 04 -np 64 -priority 96 -checkpoint 30 -verbose -lifeline 5837 -version 634'

[13:11:55] 
[13:11:55] *------------------------------*
[13:11:55] Folding@Home Gromacs SMP Core
[13:11:55] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[13:11:55] 
[13:11:55] Preparing to commence simulation
[13:11:55] - Looking at optimizations...
[13:11:55] - Created dyn
[13:11:55] - Files status OK
[13:11:55] Couldn't Decompress
[13:11:55] Called DecompressByteArray: compressed_data_size=0 data_size=0, decompressed_data_size=0 diff=0
[13:11:55] -Error: Couldn't update checksum variables
[13:11:55] Error: Could not open work file
[13:11:55] 
[13:11:55] Folding@home Core Shutdown: FILE_IO_ERROR
[13:11:55] CoreStatus = 75 (117)
[13:11:55] Error opening or reading from a file.
[13:11:55] Deleting current work unit & continuing...
[13:11:55] Trying to send all finished work units
[13:11:55] + No unsent completed units remaining.
[13:11:55] - Preparing to get new work unit...
[13:11:55] Cleaning up work directory
[13:11:55] + Attempting to get work packet
[13:11:55] Passkey found
[13:11:55] - Will indicate memory of 32233 MB
[13:11:55] - Connecting to assignment server
[13:11:55] Connecting to http://assign.stanford.edu:8080/
[13:11:55] Posted data.
[13:11:55] Initial: ED82; - Successful: assigned to (130.237.232.237).
[13:11:55] + News From Folding@Home: Welcome to Folding@Home
[13:11:55] Loaded queue successfully.
[13:11:55] Sent data
[13:11:55] Connecting to http://130.237.232.237:8080/
[13:11:56] Posted data.
[13:11:56] Initial: 0000; - Receiving payload (expected size: 512)
[13:11:56] Conversation time very short, giving reduced weight in bandwidth avg
[13:11:56] - Downloaded at ~1 kB/s
[13:11:56] - Averaged speed for that direction ~12 kB/s
[13:11:56] + Received work.
[13:11:56] + Closed connections
[13:12:01] 
[13:12:01] + Processing work unit
[13:12:01] Core required: FahCore_a5.exe
[13:12:01] Core found.
[13:12:01] Working on queue slot 05 [April 30 13:12:01 UTC]
[13:12:01] + Working ...
[13:12:01] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 05 -np 64 -priority 96 -checkpoint 30 -verbose -lifeline 5837 -version 634'

[13:12:01] 
[13:12:01] *------------------------------*
[13:12:01] Folding@Home Gromacs SMP Core
[13:12:01] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[13:12:01] 
[13:12:01] Preparing to commence simulation
[13:12:01] - Looking at optimizations...
[13:12:01] - Created dyn
[13:12:01] - Files status OK
[13:12:01] Couldn't Decompress
[13:12:01] Called DecompressByteArray: compressed_data_size=0 data_size=0, decompressed_data_size=0 diff=0
[13:12:01] -Error: Couldn't update checksum variables
[13:12:01] Error: Could not open work file
[13:12:01] 
[13:12:01] Folding@home Core Shutdown: FILE_IO_ERROR
[13:12:01] CoreStatus = 75 (117)
[13:12:01] Error opening or reading from a file.
[13:12:01] Deleting current work unit & continuing...
[13:12:01] Trying to send all finished work units
[13:12:01] + No unsent completed units remaining.
[13:12:01] - Preparing to get new work unit...
[13:12:01] Cleaning up work directory
[13:12:01] + Attempting to get work packet
[13:12:01] Passkey found
[13:12:01] - Will indicate memory of 32233 MB


ChelseaOilman
Posts: 1037
Joined: Sun Dec 02, 2007 3:47 pm
Location: Colorado @ 10,000 feet

Re: Project: 6901 (Run 4, Clone 0, Gen 264)

Post by ChelseaOilman »

Both of my 4P servers ran into the same issue last night. It's getting annoying.
bollix47
Posts: 2942
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: Project: 6901 (Run 4, Clone 0, Gen 264)

Post by bollix47 »

Yes, it is.

My log only shows one instance, but in fact it was repeated over and over (more than 50 times before I stopped it), and the server/client didn't appear to want to move on.
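
For anyone who wants to check how many times their own client has looped, here is a minimal sketch that tallies the "Core Shutdown" lines in a pasted log. It assumes the line format shown in the log above; the function name `count_core_shutdowns` and the sample text are only illustrative, not part of the client.

```python
import re

# Matches lines such as "[13:11:55] Folding@home Core Shutdown: FILE_IO_ERROR"
# (format assumed from the log excerpt in this thread).
SHUTDOWN_RE = re.compile(
    r"^\[(\d{2}:\d{2}:\d{2})\] Folding@home Core Shutdown: (\w+)",
    re.MULTILINE,
)

def count_core_shutdowns(log_text):
    """Return a dict mapping each shutdown status to how often it appears."""
    counts = {}
    for _timestamp, status in SHUTDOWN_RE.findall(log_text):
        counts[status] = counts.get(status, 0) + 1
    return counts

# Hypothetical sample: two failed attempts, six seconds apart.
sample = """[13:11:55] Folding@home Core Shutdown: FILE_IO_ERROR
[13:11:55] CoreStatus = 75 (117)
[13:12:01] Folding@home Core Shutdown: FILE_IO_ERROR
[13:12:01] CoreStatus = 75 (117)
"""

print(count_core_shutdowns(sample))
```

To run it against a real log, read the file contents into a string first and pass that in place of `sample`.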
ChelseaOilman
Posts: 1037
Joined: Sun Dec 02, 2007 3:47 pm
Location: Colorado @ 10,000 feet

Re: Project: 6901 (Run 4, Clone 0, Gen 264)

Post by ChelseaOilman »

I had one client hammering the bigadv server for about half an hour, and the other for an hour and a half, before each of them got a p6900 WU. On my connection it takes about 6 seconds to cycle through each of these bad WUs, so about 600 times per hour. That can't be good for the load on the bigadv server.
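
The arithmetic behind that estimate, as a rough sketch: the 6-second cycle and the two durations are the figures quoted above, while the helper names are made up for illustration, and the totals assume the cycle time stayed constant for the whole period.

```python
def requests_per_hour(seconds_per_cycle):
    """Fetch attempts per hour if every attempt takes seconds_per_cycle."""
    return 3600 / seconds_per_cycle

def total_requests(seconds_per_cycle, hours):
    """Fetch attempts over a period, assuming a constant cycle time."""
    return requests_per_hour(seconds_per_cycle) * hours

rate = requests_per_hour(6)            # one bad WU every ~6 seconds
first_client = total_requests(6, 0.5)  # about half an hour of retries
second_client = total_requests(6, 1.5) # about an hour and a half of retries

print(rate, first_client, second_client, first_client + second_client)
```

Going by the log, each cycle also contacts the assignment server before the work server, so the number of connections made would be roughly double the number of WU fetches.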

Code:

[14:00:11] Connecting to http://130.237.232.237:8080/
[14:00:11] Posted data.
[14:00:11] Initial: 0000; - Receiving payload (expected size: 512)
[14:00:11] Conversation time very short, giving reduced weight in bandwidth avg
[14:00:11] - Downloaded at ~1 kB/s
[14:00:11] - Averaged speed for that direction ~47 kB/s
[14:00:11] + Received work.
[14:00:11] + Closed connections
[14:00:16] 
[14:00:16] + Processing work unit
[14:00:16] Core required: FahCore_a5.exe
[14:00:16] Core found.
[14:00:16] Working on queue slot 04 [April 30 14:00:16 UTC]
[14:00:16] + Working ...
[14:00:16] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 04 -np 48 -checkpoint 15 -forceasm -verbose -lifeline 2559 -version 634'

[14:00:16] 
[14:00:16] *------------------------------*
[14:00:16] Folding@Home Gromacs SMP Core
[14:00:16] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[14:00:16] 
[14:00:16] Preparing to commence simulation
[14:00:16] - Assembly optimizations manually forced on.
[14:00:16] - Not checking prior termination.
[14:00:16] Couldn't Decompress
[14:00:16] Called DecompressByteArray: compressed_data_size=0 data_size=0, decompressed_data_size=0 diff=0
[14:00:16] -Error: Couldn't update checksum variables
[14:00:16] Error: Could not open work file
[14:00:16] 
[14:00:16] Folding@home Core Shutdown: FILE_IO_ERROR
[14:00:17] CoreStatus = 75 (117)
[14:00:17] Error opening or reading from a file.
[14:00:17] Deleting current work unit & continuing...
[14:00:17] Trying to send all finished work units
[14:00:17] + No unsent completed units remaining.
[14:00:17] - Preparing to get new work unit...
[14:00:17] Cleaning up work directory
[14:00:17] + Attempting to get work packet
[14:00:17] Passkey found
[14:00:17] - Will indicate memory of 64553 MB
[14:00:17] - Connecting to assignment server
[14:00:17] Connecting to http://assign.stanford.edu:8080/
[14:00:17] Posted data.
[14:00:17] Initial: ED82; - Successful: assigned to (130.237.232.237).
[14:00:17] + News From Folding@Home: Welcome to Folding@Home
[14:00:17] Loaded queue successfully.
[14:00:17] Sent data
[14:00:17] Connecting to http://130.237.232.237:8080/
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project: 6901 (Run 4, Clone 0, Gen 264)

Post by bruce »

This pesky FILE_IO_ERROR problem hasn't been fully diagnosed yet. Do you have any idea whether upgrading to V7 would fix it? I know several similar issues are fixed in V7, but not necessarily this exact one.