171.64.122.136 no response for 24hrs

Moderators: Site Moderators, FAHC Science Team

pete2020
Posts: 5
Joined: Tue Nov 11, 2008 3:06 pm

171.64.122.136 no response for 24hrs

Post by pete2020 »

Hi,
I looked at Bruce's note but can't connect either through F@H or directly with my browser. My firewall seems to be OK. My PC is continuing to simulate.

[11:23:19] Assembly optimizations on if available.
[11:23:19] Entering M.D.
[11:23:40] - Couldn't send HTTP request to server
[11:23:40] + Could not connect to Work Server (results)
[11:23:40] (171.64.122.136:8080)
[11:23:40] + Retrying using alternative port
[11:24:01] - Couldn't send HTTP request to server
[11:24:01] + Could not connect to Work Server (results)
[11:24:01] (171.64.122.136:80)
[11:24:01] - Error: Could not transmit unit 01 (completed November 12) to work server.


[11:24:01] + Attempting to send results [November 16 11:24:01 UTC]
[11:24:10] - Server does not have record of this unit. Will try again later.
[11:24:10] Could not transmit unit 01 to Collection server; keeping in queue.
[11:24:52] Protein: p896_p53longpeptides_GB
[11:24:52]
[11:24:52] Completed 800000 out of 5000000 steps (16%)
[11:38:19] Writing checkpoint files
[11:41:22] Writing local files
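
To separate a local firewall problem from a server outage, it can help to test the two ports the client tries (8080, then 80) outside of F@H. The following is a minimal sketch, not part of the F@H client: the function names `port_open` and `probe_ports` and the 5-second timeout are assumptions made for illustration.

```python
import socket


def port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def probe_ports(host, ports=(8080, 80), probe=port_open):
    """Try each port in order and report which ones accept connections.

    `probe` is injectable so the logic can be exercised without a network.
    """
    return {port: probe(host, port) for port in ports}


# Example (needs network access), using the server address from the log:
# print(probe_ports("171.64.122.136"))
```

If both ports report False here while other sites are reachable, the problem is more likely on the server side than in the local firewall.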
daveb

Re: 171.64.122.136 no response for 24hrs

Post by daveb »

One of my machines has been trying to return p2527 (Run 98, Clone 72, Gen 2) to 171.64.122.136 since Thursday evening. The server status page shows the server as DOWN, and the logs indicate this has been the case since about 4:20 PM PST on Thursday. The server being down is probably also why attempts to upload to a collection server (171.67.108.17) return the message "Server does not have record of this unit. Will try again later."

Dave
truder44
Posts: 16
Joined: Sun Mar 30, 2008 5:09 pm

Re: 171.64.122.136 no response for 24hrs

Post by truder44 »

Server 171.64.122.136
I am having the same problem returning results to this server: 18 failed uploads so far. (All other WUs completed during this time have uploaded their results successfully.)
[11:14:16] Folding@home Core Shutdown: FINISHED_UNIT
[11:14:19] CoreStatus = 64 (100)
[11:14:19] Unit 1 finished with 96 percent of time to deadline remaining.
[11:14:19] Updated performance fraction: 0.964391
[11:14:19] Sending work to server
[11:14:19] Project: 2527 (Run 60, Clone 51, Gen 2)


[11:14:19] + Attempting to send results [November 13 11:14:19 UTC]
[11:14:19] - Reading file work/wuresults_01.dat from core
[11:14:19] (Read 347906 bytes from disk)
[11:14:19] Connecting to http://171.64.122.136:8080/
[11:14:40] - Couldn't send HTTP request to server
[11:14:40] + Could not connect to Work Server (results)
[11:14:40] (171.64.122.136:8080)
[11:14:40] + Retrying using alternative port
[11:14:40] Connecting to http://171.64.122.136:80/
[11:15:01] - Couldn't send HTTP request to server
[11:15:01] + Could not connect to Work Server (results)
[11:15:01] (171.64.122.136:80)
[11:15:01] - Error: Could not transmit unit 01 (completed November 13) to work server.
[11:15:01] - 1 failed uploads of this unit.
[11:15:01] Keeping unit 01 in queue.
[11:15:01] Trying to send all finished work units
[11:15:01] Project: 2527 (Run 60, Clone 51, Gen 2)

~ ~ ~

[02:47:17] Trying to send all finished work units
[02:47:17] Project: 2527 (Run 60, Clone 51, Gen 2)


[02:47:17] + Attempting to send results [November 17 02:47:17 UTC]
[02:47:17] - Reading file work/wuresults_01.dat from core
[02:47:17] (Read 347906 bytes from disk)
[02:47:17] Connecting to http://171.64.122.136:8080/
[02:47:38] - Couldn't send HTTP request to server
[02:47:38] + Could not connect to Work Server (results)
[02:47:38] (171.64.122.136:8080)
[02:47:38] + Retrying using alternative port
[02:47:38] Connecting to http://171.64.122.136:80/
[02:47:59] - Couldn't send HTTP request to server
[02:47:59] + Could not connect to Work Server (results)
[02:47:59] (171.64.122.136:80)
[02:47:59] - Error: Could not transmit unit 01 (completed November 13) to work server.
[02:47:59] - 18 failed uploads of this unit.
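
For long logs like the one above, a small script can pull out the failed-upload count and the endpoints that refused the connection. This is only a sketch based on the line formats visible in these excerpts. The helper name `summarize_failures` is hypothetical, and other client versions may word these lines differently.

```python
import re

# Patterns taken from the log lines quoted in this thread.
FAILED_RE = re.compile(r"- (\d+) failed uploads of this unit")
ENDPOINT_RE = re.compile(r"\((\d{1,3}(?:\.\d{1,3}){3}):(\d+)\)")


def summarize_failures(log_text):
    """Return (highest failed-upload count, sorted list of (ip, port) endpoints)."""
    counts = [int(m.group(1)) for m in FAILED_RE.finditer(log_text)]
    # "(ip:port)" lines follow each "Could not connect" message in the log.
    endpoints = {(ip, int(port)) for ip, port in ENDPOINT_RE.findall(log_text)}
    return (max(counts) if counts else 0), sorted(endpoints)


sample = """\
[02:47:38] (171.64.122.136:8080)
[02:47:59] (171.64.122.136:80)
[02:47:59] - 18 failed uploads of this unit.
"""
print(summarize_failures(sample))
# prints (18, [('171.64.122.136', 80), ('171.64.122.136', 8080)])
```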

ppetrone
Pande Group Member
Posts: 115
Joined: Wed Dec 12, 2007 6:20 pm
Location: Stanford

Re: 171.64.122.136 no response for 24hrs

Post by ppetrone »

Yes, the server is down. Let me raise some alerts and/or try to fix it ASAP. Thanks, and sorry for the hassle.

Paula
VijayPande
Pande Group Member
Posts: 2058
Joined: Fri Nov 30, 2007 6:25 am
Location: Stanford

Re: 171.64.122.136 no response for 24hrs

Post by VijayPande »

This machine is physically down and it is late Sunday night, so our admin staff isn't in the office right now. They'll check it out Monday morning.
Prof. Vijay Pande, PhD
Departments of Chemistry, Structural Biology, and Computer Science
Chair, Biophysics
Director, Folding@home Distributed Computing Project
Stanford University
truder44
Posts: 16
Joined: Sun Mar 30, 2008 5:09 pm

Re: 171.64.122.136 no response for 24hrs

Post by truder44 »

truder44 wrote: Server 171.64.122.136
I am having the same problem returning results to this server: 18 failed uploads so far. (All other WUs completed during this time have uploaded their results successfully.)
I am now up to 31 failed uploads of my WU to this server, and it is showing as being in REJECT mode.
pete2020
Posts: 5
Joined: Tue Nov 11, 2008 3:06 pm

Re: 171.64.122.136 no response for 24hrs

Post by pete2020 »

No upload since Nov 12th.

[08:17:18] Project: 896 (Run 10, Clone 246, Gen 32)
[08:17:18]
[08:17:18] Assembly optimizations on if available.
[08:17:18] Entering M.D.
[08:17:20] - Couldn't send HTTP request to server
[08:17:20] + Could not connect to Work Server (results)
[08:17:20] (171.64.122.136:8080)
[08:17:20] + Retrying using alternative port
[08:17:22] - Couldn't send HTTP request to server
[08:17:22] + Could not connect to Work Server (results)
[08:17:22] (171.64.122.136:80)
[08:17:22] - Error: Could not transmit unit 01 (completed November 12) to work server.



[08:17:22] + Attempting to send results [November 19 08:17:22 UTC]
[08:17:25] Protein: p896_p53longpeptides_GB
[08:17:25]
[08:17:25] Completed 3150000 out of 5000000 steps (63%)
[08:17:31] - Server does not have record of this unit. Will try again later.
[08:17:31] Could not transmit unit 01 to Collection server; keeping in queue.
[08:32:18] Writing checkpoint files
ed
Pande Group Member
Posts: 13
Joined: Sun Dec 02, 2007 6:37 pm

Re: 171.64.122.136 no response for 24hrs

Post by ed »

Hi guys,

Of course that machine had to go down while I was out of town. It looks like after the restart the machine only listened on port 8080, so I have fixed that just now and hope this resolves all issues.

Sorry for that,
Ed