Stanford Network Issue {Resolved}

Postby PantherX » Sun Dec 11, 2011 6:51 am

Please note that several F@H servers are down due to technical reasons. For details, please read this -> http://folding.typepad.com/news/2011/12 ... ord-1.html
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time

PantherX
Super Moderator
 
Posts: 6273
Joined: Wed Dec 23, 2009 9:33 am

Re: Stanford Network Issue

Postby tear » Sun Dec 11, 2011 11:03 am

Stats haven't been updated for 6+ hours either.
One man's ceiling is another man's floor.
tear
 
Posts: 924
Joined: Sun Dec 02, 2007 4:08 am
Location: Rocky Mountains

Re: Stanford Network Issue

Postby PantherX » Sun Dec 11, 2011 2:39 pm

Here is an update:
UPDATE 4:30am Pacific Time: Chilled water came back on line at 11am, but several of our servers are still down. Our sysadmins will work to get them back up, but it may not be until Monday, depending on their availability on Sunday.
PantherX
Super Moderator
 
Posts: 6273
Joined: Wed Dec 23, 2009 9:33 am

Re: Stanford Network Issue

Postby Ordinant » Sun Dec 11, 2011 2:58 pm

Sunday 11Dec11 is turning out to be a good day to do some long-postponed maintenance on several of my folding machines.
Ordinant
 
Posts: 4
Joined: Wed Oct 19, 2011 3:01 am

Re: Stanford Network Issue

Postby mattozan » Sun Dec 11, 2011 6:12 pm

I do hope work done in the last 12 hours was received and will count. My rig did continue receiving, crunching and submitting WUs all night. I assume that means that enough infrastructure was still running to maintain client communication and queue completed work.
mattozan
 
Posts: 6
Joined: Tue Dec 07, 2010 7:59 pm

Re: Stanford Network Issue

Postby Joe_H » Sun Dec 11, 2011 6:52 pm

Well, it looks like some of the machines are back up, judging by what serverstats is showing. As for whether work was received, that depends on which work server it came from. For instance, I have a Project 6026 WU on one machine that finished just after this problem started yesterday. It is waiting to upload, and should do so later today since its work server is back up. It could not go to a collection server because the one for this project is not functioning. Some other projects do have a functioning collection server to fall back on if the work server is not available. For other people it will depend on their specific workload, but eventually most if not all work should get collected and credited.
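
A rough sketch of the fallback described above: try the work server first, then a collection server if the project has one, and otherwise keep the finished WU queued. This is only an illustration, not the actual client code. `upload_result`, the `send` callable, and the host names are hypothetical, and the port order (8080, then 80) is an assumption based on the client log posted further down in this thread.

```python
from typing import Callable, Optional

# Hypothetical transport: returns True when (host, port) accepts the upload.
Send = Callable[[str, int, bytes], bool]

PORTS = (8080, 80)  # assumption: primary port first, then the alternative port


def upload_result(payload: bytes, work_server: str,
                  collection_server: Optional[str], send: Send) -> str:
    """Try the work server, then the collection server (if the project has one).

    Returns the host that accepted the payload, or "queued" if none did.
    """
    hosts = [work_server]
    if collection_server is not None:
        hosts.append(collection_server)
    for host in hosts:
        for port in PORTS:
            if send(host, port, payload):
                return host
    return "queued"  # keep the finished WU locally and retry later


# Usage with a fake transport in which the work server is down.
down = {"ws.example"}
fake_send = lambda host, port, data: host not in down
print(upload_result(b"wu", "ws.example", "cs.example", fake_send))  # cs.example
print(upload_result(b"wu", "ws.example", None, fake_send))          # queued
```

The second call corresponds to the Project 6026 case: with no working collection server, the result simply waits until the work server returns.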

As for continuing work, once the Project 60nn server went offline, my machines started downloading from a server elsewhere at Stanford, getting Project 8001 WUs and being able to return them. But I will have to wait for the stats servers to come back before I can see the points awarded. For others, if their configuration was only eligible for WUs from the affected servers, then they just stopped folding for the night once they needed a new WU.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook 2.4 C2D 4 GB smp2, PowerMac G4 dual 1.73 1.5 GB (ret.)
Joe_H
Site Moderator
 
Posts: 1894
Joined: Tue Apr 21, 2009 4:41 pm

Re: Stanford Network Issue

Postby Napoleon » Sun Dec 11, 2011 9:11 pm

It seems I haven't received credit for these three classic WUs (server 171.67.108.53):

Napoleon, team _Solo_ (191980)
project:6892 run:262 clone:8 gen:64
project:6892 run:710 clone:7 gen:45
project:6892 run:158 clone:9 gen:44
Code:
*snip*: Date 2011-12-11
07:38:12:Sending unit results: id:05 state:SEND error:OK project:6892 run:262 clone:8 gen:64 core:0x78 unit:0x000000486652edc54e25cc4a185a1f21
07:38:12:Unit 05: Uploading 1.01MiB to 171.67.108.53
07:38:12:Connecting to 171.67.108.53:8080
07:38:22:Unit 05: Upload complete
07:38:22:Server responded WORK_ACK (400)
07:38:22:Final credit estimate, 136.00 points
07:38:22:Cleaning up Unit 05

*snip*: Date 2011-12-11
09:33:27:Sending unit results: id:06 state:SEND error:OK project:6892 run:710 clone:7 gen:45 core:0x78 unit:0x000000366652edc54e25d4d8ce114bb2
09:33:27:Unit 06: Uploading 1.01MiB to 171.67.108.53
09:33:27:Connecting to 171.67.108.53:8080
09:33:36:Unit 06: Upload complete
09:33:36:Server responded WORK_ACK (400)
09:33:36:Final credit estimate, 136.00 points
09:33:36:Cleaning up Unit 06

*snip*: Date 2011-12-11
13:11:20:Sending unit results: id:01 state:SEND error:OK project:6892 run:158 clone:9 gen:44 core:0x78 unit:0x0000002e6652edc54e25ca53ffa1dfc7
13:11:20:Unit 01: Uploading 995.87KiB to 171.67.108.53
13:11:20:Connecting to 171.67.108.53:8080
13:11:29:Unit 01: Upload complete
13:11:29:Server responded WORK_ACK (400)
13:11:29:Final credit estimate, 136.00 points
13:11:29:Cleaning up Unit 01


For comparison purposes, a GPU WU did receive credit (server 171.64.65.105):

Zotac430, team _Solo_ (191980)
project:7622 run:281 clone:0 gen:2
Code:
*snip*: Date 2011-12-11
00:21:11:Sending unit results: id:00 state:SEND error:OK project:7622 run:281 clone:0 gen:2 core:0x15 unit:0x00000002664f2dd14edd583af4a6ea08
00:21:11:Unit 00: Uploading 804.87KiB to 171.64.65.105
00:21:11:Connecting to 171.64.65.105:8080
00:21:22:Unit 00: Upload complete
00:21:22:Server responded WORK_ACK (400)
00:21:22:Final credit estimate, 5187.00 points
00:21:22:Cleaning up Unit 00
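
For anyone wanting to compare uploads against credited WUs, here is a minimal sketch that pulls the project/run/clone/gen, the server IP, and the estimated points out of log lines like the ones above. The field layout is inferred only from these excerpts, and `summarize_uploads` is a hypothetical helper, not part of the FAH client.

```python
import re
from typing import Dict, List

# Patterns inferred from the v7 log excerpts in this post (an assumption).
SEND_RE = re.compile(
    r"project:(?P<project>\d+) run:(?P<run>\d+) "
    r"clone:(?P<clone>\d+) gen:(?P<gen>\d+)"
)
SERVER_RE = re.compile(r"Uploading .* to (?P<ip>[\d.]+)$")
CREDIT_RE = re.compile(r"Final credit estimate, (?P<points>[\d.]+) points")


def summarize_uploads(log_text: str) -> List[Dict[str, object]]:
    """Return one record per uploaded WU: PRCG, server IP, estimated points."""
    records: List[Dict[str, object]] = []
    current = None
    for line in log_text.splitlines():
        line = line.strip()
        if "Sending unit results" in line:
            m = SEND_RE.search(line)
            if m:
                # A new upload starts; later lines fill in server and points.
                current = {k: int(v) for k, v in m.groupdict().items()}
                records.append(current)
        elif current is not None:
            m = SERVER_RE.search(line)
            if m:
                current["server"] = m.group("ip")
                continue
            m = CREDIT_RE.search(line)
            if m:
                current["points"] = float(m.group("points"))
    return records


sample = """\
07:38:12:Sending unit results: id:05 state:SEND error:OK project:6892 run:262 clone:8 gen:64 core:0x78 unit:0x000000486652edc54e25cc4a185a1f21
07:38:12:Unit 05: Uploading 1.01MiB to 171.67.108.53
07:38:22:Final credit estimate, 136.00 points
"""
for record in summarize_uploads(sample):
    print(record)
```

Note that "Final credit estimate" is only the client's estimate; whether the points are actually credited depends on the stats servers.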
Win7 64bit, FAH v7, OC'd
2C/4T Atom330 3x685MHz - GT430 2x810MHz - ION1 9400M 3x466.7MHz
4x OGR27 - Core_15 - display & Core_11

Mint 14 64bit, FAH v7 & v6, at stock
2C/2T E2220 - 9400GT
cpu:2 Core_a? - display @ Core_11 (v6 WINE)
Napoleon
 
Posts: 1080
Joined: Wed May 26, 2010 2:31 pm
Location: Finland

Re: Stanford Network Issue

Postby mattozan » Sun Dec 11, 2011 9:56 pm

I haven't seen any credit since prior to the server trouble. But my GPUs continue to receive, crunch and submit work. Here's the edited log of one of my GPUs.

You can see where it ran into trouble trying to upload a completed WU at 02:30 to 171.67.108.26.

But then it uploaded OK with the next completed WU about two hours later (04:32 in the log). After that things have seemed to go OK.

But my "Date of last work unit" is still stuck on "2011-12-10 16:02:18"

Code:
[23:40:47] + Attempting to send results [December 10 23:40:47 UTC]
[23:40:47] Gpu type=3 species=21.
[23:40:48] + Results successfully sent
[23:40:48] Thank you for your contribution to Folding@Home.
[23:40:48] + Number of Units Completed: 99
[23:40:52] - Preparing to get new work unit...
[23:40:52] Cleaning up work directory
[23:40:52] + Attempting to get work packet
[23:40:52] Passkey found
[23:40:52] Gpu type=3 species=21.
[23:40:52] - Connecting to assignment server
[23:40:52] - Successful: assigned to (171.64.65.64).


[02:30:19] + Attempting to send results [December 11 02:30:19 UTC]
[02:30:19] Gpu type=3 species=21.
[02:30:20] - Couldn't send HTTP request to server
[02:30:20] + Could not connect to Work Server (results)
[02:30:20]     (171.67.108.26:8080)
[02:30:20] + Retrying using alternative port
[02:30:21] - Couldn't send HTTP request to server
[02:30:21] + Could not connect to Work Server (results)
[02:30:21]     (171.67.108.26:80)
[02:30:21]   Could not transmit unit 00 to Collection server; keeping in queue.
[02:30:21] - Preparing to get new work unit...
[02:30:21] Cleaning up work directory
[02:30:21] + Attempting to get work packet
[02:30:21] Passkey found
[02:30:21] Gpu type=3 species=21.
[02:30:21] - Connecting to assignment server
[02:30:21] - Successful: assigned to (171.67.108.32).


[04:32:22] + Attempting to send results [December 11 04:32:22 UTC]
[04:32:22] Gpu type=3 species=21.
[04:32:23] + Results successfully sent
[04:32:23] Thank you for your contribution to Folding@Home.
[04:32:23] + Number of Units Completed: 100
[04:32:27] Project: 6800 (Run 17638, Clone 0, Gen 938)
[04:32:27] - Read packet limit of 540015616... Set to 524286976.


[04:32:27] + Attempting to send results [December 11 04:32:27 UTC]
[04:32:27] Gpu type=3 species=21.
[04:32:28] + Results successfully sent
[04:32:28] Thank you for your contribution to Folding@Home.
[04:32:28] + Number of Units Completed: 101
[04:32:28] - Preparing to get new work unit...
[04:32:28] Cleaning up work directory
[04:32:28] + Attempting to get work packet
[04:32:28] Passkey found
[04:32:28] Gpu type=3 species=21.
[04:32:28] - Connecting to assignment server
[04:32:28] - Successful: assigned to (171.64.65.64).



[07:20:10] + Attempting to send results [December 11 07:20:10 UTC]
[07:20:10] Gpu type=3 species=21.
[07:20:10] + Results successfully sent
[07:20:10] Thank you for your contribution to Folding@Home.
[07:20:10] + Number of Units Completed: 102
[07:20:14] - Preparing to get new work unit...
[07:20:14] Cleaning up work directory
[07:20:14] + Attempting to get work packet
[07:20:14] Passkey found
[07:20:14] Gpu type=3 species=21.
[07:20:14] - Connecting to assignment server
[07:20:14] - Successful: assigned to (171.67.108.54).



[10:07:53] + Attempting to send results [December 11 10:07:53 UTC]
[10:07:53] Gpu type=3 species=21.
[10:07:53] + Results successfully sent
[10:07:53] Thank you for your contribution to Folding@Home.
[10:07:53] + Number of Units Completed: 103
[10:07:57] - Preparing to get new work unit...
[10:07:57] Cleaning up work directory
[10:07:57] + Attempting to get work packet
[10:07:57] Passkey found
[10:07:57] Gpu type=3 species=21.
[10:07:57] - Connecting to assignment server
[10:07:57] - Successful: assigned to (171.67.108.54).



[12:55:24] + Attempting to send results [December 11 12:55:24 UTC]
[12:55:24] Gpu type=3 species=21.
[12:55:25] + Results successfully sent
[12:55:25] Thank you for your contribution to Folding@Home.
[12:55:25] + Number of Units Completed: 104
[12:55:29] - Preparing to get new work unit...
[12:55:29] Cleaning up work directory
[12:55:29] + Attempting to get work packet
[12:55:29] Passkey found
[12:55:29] Gpu type=3 species=21.
[12:55:29] - Connecting to assignment server
[12:55:29] - Successful: assigned to (171.64.65.64).



[15:43:00] + Attempting to send results [December 11 15:43:00 UTC]
[15:43:00] Gpu type=3 species=21.
[15:43:00] + Results successfully sent
[15:43:00] Thank you for your contribution to Folding@Home.
[15:43:00] + Number of Units Completed: 105
[15:43:04] - Preparing to get new work unit...
[15:43:04] Cleaning up work directory
[15:43:04] + Attempting to get work packet
[15:43:04] Passkey found
[15:43:04] Gpu type=3 species=21.
[15:43:04] - Connecting to assignment server
[15:43:04] - Successful: assigned to (171.67.108.54).



[18:30:33] + Attempting to send results [December 11 18:30:33 UTC]
[18:30:33] Gpu type=3 species=21.
[18:30:33] + Results successfully sent
[18:30:33] Thank you for your contribution to Folding@Home.
[18:30:33] + Number of Units Completed: 106
[18:30:38] - Preparing to get new work unit...
[18:30:38] Cleaning up work directory
[18:30:38] + Attempting to get work packet
[18:30:38] Passkey found
[18:30:38] Gpu type=3 species=21.
[18:30:38] - Connecting to assignment server
[18:30:38] - Successful: assigned to (171.64.65.64).



[21:18:56] + Attempting to send results [December 11 21:18:56 UTC]
[21:18:56] Gpu type=3 species=21.
[21:19:05] + Results successfully sent
[21:19:05] Thank you for your contribution to Folding@Home.
[21:19:05] + Number of Units Completed: 107
[21:19:09] - Preparing to get new work unit...
[21:19:09] Cleaning up work directory
[21:19:09] + Attempting to get work packet
[21:19:09] Passkey found
[21:19:09] Gpu type=3 species=21.
[21:19:09] - Connecting to assignment server
[21:19:09] - Successful: assigned to (171.67.108.54).
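
To spot the failed upload attempts in a long log like the one above without reading every line, something like the following sketch could help. It assumes the log format matches this excerpt exactly (a "Could not connect to Work Server" line followed by a line holding the address in parentheses). `failed_uploads` is a hypothetical helper name.

```python
import re
from typing import List, Tuple

# Line layout assumed from the log excerpt above: "[HH:MM:SS] message".
LINE_RE = re.compile(r"^\[(?P<time>\d{2}:\d{2}:\d{2})\]\s?(?P<msg>.*)$")
ADDR_RE = re.compile(r"\((?P<ip>\d+\.\d+\.\d+\.\d+):(?P<port>\d+)\)")


def failed_uploads(log_text: str) -> List[Tuple[str, str, int]]:
    """Return (time, ip, port) for each failed connection to a Work Server."""
    failures = []
    expect_addr = False  # True right after a "Could not connect" line
    for raw in log_text.splitlines():
        m = LINE_RE.match(raw.strip())
        if not m:
            continue
        msg = m.group("msg")
        if "Could not connect to Work Server" in msg:
            expect_addr = True
            continue
        if expect_addr:
            addr = ADDR_RE.search(msg)
            if addr:
                failures.append(
                    (m.group("time"), addr.group("ip"), int(addr.group("port")))
                )
            expect_addr = False
    return failures


sample = """\
[02:30:20] - Couldn't send HTTP request to server
[02:30:20] + Could not connect to Work Server (results)
[02:30:20]     (171.67.108.26:8080)
[02:30:20] + Retrying using alternative port
[02:30:21] - Couldn't send HTTP request to server
[02:30:21] + Could not connect to Work Server (results)
[02:30:21]     (171.67.108.26:80)
"""
for failure in failed_uploads(sample):
    print(failure)
```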
mattozan
 
Posts: 6
Joined: Tue Dec 07, 2010 7:59 pm

Re: Stanford Network Issue

Postby PantherX » Mon Dec 12, 2011 5:34 am

Another update:
UPDATE 11:30am Pacific time: Our sysadmins have been in the office getting machines back on line. We're almost there, although it looks like there are a few machines which have issues resulting from the outage.
PantherX
Super Moderator
 
Posts: 6273
Joined: Wed Dec 23, 2009 9:33 am

Re: Stanford Network Issue

Postby joe53 » Mon Dec 12, 2011 6:02 am

You know, I'm sanguine about personal WUs and PPD not being credited.

As long as I know the work is being done, and is useful.

My client says the work continues to be successfully sent, but Stanford's site says nothing has been received for about the past 36 hours. I hope these 5 or 6 work units have not been irretrievably wasted.
joe53
 
Posts: 33
Joined: Wed Aug 06, 2008 9:20 pm

Re: Stanford Network Issue

Postby mattozan » Mon Dec 12, 2011 6:14 am

joe53 wrote:You know, I'm sanguine about personal WUs and PPD not being credited.

As long as I know the work is being done, and is useful.

My client says the work continues to be successfully sent, but Stanford's site says nothing has been received for about the past 36 hours. I hope these 5 or 6 work units have not been irretrievably wasted.


Yeah, I'm of the same mind. Points aren't the real thing. But I hope the lack of points updates for me doesn't also mean that the science data is somehow going to /dev/null.

"Date of last work unit 2011-12-10 16:02:18"
Last edited by mattozan on Mon Dec 12, 2011 6:25 am, edited 1 time in total.
mattozan
 
Posts: 6
Joined: Tue Dec 07, 2010 7:59 pm

Re: Stanford Network Issue

Postby Jesse_V » Mon Dec 12, 2011 6:22 am

joe53 wrote:You know, I'm sanguine about personal WUs and PPD not being credited.

As long as I know the work is being done, and is useful.

My client says the work continues to be successfully sent, but Stanford's site says nothing has been received for about the past 36 hours. I hope these 5 or 6 work units have not been irretrievably wasted.


I doubt it. Your WUs are probably doing fine. The servers that update the stats must be having problems. Here's how I'm guessing things are set up: you upload your WUs to server A (I'm calling the whole set of work servers "A"). Server B periodically checks server A and updates the stats, and server C serves those stats when you check online. I'm not sure whether B and C are the same machine (they could be, since you can't see personal stats during updates), but if this picture is right, then server B is down and the info isn't making it all the way through. If B and C are indeed the same, then it could be one of the machines with "issues resulting from the outage".
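
A toy model of that guess, purely for illustration: the class names, the user name `donor`, and the point values are all made up, and nothing here reflects how Stanford's servers are really built. It just shows that if B stops running, A keeps accumulating work while C keeps showing old numbers.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class WorkServers:
    """ "Server A": receives uploaded WUs and records their points."""
    credits: Dict[str, int] = field(default_factory=dict)

    def upload(self, user: str, points: int) -> None:
        self.credits[user] = self.credits.get(user, 0) + points


@dataclass
class StatsSite:
    """ "Server C": what users see when they check their stats online."""
    totals: Dict[str, int] = field(default_factory=dict)

    def lookup(self, user: str) -> int:
        return self.totals.get(user, 0)


@dataclass
class StatsUpdater:
    """ "Server B": periodically copies totals from A to C."""
    online: bool = True

    def run(self, source: WorkServers, target: StatsSite) -> bool:
        if not self.online:
            return False  # outage: nothing reaches the stats site
        target.totals = dict(source.credits)
        return True


a, b, c = WorkServers(), StatsUpdater(), StatsSite()
a.upload("donor", 136)
b.run(a, c)            # normal update: C now shows 136
b.online = False       # B goes down in the outage
a.upload("donor", 136)
b.run(a, c)            # no update happens
print(c.lookup("donor"), a.credits["donor"])  # 136 272
```

In this model, once B comes back and runs again, C catches up to 272, which is the "eventually credited" outcome people in this thread are hoping for.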
Jesse_V
 
Posts: 2752
Joined: Mon Jul 18, 2011 4:44 am
Location: Logan, Utah, USA

Re: Stanford Network Issue

Postby Dinkydau » Mon Dec 12, 2011 10:45 am

Yes, it gives the impression of doing work for nothing.
Dinkydau
 
Posts: 15
Joined: Sat Jun 04, 2011 10:55 am

Re: Stanford Network Issue

Postby War » Mon Dec 12, 2011 11:20 am

I have chilled water; send the servers to me:
War
Stabellsveg 9a
7021 Trondheim
Norway

I also have a 70/10 Internet connection. It's crap, but it works.
War
 
Posts: 21
Joined: Thu Feb 05, 2009 4:37 am

Re: Stanford Network Issue

Postby kromberg » Mon Dec 12, 2011 3:30 pm

Any news or status on the server outage? This has been one of the longer ones I have seen.
kromberg
 
Posts: 88
Joined: Sat Nov 07, 2009 4:36 pm
