vsp05, vsp11, and vsp15 (171.64.122.72/78/82) down

Moderators: Site Moderators, FAHC Science Team

toTOW
Site Moderator
Posts: 6309
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: vsp05, vsp11, and vsp15 (171.64.122.72/78/82) down

Post by toTOW »

Is maintenance done on these servers?

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
anandhanju
Posts: 526
Joined: Mon Dec 03, 2007 4:33 am
Location: Australia

Re: vsp05, vsp11, and vsp15 (171.64.122.72/78/82) down

Post by anandhanju »

mikesmusic is still unable to send queued results (viewtopic.php?p=66220#p66220). Can anyone please check if the server is fine?
anko1
Posts: 438
Joined: Mon Dec 03, 2007 1:31 am
Hardware configuration: Old Faithful CPU: Windows Graphical 5.03; Intel Pentium 4 Processor 540 (3.2GHz) HT; Windows XP
Big Red: Windows SMP Console 6.29; Windows GPU console 6.20r1; Intel Q9450 2.66G; ASUS P5Q 775 P45; [BFG 9800GTX+ old graphics card] NVidia GeForce 8800 GTX [as of 5/9/09]; Windows XP Pro SP3
Lenovo Think Pad: Windows 6.29 w/ SMP; Windows GPU Console 6.20r1 systray; Intel QX9300; NVIDIA Quadro FX-3700M; Windows XP Professional
Location: SF Peninsula

Re: vsp05, vsp11, and vsp15 (171.64.122.72/78/82) down

Post by anko1 »

171.64.122.72 is in reject now. Just lost two units: the 4419 that expired and the unit that got a special exit when the program tried to auto send the expired unit. Of the 6 WUs 4419-21 (or so, they're on different machines) that I've gotten, only one has been returned. I'm going to delete the others before they kill units in process too.
anko1
Posts: 438
Joined: Mon Dec 03, 2007 1:31 am
Location: SF Peninsula

Re: vsp05, vsp11, and vsp15 (171.64.122.72/78/82) down

Post by anko1 »

I see that 171.64.122.72 is back up now, but it has a really high CPU load (6.86) and a heavy net load (171). Also, the DL (days left on the tape?) is at zero, so I don't know if that is affecting things too.
mikesmusic
Posts: 8
Joined: Tue Sep 09, 2008 8:22 pm

Re: vsp05, vsp11, and vsp15 (171.64.122.72/78/82) down

Post by mikesmusic »

My two work units (project 4418), completed 30 Oct and 1 Nov, still won't upload.
Neither (171.67.108.17:8080) nor (171.64.122.72:8080) will respond.


[06:44:48] + Attempting to send results
[06:45:06] - Couldn't send HTTP request to server
[06:45:06] + Could not connect to Work Server (results)
[06:45:06]     (171.67.108.17:8080)
[06:45:06]   Could not transmit unit 07 to Collection server; keeping in queue.
[08:05:44] Writing local files
[08:05:44] Completed 820000 out of 2000000 steps  (41)
[10:26:44] Writing local files
[10:26:44] Completed 840000 out of 2000000 steps  (42)


[12:45:10] + Attempting to send results
[12:47:43] Writing local files
[12:47:43] Completed 860000 out of 2000000 steps  (43)
[12:52:31] - Couldn't send HTTP request to server
[12:52:31] + Could not connect to Work Server (results)
[12:52:31]     (171.64.122.72:8080)
[12:52:31] - Error: Could not transmit unit 06 (completed October 30) to work server.


[12:52:31] + Attempting to send results
[12:52:32] - Couldn't send HTTP request to server
[12:52:32] + Could not connect to Work Server (results)
[12:52:32]     (171.67.108.17:8080)
[12:52:32]   Could not transmit unit 06 to Collection server; keeping in queue.


[12:52:32] + Attempting to send results
[12:59:54] - Couldn't send HTTP request to server
[12:59:54] + Could not connect to Work Server (results)
[12:59:54]     (171.64.122.72:8080)
[12:59:54] - Error: Could not transmit unit 07 (completed November 1) to work server.


[12:59:54] + Attempting to send results
[12:59:54] - Couldn't send HTTP request to server
[12:59:54] + Could not connect to Work Server (results)
[12:59:54]     (171.67.108.17:8080)
[12:59:54]   Could not transmit unit 07 to Collection server; keeping in queue.
[15:08:41] Writing local files
[15:08:41] Completed 880000 out of 2000000 steps  (44)
[17:31:16] Writing local files
[17:31:16] Completed 900000 out of 2000000 steps  (45)
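
For anyone with several queued units, a rough way to see which server each failure belongs to is to pair every address line with the "Could not transmit unit" line that follows it. This is only a sketch based on the log lines quoted above; the function name `failed_uploads` and the `SAMPLE` text are made up for illustration, and other client versions may format their logs differently.

```python
import re
from collections import defaultdict

# Address lines look like "[12:52:31]     (171.64.122.72:8080)".
ADDR_RE = re.compile(r"\((\d{1,3}(?:\.\d{1,3}){3}):(\d+)\)\s*$")
# Failure lines mention the queue slot, e.g. "Could not transmit unit 06".
UNIT_RE = re.compile(r"Could not transmit unit (\d+)")

# Hypothetical sample, copied from the log excerpt in this thread.
SAMPLE = """\
[12:52:31] + Could not connect to Work Server (results)
[12:52:31]     (171.64.122.72:8080)
[12:52:31] - Error: Could not transmit unit 06 (completed October 30) to work server.
[12:52:32] + Could not connect to Work Server (results)
[12:52:32]     (171.67.108.17:8080)
[12:52:32]   Could not transmit unit 06 to Collection server; keeping in queue.
[12:59:54]     (171.64.122.72:8080)
[12:59:54] - Error: Could not transmit unit 07 (completed November 1) to work server.
"""


def failed_uploads(log_text):
    """Map each server address to the sorted unit numbers it failed to accept."""
    failures = defaultdict(set)
    last_addr = None
    for line in log_text.splitlines():
        addr = ADDR_RE.search(line)
        if addr:
            # Remember the address until the matching failure line shows up.
            last_addr = f"{addr.group(1)}:{addr.group(2)}"
            continue
        unit = UNIT_RE.search(line)
        if unit and last_addr is not None:
            failures[last_addr].add(unit.group(1))
            last_addr = None
    return {server: sorted(units) for server, units in failures.items()}


if __name__ == "__main__":
    for server, units in failed_uploads(SAMPLE).items():
        print(server, "failed units:", ", ".join(units))
```

Lines with an empty address such as "(:8080)" are skipped by this pattern, so the failure that follows them is not counted against any server.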
anko1
Posts: 438
Joined: Mon Dec 03, 2007 1:31 am
Location: SF Peninsula

Re: vsp05, vsp11, and vsp15 (171.64.122.72/78/82) down

Post by anko1 »

171.64.122.72 has a monstrously high net load of 445 and a CPU load of 3.42 with an assignment weight of 9%. Could we maybe turn off all assigning until units come home? I have some outstanding from October.
mikesmusic
Posts: 8
Joined: Tue Sep 09, 2008 8:22 pm

Re: vsp05, vsp11, and vsp15 (171.64.122.72/78/82) down

Post by mikesmusic »

That server status page is beyond the ken of this mere mortal. Of the 30 or so servers supposedly accepting jobs for the 'classic' clients, only about two currently have "% Ass 80" in double figures: good ol' vsp05 (171.64.122.72) and VSPMF93 (171.65.103.160). The net load of vsp05 is way beyond anything else. I cannot even ping vsp05 at present. I wonder how many more weeks this will go on.
mikesmusic
Posts: 8
Joined: Tue Sep 09, 2008 8:22 pm

Re: vsp05, vsp11, and vsp15 (171.64.122.72/78/82) down

Post by mikesmusic »

Anko, where did you see an assignment weight of 9%? I'm looking at the Weight column on the server stats page. Last time I looked I thought vsp05's weight was '10000', but today it is '5000' (i.e. less). One of my jobs actually was accepted in the last day or so. :D I have just one (completed Nov 1st) left now. Not that I'm any judge :egeek: but it looks like these work packets are a bit on the small side and are overloading the servers by coming back too quickly?
anko1
Posts: 438
Joined: Mon Dec 03, 2007 1:31 am
Location: SF Peninsula

Re: vsp05, vsp11, and vsp15 (171.64.122.72/78/82) down

Post by anko1 »

I "misspoke." I was referring to the % ASSigned column, which upon closer reading doesn't actually mean what I thought it did. <blush> I ended up losing another two units: the 4419 that expired and the unit it killed because autosend ran into an expired unit. I went ahead and deleted the last two I had, which were close to expiring, rather than miss the deadline and lose two more. I suspect that you're right: the units are so small that the servers get overloaded with the returns. They go back [or try to] almost as fast as they get sent. ;-)
toTOW
Site Moderator
Posts: 6309
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: vsp05, vsp11, and vsp15 (171.64.122.72/78/82) down

Post by toTOW »

122.78 is currently in Reject mode :(
TommyHicks
Posts: 2
Joined: Wed Aug 13, 2008 10:52 pm

Re: vsp05, vsp11, and vsp15 (171.64.122.72/78/82) down

Post by TommyHicks »

I've got 14 computers folding and I'm considering shutting them down as far as folding is concerned. Every one of them has completed work units that won't upload. When the time limit expires, the current work unit is lost with a "corrupted core". The listing below is from a computer with 6 completed work units that it cannot upload plus it can't get any work.

Is there anyone left at Stanford that gives a damn?


[15:11:09] + Attempting to send results
[15:11:09] - Couldn't send HTTP request to server
[15:11:09] + Could not connect to Work Server (results)
[15:11:09] (171.64.122.72:8080)
[15:11:09] - Error: Could not transmit unit 00 (completed November 10) to work server.


[15:11:09] + Attempting to send results
[15:11:10] - Couldn't send HTTP request to server
[15:11:10] + Could not connect to Work Server (results)
[15:11:10] (171.67.108.17:8080)
[15:11:10] Could not transmit unit 00 to Collection server; keeping in queue.


[15:11:10] + Attempting to send results
[15:11:11] - Couldn't send HTTP request to server
[15:11:11] + Could not connect to Work Server (results)
[15:11:11] (171.64.122.72:8080)
[15:11:11] - Error: Could not transmit unit 02 (completed November 11) to work server.


[15:11:11] + Attempting to send results
[15:11:11] - Couldn't send HTTP request to server
[15:11:11] + Could not connect to Work Server (results)
[15:11:11] (171.67.108.17:8080)
[15:11:11] Could not transmit unit 02 to Collection server; keeping in queue.


[15:11:11] + Attempting to send results
[15:11:11] - Couldn't send HTTP request to server
[15:11:11] + Could not connect to Work Server (results)
[15:11:11] (171.64.65.65:8080)
[15:11:11] - Error: Could not transmit unit 03 (completed November 13) to work server.


[15:11:11] + Attempting to send results
[15:11:12] - Couldn't send HTTP request to server
[15:11:12] + Could not connect to Work Server (results)
[15:11:12] (171.67.108.25:8080)
[15:11:12] Could not transmit unit 03 to Collection server; keeping in queue.


[15:11:12] + Attempting to send results
[15:11:12] - Couldn't send HTTP request to server
[15:11:12] + Could not connect to Work Server (results)
[15:11:12] (:8080)
[15:11:12] - Error: Could not transmit unit 04 (completed November 13) to work server.


[15:11:12] + Attempting to send results
[15:11:13] - Couldn't send HTTP request to server
[15:11:13] + Could not connect to Work Server (results)
[15:11:13] (171.67.108.17:8080)
[15:11:13] Could not transmit unit 04 to Collection server; keeping in queue.


[15:11:13] + Attempting to send results
[15:11:14] - Couldn't send HTTP request to server
[15:11:14] + Could not connect to Work Server (results)
[15:11:14] (171.64.65.111:8080)
[15:11:14] - Error: Could not transmit unit 08 (completed November 10) to work server.


[15:11:14] + Attempting to send results
[15:11:14] - Couldn't send HTTP request to server
[15:11:14] + Could not connect to Work Server (results)
[15:11:14] (171.67.108.17:8080)
[15:11:14] Could not transmit unit 08 to Collection server; keeping in queue.


[15:11:14] + Attempting to send results
[15:11:15] - Couldn't send HTTP request to server
[15:11:15] + Could not connect to Work Server (results)
[15:11:15] (171.64.122.72:8080)
[15:11:15] - Error: Could not transmit unit 09 (completed November 10) to work server.


[15:11:15] + Attempting to send results
[15:11:15] - Couldn't send HTTP request to server
[15:11:15] + Could not connect to Work Server (results)
[15:11:15] (171.67.108.17:8080)
[15:11:15] Could not transmit unit 09 to Collection server; keeping in queue.
[15:28:10] + Attempting to get work packet
[15:28:10] - Connecting to assignment server
[15:28:11] - Successful: assigned to (171.64.65.65).
[15:28:11] + News From Folding@Home: Welcome to Folding@Home
[15:28:11] Loaded queue successfully.
[15:28:11] - Couldn't send HTTP request to server
[15:28:11] (Got status 503)
[15:28:11] + Could not connect to Work Server
[15:28:11] - Error: Attempt #12 to get work failed, and no other work to do.
Waiting before retry.
[16:16:18] + Attempting to get work packet
[16:16:18] - Connecting to assignment server
[16:16:18] - Successful: assigned to (171.64.65.65).
[16:16:18] + News From Folding@Home: Welcome to Folding@Home
[16:16:18] Loaded queue successfully.
[16:16:19] - Couldn't send HTTP request to server
[16:16:19] (Got status 503)
[16:16:19] + Could not connect to Work Server
[16:16:19] - Error: Attempt #13 to get work failed, and no other work to do.
Waiting before retry.
[17:04:22] + Attempting to get work packet
[17:04:22] - Connecting to assignment server
[17:04:22] - Successful: assigned to (171.64.122.72).
[17:04:22] + News From Folding@Home: Welcome to Folding@Home
[17:04:22] Loaded queue successfully.
[17:04:23] - Couldn't send HTTP request to server
[17:04:23] (Got status 503)
[17:04:23] + Could not connect to Work Server
[17:04:23] - Error: Attempt #14 to get work failed, and no other work to do.
Waiting before retry.
[17:52:36] + Attempting to get work packet
[17:52:36] - Connecting to assignment server
[17:52:36] - Successful: assigned to (171.64.65.65).
[17:52:36] + News From Folding@Home: Welcome to Folding@Home
[17:52:37] Loaded queue successfully.
[17:52:37] - Couldn't send HTTP request to server
[17:52:37] (Got status 503)
[17:52:37] + Could not connect to Work Server
[17:52:37] - Error: Attempt #15 to get work failed, and no other work to do.
Waiting before retry.
mikesmusic
Posts: 8
Joined: Tue Sep 09, 2008 8:22 pm

Re: vsp05, vsp11, and vsp15 (171.64.122.72/78/82) down

Post by mikesmusic »

TommyHicks wrote: Is there anyone left at Stanford that gives a damn?
That has to be a fair question, Tommy. Here we are 11 days later and there has been no response to your question.

The server stats show that vsp05, vsp11, and vsp15 are very, very busy indeed.

Here is a typical ping plotter response from vsp05; vsp11 and vsp15 look the same.


Target Name: vsp05
         IP: 171.65.122.78
  Date/Time: 24/11/2008 17:44:36

 1    1 ms  private
 2   28 ms  private
 3   26 ms  ge1-3-0-100.core1.ixn.dub.stisp.net [84.203.130.9]
 4   30 ms  ge1-3-0-98.core1.tcy.dub.stisp.net [84.203.130.2]
 5   37 ms  [195.66.224.185]
 6   49 ms  te2-7.ccr02.ams03.atlas.cogentco.com [130.117.1.169]
 7  129 ms  te7-3.mpd01.ymq02.atlas.cogentco.com [130.117.0.69]
 8  137 ms  te3-7.mpd01.yyz02.atlas.cogentco.com [154.54.7.213]
 9  141 ms  te7-8.ccr02.ord01.atlas.cogentco.com [154.54.7.73]
10  151 ms  te4-3.ccr02.mci01.atlas.cogentco.com [154.54.6.201]
11  190 ms  te8-4.ccr02.sfo01.atlas.cogentco.com [154.54.24.117]
12  189 ms  te4-4.mpd01.sjc04.atlas.cogentco.com [154.54.7.174]
13  187 ms  Stanford_University2.demarc.cogentco.com [66.250.7.138]
14  196 ms  bbrb-isp.Stanford.EDU [171.64.1.155]
15   *       [-]
The [-] on line 15 means the destination is unreachable.
That means your work packet has no chance.
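
A ping or traceroute timing out does not by itself show that the upload port is closed, since some hosts and routers simply drop ICMP. A more direct check is to try opening a TCP connection to port 8080, the port shown in the client logs in this thread. The sketch below assumes plain Python and a made-up helper name, `can_connect`; it only tests whether the port accepts a connection, not whether the server would actually take a result.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


if __name__ == "__main__":
    # Addresses taken from the logs quoted in this thread.
    for host in ("171.64.122.72", "171.67.108.17"):
        state = "reachable" if can_connect(host, 8080) else "unreachable"
        print(f"{host}:8080 is {state}")
```

If the port opens but the client still reports a failure (for example the "Got status 503" lines earlier in the thread), the server is up but refusing or too busy to handle the request.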


Do they care? It's a mystery.