No credit from 155.247.166.220 and 155.247.166.219

Moderators: Site Moderators, PandeGroup

Re: No credit from 155.247.166.220 and 155.247.166.219

Postby DrBB1 » Tue Dec 05, 2017 5:14 am

Brian, if the server status report at http://fah-web.stanford.edu/pybeta/serverstat.html is correct, the "Connect" status for 155.247.166.220 is currently "Reject." That seems to be a separate issue from the stats, as you surmised.
DrBB1
 
Posts: 143
Joined: Wed Mar 26, 2008 12:30 am
Location: SE PA

Re: No credit from 155.247.166.220 and 155.247.166.219

Postby ThunderRd » Tue Dec 05, 2017 10:50 am

My error looks a bit different; I have not seen anyone else report this: 'Received short response, expected 512 bytes, got 13'

Code:
10:42:00:WU02:FS00:0xa7:    OS Arch: AMD64
10:42:00:WU02:FS00:0xa7:********************************************************************************
10:42:00:WU02:FS00:0xa7:Project: 13747 (Run 145, Clone 4, Gen 10)
10:42:00:WU02:FS00:0xa7:Unit: 0x0000000a0002894b59d561b8d5507714
10:42:00:WU02:FS00:0xa7:Digital signatures verified
10:42:00:WU02:FS00:0xa7:Calling: mdrun -s frame10.tpr -o frame10.trr -cpi state.cpt -cpt 15 -nt 4
10:42:00:WU02:FS00:0xa7:Steps: first=25000000 total=2500000
10:42:00:WU02:FS00:0xa7:Completed 2035942 out of 2500000 steps (81%)
10:42:31:WU01:FS00:Upload 4.44%
10:42:35:WARNING:WU01:FS00:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 13
10:42:35:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:14007 run:0 clone:92 gen:9 core:0xa4 unit:0x0000000b0002894c59e4d0b678c33534
10:42:35:WU01:FS00:Uploading 4.22MiB to 155.247.166.220
10:42:35:WU01:FS00:Connecting to 155.247.166.220:8080
10:42:35:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
10:42:35:WU01:FS00:Connecting to 155.247.166.220:80
10:42:36:WU02:FS00:0xa7:Completed 2050000 out of 2500000 steps (82%)
10:43:09:WU01:FS00:Upload 4.44%
10:43:13:WARNING:WU01:FS00:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 13
10:43:35:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:14007 run:0 clone:92 gen:9 core:0xa4 unit:0x0000000b0002894c59e4d0b678c33534
10:43:35:WU01:FS00:Uploading 4.22MiB to 155.247.166.220
10:43:35:WU01:FS00:Connecting to 155.247.166.220:8080
10:43:35:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
10:43:35:WU01:FS00:Connecting to 155.247.166.220:80
10:43:45:WU02:FS00:0xa7:Completed 2075000 out of 2500000 steps (83%)
10:44:09:WU01:FS00:Upload 4.44%
10:44:13:WARNING:WU01:FS00:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 13
10:45:03:WU02:FS00:0xa7:Completed 2100000 out of 2500000 steps (84%)
10:45:12:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:14007 run:0 clone:92 gen:9 core:0xa4 unit:0x0000000b0002894c59e4d0b678c33534
10:45:12:WU01:FS00:Uploading 4.22MiB to 155.247.166.220
10:45:12:WU01:FS00:Connecting to 155.247.166.220:8080
10:45:13:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
10:45:13:WU01:FS00:Connecting to 155.247.166.220:80
10:46:06:WU02:FS00:0xa7:Completed 2125000 out of 2500000 steps (85%)
10:47:11:WU02:FS00:0xa7:Completed 2150000 out of 2500000 steps (86%)
ThunderRd
 
Posts: 146
Joined: Sun Dec 02, 2007 5:30 am
Location: Nong Khai, Thailand

Re: No credit from 155.247.166.220 and 155.247.166.219

Postby Joe_H » Tue Dec 05, 2017 4:25 pm

@absolutefunk WUs always get returned first to the WS they came from; they only go to the designated CS if they cannot be uploaded to the WS. For most projects running at Temple, the continuation of the log should show the client sending the WU back to 155.247.166.219 after trying 155.247.166.220. If your client shows 0.0.0.0 for the CS in FAHControl, then the WU will have to wait for 155.247.166.220 to come back online before it can be accepted.

As mentioned, 155.247.166.220 is currently down and not accepting connections. This is a different issue than the stats not being collected for the database.
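
The return order described above can be sketched roughly in Python. This is only an illustration of the logic, not the client's actual code; the function name `upload_targets` is hypothetical, and treating a CS of 0.0.0.0 as "no collection server assigned" is an assumption based on how FAHControl displays it.

```python
def upload_targets(work_server, collection_server):
    """Return the servers to try, in order, when sending back a finished WU.

    Assumption: a collection server of "0.0.0.0" (as displayed in FAHControl)
    means no CS is designated, so only the WS can accept the result.
    """
    targets = [work_server]  # the WS the WU came from is always tried first
    if collection_server and collection_server != "0.0.0.0":
        targets.append(collection_server)  # the CS is only a fallback
    return targets


# WU from 155.247.166.220 with 155.247.166.219 as its CS
print(upload_targets("155.247.166.220", "155.247.166.219"))
# WU with no CS assigned: it has to wait for the WS to come back online
print(upload_targets("155.247.166.220", "0.0.0.0"))
```

In the second case the list has a single entry, which matches the situation where a WU simply waits until 155.247.166.220 is reachable again.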

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Joe_H
Site Admin
 
Posts: 3899
Joined: Tue Apr 21, 2009 4:41 pm
Location: W. MA

Re: No credit from 155.247.166.220 and 155.247.166.219

Postby Joe_H » Tue Dec 05, 2017 4:35 pm

ThunderRd wrote:My error looks a bit different; I have not seen anyone else report this: 'Received short response, expected 512 bytes, got 13'

That error message usually indicates a connection being blocked somewhere between the client on your system and the server it is connecting with. Most often it has been caused by anti-malware apps or firewall settings on the system or network the client is located on. But it could be elsewhere. If you are not getting this error message on other client connections to the servers, it might be related to whatever caused the 155.247.166.220 WS to go down overnight.
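
One way to narrow down where a connection is being blocked is to test plain TCP reachability on the two ports the client log shows it trying (8080, then 80). The following is a minimal Python sketch using only the standard library; the helper names `can_connect` and `probe_work_server` are hypothetical. Note that a successful TCP connection does not rule out a firewall or anti-malware app altering the response afterwards, so this only covers the simplest case.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def probe_work_server(host, ports=(8080, 80)):
    """Check each port in the order the client log shows it being tried."""
    return {port: can_connect(host, port) for port in ports}
```

Calling `probe_work_server("155.247.166.220")` from the affected machine and from a different network, then comparing the results, can hint at whether the block is local or closer to the server.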
Joe_H
Site Admin
 
Posts: 3899
Joined: Tue Apr 21, 2009 4:41 pm
Location: W. MA

Re: No credit from 155.247.166.220 and 155.247.166.219

Postby GreyWhiskers » Wed Dec 06, 2017 12:00 am

Just to pile on, at least since the Stanford stats server went down at Thanksgiving time, I've been missing ALL credits from my CPU slots. The culprit in my case is the server at 155.247.166.219.

As the Temple IT folks are working on the firewall issue, I presume that once it is fixed, the whole logjam of credits will be forwarded to Stanford's stats servers, and the third party stats servers like EOC and Kakao will be able to report them as lump sums.

I've been running CPU slots on two of my Core i7 machines, both with AVX enabled, and one GPU enabled slot. Since the Stanford stats server came back up, I've been credited (as reported in my EOC stats) ONLY with results from the GPU folding, and none of the CPU folding.

FYI, I've had no folding or error issues at all in this CPU folding. I finish and upload a WU, the log reports the credit, I get a new CPU WU, fold it, complete and upload it, with the FAH control log reporting credit, etc., etc.

Here's a sample over the last couple of days of the completions and credits from the FAH Control log for one of the two computers.

Note that I was pruning the lines from the log I wanted to display here, and didn't include the IP address of the collection server in the first two instances. For the subsequent instances, it was included. Also note the missing items are a combo of A4 and A7 core WUs.

4:27:34 :WU00 :FS00 :Sending unit results: id:00 state:SEND error:NO_ERROR project:13739 run:154 clone:4 gen:7 core:0xa7 unit:0x000000070002894b59d5d7fdd885317e
4:27:36 :WU00 :FS00 :Final credit estimate, 1622.00 points

5:53:14 :WU01 :FS00 :Sending unit results: id:01 state:SEND error:NO_ERROR project:13743 run:11 clone:5 gen:4 core:0xa7 unit:0x000000040002894b59d5a38b00fbcdad
5:53:19 :WU01 :FS00 :Final credit estimate, 1659.00 points

12:37:24 :WU00 :FS00 :Sending unit results: id:00 state:SEND error:NO_ERROR project:8633 run:2 clone:606 gen:12 core:0xa4 unit:0x0000000f0002894b57f6f46806e6fe2b
12:37:24 :WU00 :FS00 :Uploading 3.22MiB to 155.247.166.219
12:37:32 :WU00 :FS00 :Final credit estimate, 4698.00 points

14:04:02 :WU01 :FS00 :Sending unit results: id:01 state:SEND error:NO_ERROR project:13744 run:122 clone:5 gen:0 core:0xa7 unit:0x000000000002894b59d58ba4e7b76a70
14:04:02 :WU01 :FS00 :Uploading 1.66MiB to 155.247.166.219
14:04:06 :WU01 :FS00 :Final credit estimate, 1619.00 points

20:46:01 :WU00 :FS00 :Sending unit results: id:00 state:SEND error:NO_ERROR project:8632 run:5 clone:215 gen:7 core:0xa4 unit:0x000000090002894b57f6f5bdd7841ccc
20:46:01 :WU00 :FS00 :Uploading 2.45MiB to 155.247.166.219
20:46:04 :WU00 :FS00 :Final credit estimate, 4644.00 points

22:09:50 :WU01 :FS00 :Sending unit results: id:01 state:SEND error:NO_ERROR project:13740 run:59 clone:4 gen:5 core:0xa7 unit:0x000000060002894b59d631776ececd66
22:09:50 :WU01 :FS00 :Uploading 1.66MiB to 155.247.166.219
22:09:55 :WU01 :FS00 :Final credit estimate, 1645.00 points
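
For anyone wanting to tally how many points are outstanding, the "Final credit estimate" lines can be summed with a short script. This is a minimal Python sketch, assuming the log lines look like the sample above; the function name `total_estimated_credit` is hypothetical, and the result is only the client's estimate, not what the stats server will eventually award.

```python
import re

# Matches the points value in lines such as
# "4:27:36 :WU00 :FS00 :Final credit estimate, 1622.00 points"
CREDIT_RE = re.compile(r"Final credit estimate, ([\d.]+) points")


def total_estimated_credit(log_lines):
    """Sum the 'Final credit estimate' values found in the given log lines."""
    total = 0.0
    for line in log_lines:
        match = CREDIT_RE.search(line)
        if match:
            total += float(match.group(1))
    return total


sample = [
    "4:27:36 :WU00 :FS00 :Final credit estimate, 1622.00 points",
    "5:53:19 :WU01 :FS00 :Final credit estimate, 1659.00 points",
    "12:37:32 :WU00 :FS00 :Final credit estimate, 4698.00 points",
    "14:04:06 :WU01 :FS00 :Final credit estimate, 1619.00 points",
    "20:46:04 :WU00 :FS00 :Final credit estimate, 4644.00 points",
    "22:09:55 :WU01 :FS00 :Final credit estimate, 1645.00 points",
]
print(total_estimated_credit(sample))  # 15887.0
```

Lines without a credit estimate (for example the "Sending unit results" and "Uploading" lines) are skipped, so the whole log can be passed in unpruned.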
GreyWhiskers
 
Posts: 766
Joined: Mon Oct 25, 2010 5:57 am
Location: Saratoga, California USA

Failed to send : Failed to connect to 155.247.166.220:80

Postby parkut » Wed Dec 06, 2017 5:05 pm

I have (4) machines that cannot connect to this server; the FAHControl status tab is showing collection server: 0.0.0.0

15:38:26:WARNING:WU00:FS00:WorkServer connection failed on port 8080 trying 80
15:40:34:WARNING:WU00:FS00:Exception: Failed to send results to work server: Failed to connect to 155.247.166.220:80: Connection timed out

project:8652 run:2296 clone:0 gen:11
project:8642 run:9696 clone:1 gen:21
project:14012 run:0 clone:311 gen:3
project:13786 run:4 clone:48 gen:8
parkut
 
Posts: 396
Joined: Tue Feb 12, 2008 7:33 am
Location: SE Michigan, USA

Re: No credit from 155.247.166.220 and 155.247.166.219

Postby montemac » Wed Dec 06, 2017 8:24 pm

Does this mean anything good? Just copied from the Server Status database:
14 155.247.166.219 vav3 vvoelz SMP full Accepting
15 155.247.166.220 vav4 vvoelz SMP full Accepting

Edit: Just noticed, nothing between the words "Accepting" and the last column, "OS_Weight_Program_Port" whatever that means.
Folding on 4 pc's
montemac
 
Posts: 37
Joined: Wed Oct 10, 2012 11:49 am
Location: Richmond, VA

Re: No credit from 155.247.166.220 and 155.247.166.219

Postby Joe_H » Wed Dec 06, 2017 9:58 pm

montemac wrote:Does this mean anything good? Just copied from the Server Status database:
14 155.247.166.219 vav3 vvoelz SMP full Accepting
15 155.247.166.220 vav4 vvoelz SMP full Accepting

Edit: Just noticed, nothing between the words "Accepting" and the last column, "OS_Weight_Program_Port" whatever that means.


Yes, that is good. The only column we are waiting to see data in again is the one headed WUs Rcv. Many of the rest no longer apply; they date back to older versions of the Work Server software and the information those versions supplied to the Server Status page.

The number in the WUs Rcv column tracks how many WUs the server has credit information for that is ready to be uploaded to the stats database. The collection script runs once an hour to pick up the log with the credits. Once the problem with that connection is resolved, that column should start showing information again and WU credits should show up from these two servers.

Off topic a bit, but that last column lists which OS's a project will be assigned to, the priority, and the ports over which they will be assigned. Hover over the blue "i" at the top of the column and it will give you information about the entries there.
Joe_H
Site Admin
 
Posts: 3899
Joined: Tue Apr 21, 2009 4:41 pm
Location: W. MA

Re: Failed to send : Failed to connect to 155.247.166.220:80

Postby Joe_H » Wed Dec 06, 2017 10:00 pm

parkut wrote:I have (4) machines that cannot connect to this server; the FAHControl status tab is showing collection server: 0.0.0.0

15:38:26:WARNING:WU00:FS00:WorkServer connection failed on port 8080 trying 80
15:40:34:WARNING:WU00:FS00:Exception: Failed to send results to work server: Failed to connect to 155.247.166.220:80: Connection timed out

project:8652 run:2296 clone:0 gen:11
project:8642 run:9696 clone:1 gen:21
project:14012 run:0 clone:311 gen:3
project:13786 run:4 clone:48 gen:8


This WS was restarted earlier today. Your WUs should upload now if they have not already done so. I had some waiting to upload; they had already gone through when I checked around 1 PM EST.
Joe_H
Site Admin
 
Posts: 3899
Joined: Tue Apr 21, 2009 4:41 pm
Location: W. MA

Re: No credit from 155.247.166.220 and 155.247.166.219

Postby DrBB1 » Thu Dec 07, 2017 11:37 pm

FWIW, I don't believe any of the WUs I completed that were sent to these servers have (yet) been credited, although the issue was apparently fixed over 24 hours ago. Is this a problem to investigate, or is this simply an expected lag in receiving credit? If the latter, about how long should it take before everything is all caught up?
DrBB1
 
Posts: 143
Joined: Wed Mar 26, 2008 12:30 am
Location: SE PA

Re: No credit from 155.247.166.220 and 155.247.166.219

Postby kofther » Thu Dec 07, 2017 11:50 pm

Are there 2 issues being referenced in this thread? Has the Temple firewall issue been resolved?
kofther
 
Posts: 6
Joined: Thu Nov 30, 2017 1:00 pm

Re: No credit from 155.247.166.220 and 155.247.166.219

Postby Joe_H » Fri Dec 08, 2017 7:33 am

kofther wrote:Are there 2 issues being referenced in this thread? Has the Temple firewall issue been resolved?

Yes. This topic started out about no credits from the Temple servers, which is possibly due to firewall issues. That has not been resolved yet.

A few posts back, a problem uploading to one of the two servers there was brought up, and it was asked if that was related to the first problem. It might have been; all that is certain is that the server stopped accepting connections until it was restarted.
Joe_H
Site Admin
 
Posts: 3899
Joined: Tue Apr 21, 2009 4:41 pm
Location: W. MA

Re: No credit from 155.247.166.220 and 155.247.166.219

Postby kofther » Fri Dec 08, 2017 4:38 pm

Thanks, Joe. Is there an estimated fix date or an update for (what I'll call) the firewall issue?
kofther
 
Posts: 6
Joined: Thu Nov 30, 2017 1:00 pm

Re: No credit from 155.247.166.220 and 155.247.166.219

Postby JimboPalmer » Fri Dec 08, 2017 4:43 pm

I do not even think they have determined if Temple is blocking sending, or Stanford is blocking receiving. Just a lot of IT departments potentially pointing fingers until they resolve that.
Tsar of all the Rushers
I tried to remain childlike, all I achieved was childish.
A friend to those who want no friends
JimboPalmer
 
Posts: 535
Joined: Mon Feb 16, 2009 4:12 am
Location: Greenwood MS USA

Re: No credit from 155.247.166.220 and 155.247.166.219

Postby montemac » Fri Dec 08, 2017 5:52 pm

It would be nice if the Temple and Stanford people were in on this conversation.
montemac
 
Posts: 37
Joined: Wed Oct 10, 2012 11:49 am
Location: Richmond, VA
