171.64.65.102 broken, 171.67.108.17 on standby?

debs3759
Posts: 138
Joined: Fri Oct 07, 2011 3:29 am

Post by debs3759 »

I'm unable to upload a WU that failed, because the WS is returning 0 bytes. The CS appears to have been on standby since at least August, so it cannot accept the results either.

Here is my log for the WU (a GPU WU, running on the latest beta client) since I last restarted the console:

02:23:00:WARNING: WorkServer connection failed on port 8080 trying 80
02:23:00:Connecting to 171.67.108.17:80
02:23:01:ERROR: Exception: Failed to connect to 171.67.108.17:80: No connection could be made because the target machine actively refused it.
02:23:02:Sending unit results: id:01 state:SEND project:5733 run:3 clone:577 gen:736 core:0x11 unit:0x101901c74e8e16dd02e0024100031665
02:23:02:Unit 01: Uploading 5.72KiB to 171.64.65.102
02:23:02:Connecting to 171.64.65.102:8080
02:23:02:Server connection id=1 on 0.0.0.0:36330 from 127.0.0.1
02:23:02:Started thread 9 on PID 4400
02:23:02:WARNING: Exception: Failed to send results to work server: Received short response, expected 512 bytes, got 0
02:23:02:Trying to send results to collection server
02:23:02:Unit 01: Uploading 5.72KiB to 171.67.108.17
02:23:02:Connecting to 171.67.108.17:8080
02:23:04:WARNING: WorkServer connection failed on port 8080 trying 80
02:23:04:Connecting to 171.67.108.17:80
02:23:05:ERROR: Exception: Failed to connect to 171.67.108.17:80: No connection could be made because the target machine actively refused it.
02:24:02:Sending unit results: id:01 state:SEND project:5733 run:3 clone:577 gen:736 core:0x11 unit:0x101901c74e8e16dd02e0024100031665
02:24:02:Unit 01: Uploading 5.72KiB to 171.64.65.102
02:24:02:Connecting to 171.64.65.102:8080
02:24:02:WARNING: Exception: Failed to send results to work server: Received short response, expected 512 bytes, got 0
02:24:02:Trying to send results to collection server
02:24:02:Unit 01: Uploading 5.72KiB to 171.67.108.17
02:24:02:Connecting to 171.67.108.17:8080
02:24:04:WARNING: WorkServer connection failed on port 8080 trying 80
02:24:04:Connecting to 171.67.108.17:80
02:24:05:ERROR: Exception: Failed to connect to 171.67.108.17:80: No connection could be made because the target machine actively refused it.
02:25:39:Sending unit results: id:01 state:SEND project:5733 run:3 clone:577 gen:736 core:0x11 unit:0x101901c74e8e16dd02e0024100031665
02:25:39:Unit 01: Uploading 5.72KiB to 171.64.65.102
02:25:39:Connecting to 171.64.65.102:8080
02:25:39:WARNING: Exception: Failed to send results to work server: Received short response, expected 512 bytes, got 0
02:25:39:Trying to send results to collection server
02:25:39:Unit 01: Uploading 5.72KiB to 171.67.108.17
02:25:39:Connecting to 171.67.108.17:8080
02:25:41:WARNING: WorkServer connection failed on port 8080 trying 80
02:25:41:Connecting to 171.67.108.17:80
02:25:42:ERROR: Exception: Failed to connect to 171.67.108.17:80: No connection could be made because the target machine actively refused it.
02:28:16:Sending unit results: id:01 state:SEND project:5733 run:3 clone:577 gen:736 core:0x11 unit:0x101901c74e8e16dd02e0024100031665
02:28:16:Unit 01: Uploading 5.72KiB to 171.64.65.102
02:28:16:Connecting to 171.64.65.102:8080
02:28:17:WARNING: Exception: Failed to send results to work server: Received short response, expected 512 bytes, got 0
02:28:17:Trying to send results to collection server
02:28:17:Unit 01: Uploading 5.72KiB to 171.67.108.17
02:28:17:Connecting to 171.67.108.17:8080
02:28:18:WARNING: WorkServer connection failed on port 8080 trying 80
02:28:18:Connecting to 171.67.108.17:80
02:28:19:ERROR: Exception: Failed to connect to 171.67.108.17:80: No connection could be made because the target machine actively refused it.
02:32:30:Sending unit results: id:01 state:SEND project:5733 run:3 clone:577 gen:736 core:0x11 unit:0x101901c74e8e16dd02e0024100031665
02:32:30:Unit 01: Uploading 5.72KiB to 171.64.65.102
02:32:30:Connecting to 171.64.65.102:8080
02:32:31:WARNING: Exception: Failed to send results to work server: Received short response, expected 512 bytes, got 0
02:32:31:Trying to send results to collection server
02:32:31:Unit 01: Uploading 5.72KiB to 171.67.108.17
02:32:31:Connecting to 171.67.108.17:8080
02:32:32:WARNING: WorkServer connection failed on port 8080 trying 80
02:32:32:Connecting to 171.67.108.17:80
02:32:34:ERROR: Exception: Failed to connect to 171.67.108.17:80: No connection could be made because the target machine actively refused it.
02:39:22:Sending unit results: id:01 state:SEND project:5733 run:3 clone:577 gen:736 core:0x11 unit:0x101901c74e8e16dd02e0024100031665
02:39:22:Unit 01: Uploading 5.72KiB to 171.64.65.102
02:39:22:Connecting to 171.64.65.102:8080
02:39:22:WARNING: Exception: Failed to send results to work server: Received short response, expected 512 bytes, got 0
02:39:22:Trying to send results to collection server
02:39:22:Unit 01: Uploading 5.72KiB to 171.67.108.17
02:39:22:Connecting to 171.67.108.17:8080
02:39:24:WARNING: WorkServer connection failed on port 8080 trying 80
02:39:24:Connecting to 171.67.108.17:80
02:39:25:ERROR: Exception: Failed to connect to 171.67.108.17:80: No connection could be made because the target machine actively refused it.
02:50:27:Sending unit results: id:01 state:SEND project:5733 run:3 clone:577 gen:736 core:0x11 unit:0x101901c74e8e16dd02e0024100031665
02:50:28:Unit 01: Uploading 5.72KiB to 171.64.65.102
02:50:28:Connecting to 171.64.65.102:8080
02:50:28:WARNING: Exception: Failed to send results to work server: Received short response, expected 512 bytes, got 0
02:50:28:Trying to send results to collection server
02:50:28:Unit 01: Uploading 5.72KiB to 171.67.108.17
02:50:28:Connecting to 171.67.108.17:8080
02:50:29:WARNING: WorkServer connection failed on port 8080 trying 80
02:50:29:Connecting to 171.67.108.17:80
02:50:31:ERROR: Exception: Failed to connect to 171.67.108.17:80: No connection could be made because the target machine actively refused it.
03:08:24:Sending unit results: id:01 state:SEND project:5733 run:3 clone:577 gen:736 core:0x11 unit:0x101901c74e8e16dd02e0024100031665
03:08:24:Unit 01: Uploading 5.72KiB to 171.64.65.102
03:08:24:Connecting to 171.64.65.102:8080
03:08:25:WARNING: Exception: Failed to send results to work server: Received short response, expected 512 bytes, got 0
03:08:25:Trying to send results to collection server
03:08:25:Unit 01: Uploading 5.72KiB to 171.67.108.17
03:08:25:Connecting to 171.67.108.17:8080
03:08:26:WARNING: WorkServer connection failed on port 8080 trying 80
03:08:26:Connecting to 171.67.108.17:80
03:08:27:ERROR: Exception: Failed to connect to 171.67.108.17:80: No connection could be made because the target machine actively refused it.
The server status page shows the WS as working and the CS as being on standby. The individual server pages also show the WS as working, but for me it is returning a 0-byte reply. The CS has been shown as on standby since at least August 26, so it looks like a long-term issue with the CS.
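To make the pattern in the log above easier to see, here is a rough Python sketch that tallies the failures per server and the time between upload attempts. The regular expressions are written only against the lines quoted in this post; the function name `summarize` is my own, and the line formats may differ in other client versions.

```python
import re
from collections import Counter

# Patterns written against the log lines quoted in this post only;
# other client versions may word these messages differently (assumption).
SEND = re.compile(r"^(\d\d):(\d\d):(\d\d):Sending unit results")
CONNECT = re.compile(r"^\d\d:\d\d:\d\d:Connecting to (\d+\.\d+\.\d+\.\d+):(\d+)")
SHORT = re.compile(r"Received short response, expected (\d+) bytes, got (\d+)")
REFUSED = re.compile(r"Failed to connect to (\d+\.\d+\.\d+\.\d+):(\d+)")


def summarize(lines):
    """Return (failure counts per (server, reason), seconds between attempts)."""
    failures = Counter()
    attempt_times = []
    last_ip = None
    for line in lines:
        m = SEND.match(line)
        if m:
            h, mi, s = (int(x) for x in m.groups())
            attempt_times.append(h * 3600 + mi * 60 + s)
        m = CONNECT.match(line)
        if m:
            # Remember which server the next error most likely refers to.
            last_ip = m.group(1)
        if SHORT.search(line) and last_ip is not None:
            failures[(last_ip, "short response")] += 1
        m = REFUSED.search(line)
        if m:
            failures[(m.group(1), "connection refused")] += 1
    # Gaps ignore midnight rollover; good enough for a single session.
    gaps = [b - a for a, b in zip(attempt_times, attempt_times[1:])]
    return failures, gaps
```

Fed the log above, this should attribute every "short response" to 171.64.65.102 and every "connection refused" to 171.67.108.17, and the gaps between attempts grow each time, which suggests the client is backing off rather than giving up.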
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: 171.64.65.102 broken, 171.67.108.17 on standby?

Post by bruce »

Welcome to foldingforum.org, debs3759.

Three comments:

1) This is probably due to the V7 bug reported in Ticket #615. I expect that if you scroll back to the point in the log where that WU finished (or, if you have restarted since then, look in an earlier log in the "logs" directory), you'll find that the WU ended with an error. Development informs us that this will be fixed in a future version, V7.1.37 or later.

2) Server 171.64.65.102 appears to be functioning normally.

3) Most of the Collection Servers, including 171.67.108.17, have been non-functional for a long time. For them to work again, the server code on both the Work Servers (including 171.64.65.102) and the CS needs to be upgraded. This is an ongoing process, but it's proceeding rather slowly because many of the WS are functioning reliably, and when that's true the CS is unnecessary.

If you do find the proper information in a log showing the termination of the WU, I'd appreciate you posting it here to confirm my suspicions in comment 1.
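If paging through old logs by hand is tedious, something like the following Python sketch can pull out candidate lines. It makes several assumptions: that the older logs are plain-text `*.txt` files in the "logs" directory, and that the interesting lines contain either `ERROR` or the project/run/clone/gen string from your log. The function name `find_lines` and the default search strings are only illustrative; I don't know the exact wording of the termination message.

```python
from pathlib import Path


def find_lines(log_dir, needles=("ERROR", "project:5733 run:3 clone:577 gen:736")):
    """Return (file name, line number, text) for lines containing any needle.

    Assumes logs are plain-text *.txt files directly inside log_dir.
    """
    hits = []
    for path in sorted(Path(log_dir).glob("*.txt")):
        # errors="replace" keeps the scan going if a log has odd bytes in it.
        with path.open(errors="replace") as fh:
            for number, line in enumerate(fh, 1):
                if any(needle in line for needle in needles):
                    hits.append((path.name, number, line.rstrip("\n")))
    return hits
```

Calling `find_lines("logs")` from the client's data directory would list every matching line with its file and line number, so you can jump to the right spot and post the surrounding lines.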

For more information on comment 3, see the "do this first" topic at the top of this forum.