Failed to connect to 171.64.65.104:80

Moderators: Site Moderators, FAHC Science Team

Joe_H
Site Admin
Posts: 7868
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Failed to connect to 171.64.65.104:80

Post by Joe_H »

While the remaining Core_15 projects are available, you opt in by adding the following settings to the GPU slot:

max-packet-size = small

client-type = beta

The WUs are not actually beta; this combination of settings is just used to indicate a willingness to process them. Core_15 is end of life, so once these projects finish up, no further projects will use it. These projects also get only the base points indicated on the Project Summary page, with no QRB.
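For anyone who edits the config file by hand rather than through the slot options dialog, here is a minimal Python sketch that builds the slot entry with those two options. It assumes the v7-style config.xml layout, where each option is a child element with a `v` attribute; the function name and the slot id are made up for the example, so compare the output with your own config.xml before relying on it.

```python
import xml.etree.ElementTree as ET


def gpu_slot_with_core15_optin(slot_id: int = 0) -> str:
    """Build a GPU <slot> element carrying the two opt-in options."""
    # Assumption: v7-style config.xml, one child element per option,
    # with the option value stored in a "v" attribute.
    slot = ET.Element("slot", {"id": str(slot_id), "type": "GPU"})
    ET.SubElement(slot, "client-type", {"v": "beta"})
    ET.SubElement(slot, "max-packet-size", {"v": "small"})
    return ET.tostring(slot, encoding="unicode")


# Slot id 0 is only an example; use the id of your own GPU slot.
print(gpu_slot_with_core15_optin(0))
```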

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
DarkFoss
Posts: 103
Joined: Fri Apr 16, 2010 11:43 pm
Hardware configuration: AMD 5800X3D Asus ROG Strix X570-E Gaming WiFi II bios 5003 G-Skill TridentZ Neo 3600mhz Asrock Tachi RX 7900XTX Corsair rm850x psu Asus PG32UQXR EK Elite 360 D-rgb aio Win 11pro/Kubuntu 22.04.4 LTS UPS BX1500G
Location: Galifrey

Re: Failed to connect to 171.64.65.104:80

Post by DarkFoss »

Great to see it's up and running. Unfortunately, the server didn't like the results:

Code:

06:53:27:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:9211 run:2 clone:11 gen:22 core:0x21 unit:0x00000075664f2dd055ee292a0f6788f6
06:53:28:WU01:FS00:Uploading 17.50MiB to 171.64.65.104
06:53:28:WU01:FS00:Connecting to 171.64.65.104:8080
06:53:34:WU01:FS00:Upload 27.86%
06:53:40:WU01:FS00:Upload 52.50%
06:53:46:WU01:FS00:Upload 77.14%
06:53:53:WU01:FS00:Upload complete
06:53:53:WU01:FS00:Server responded WORK_QUIT (404)
06:53:53:WARNING:WU01:FS00:Server did not like results, dumping
06:53:53:WU01:FS00:Cleaning up
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Failed to connect to 171.64.65.104:80

Post by bruce »

It's not obvious at all. The concept was adopted without requiring a new client version so it is, in fact, obscure. You have to make two specific settings:

client-type=beta
&
max-packet-size=small

The assignments are not necessarily small and they're certainly not beta projects.
PS3EdOlkkola
Posts: 184
Joined: Tue Aug 26, 2014 9:48 pm
Hardware configuration: 10 SMP folding slots on Intel Phi "Knights Landing" system, configured as 24 CPUs/slot
9 AMD GPU folding slots
31 Nvidia GPU folding slots
50 total folding slots
Average PPD/slot = 459,500
Location: Dallas, TX

Re: Failed to connect to 171.64.65.104:80

Post by PS3EdOlkkola »

Looks like this collection server is having issues accepting work units:

Code:

21:24:14:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:9209 run:53 clone:1 gen:44 core:0x21 unit:0x0000005a664f2dd056fb27b4f3e98df7
21:24:14:WU00:FS01:Uploading 37.95MiB to 171.64.65.104
21:24:14:WU00:FS01:Connecting to 171.64.65.104:8080
21:24:16:WARNING:WU00:FS01:WorkServer connection failed on port 8080 trying 80
21:24:16:WU00:FS01:Connecting to 171.64.65.104:80
21:24:17:WARNING:WU00:FS01:Exception: Failed to send results to work server: Failed to connect to 171.64.65.104:80: No connection could be made because the target machine actively refused it.
21:24:17:WU00:FS01:Trying to send results to collection server
21:24:17:WU00:FS01:Uploading 37.95MiB to 171.65.103.160
21:24:17:WU00:FS01:Connecting to 171.65.103.160:8080
21:24:23:WU00:FS01:Upload 0.82%
21:24:29:WU00:FS01:Upload 1.48%
21:24:35:WU00:FS01:Upload 2.47%
.
.
21:35:09:WU00:FS01:Upload 99.47%
21:35:14:WU00:FS01:Upload complete
21:35:14:WU00:FS01:Server responded PLEASE_WAIT (464)
21:35:14:WARNING:WU00:FS01:Failed to send results, will try again later
This sequence has repeated 5 times so far. Other GPU slots on the same machine have uploaded to other collection servers without any issues. Looking at that particular server's status, it is configured to accept "classic" work, not GPU, but its Status and Collect fields are "accept" and "Accepting", respectively.
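To count how often this has happened without scrolling through the whole log, a small Python sketch like the one below may help. It assumes log lines in the same format as the excerpt above; the function name is made up for the example.

```python
import re
from collections import Counter

# Matches lines such as "21:35:14:WU00:FS01:Server responded PLEASE_WAIT (464)"
RESPONSE_RE = re.compile(
    r"^\d{2}:\d{2}:\d{2}:(WU\d+):(FS\d+):Server responded (\w+) \((\d+)\)"
)


def count_server_responses(log_text: str) -> Counter:
    """Count (work unit, slot, response) triples found in a client log."""
    counts = Counter()
    for line in log_text.splitlines():
        match = RESPONSE_RE.match(line.strip())
        if match:
            wu, slot, response, _code = match.groups()
            counts[(wu, slot, response)] += 1
    return counts


sample = """\
21:35:14:WU00:FS01:Upload complete
21:35:14:WU00:FS01:Server responded PLEASE_WAIT (464)
21:35:14:WARNING:WU00:FS01:Failed to send results, will try again later
"""
print(count_server_responses(sample))
```

Feeding it the full log text should give one count per work unit, slot and response type, which makes repeated PLEASE_WAIT replies easy to spot.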

Can someone please take a look?
Hardware config viewtopic.php?f=66&t=17997&p=277235#p277235
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Failed to connect to 171.64.65.104:80

Post by bruce »

@PS3EdOlkkola
Please scroll back to the point at which project:9209 run:53 clone:1 gen:44 reached 100%. (If FAH has been restarted, you may need to look in the "logs" subdirectory of FAH's data directory.)

I need to see the FIRST time or two that it tried to upload.
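If the log is long, a short Python sketch along these lines can locate that first attempt. It only assumes the "Sending unit results" line format seen in the logs in this topic; the function name is made up for the example, and you would read the log text from whichever file holds the relevant period.

```python
from typing import Optional


def first_send_line(
    log_text: str, project: int, run: int, clone: int, gen: int
) -> Optional[str]:
    """Return the earliest 'Sending unit results' line for one work unit."""
    # The trailing space stops gen:4 from also matching gen:44.
    needle = f"project:{project} run:{run} clone:{clone} gen:{gen} "
    for line in log_text.splitlines():
        if "Sending unit results" in line and needle in line:
            return line
    return None


sample = (
    "20:23:31:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR "
    "project:9209 run:53 clone:1 gen:44 core:0x21 "
    "unit:0x0000005a664f2dd056fb27b4f3e98df7\n"
    "21:24:14:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR "
    "project:9209 run:53 clone:1 gen:44 core:0x21 "
    "unit:0x0000005a664f2dd056fb27b4f3e98df7\n"
)
print(first_send_line(sample, 9209, 53, 1, 44))
```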

I believe this is a bug in the server code but I've never gotten enough information to prove that to Development.
PS3EdOlkkola
Posts: 184
Joined: Tue Aug 26, 2014 9:48 pm
Hardware configuration: 10 SMP folding slots on Intel Phi "Knights Landing" system, configured as 24 CPUs/slot
9 AMD GPU folding slots
31 Nvidia GPU folding slots
50 total folding slots
Average PPD/slot = 459,500
Location: Dallas, TX

Re: Failed to connect to 171.64.65.104:80

Post by PS3EdOlkkola »

Thanks, bruce. Below are the first few log lines from the first time the WU attempted an upload:

Code:

20:23:26:WU00:FS01:0x21:Completed 2500000 out of 2500000 steps (100%)
20:23:30:WU00:FS01:0x21:Saving result file logfile_01.txt
20:23:30:WU00:FS01:0x21:Saving result file checkpointState.xml
20:23:30:WU00:FS01:0x21:Saving result file checkpt.crc
20:23:30:WU00:FS01:0x21:Saving result file log.txt
20:23:30:WU00:FS01:0x21:Saving result file positions.xtc
20:23:30:WU00:FS01:0x21:Folding@home Core Shutdown: FINISHED_UNIT
20:23:31:WU00:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)
20:23:31:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:9209 run:53 clone:1 gen:44 core:0x21 unit:0x0000005a664f2dd056fb27b4f3e98df7
20:23:31:WU00:FS01:Uploading 37.95MiB to 171.64.65.104
20:23:31:WU00:FS01:Connecting to 171.64.65.104:8080
20:23:50:WARNING:WU00:FS01:WorkServer connection failed on port 8080 trying 80
20:23:50:WU00:FS01:Connecting to 171.64.65.104:80
20:23:51:WARNING:WU00:FS01:Exception: Failed to send results to work server: Failed to connect to 171.64.65.104:80: No connection could be made because the target machine actively refused it.
20:23:51:WU00:FS01:Trying to send results to collection server
20:23:51:WU00:FS01:Uploading 37.95MiB to 171.65.103.160
20:23:51:WU00:FS01:Connecting to 171.65.103.160:8080
20:23:57:WU00:FS01:Upload 1.48%
20:24:03:WU00:FS01:Upload 2.80%
20:24:09:WU00:FS01:Upload 3.95%
....
20:33:04:WU00:FS01:Upload 99.63%
20:33:07:WU00:FS01:Upload complete
20:33:07:WU00:FS01:Server responded PLEASE_WAIT (464)
20:33:07:WARNING:WU00:FS01:Failed to send results, will try again later
20:33:07:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:9209 run:53 clone:1 gen:44 core:0x21 unit:0x0000005a664f2dd056fb27b4f3e98df7
From what I can tell, the client first tries to connect to 171.64.65.104:8080, which fails, and then tries the same IP address on port 80, which the server also refuses. It then switches to a different address, 171.65.103.160:8080, and uploads there until that server responds with PLEASE_WAIT (464), at which point the send fails. I think the error is this: when the GPU collection server 171.64.65.104 has a Status of "full" and a Connect of "reject" (which it is currently indicating), the client does not fail over to another GPU collection server. Instead it fails over to a "classic" server (171.65.103.160:8080), which can't interpret the WU (it is looking for SMP and gets a GPU unit).
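For reference, the sequence can be pulled out of a log mechanically. The Python sketch below assumes the "Connecting to" and "Server responded" line formats shown in the excerpt; the function name is made up for the example.

```python
import re

CONNECT_RE = re.compile(
    r":(WU\d+):FS\d+:Connecting to (\d+\.\d+\.\d+\.\d+):(\d+)$"
)
RESPONSE_RE = re.compile(r":(WU\d+):FS\d+:Server responded (\w+) \((\d+)\)$")


def upload_trace(log_text, wu):
    """List the endpoints one work unit tried, plus server replies, in log order."""
    events = []
    for line in log_text.splitlines():
        line = line.strip()
        m = CONNECT_RE.search(line)
        if m and m.group(1) == wu:
            events.append(f"connect {m.group(2)}:{m.group(3)}")
            continue
        m = RESPONSE_RE.search(line)
        if m and m.group(1) == wu:
            events.append(f"response {m.group(2)} ({m.group(3)})")
    return events


sample = """\
20:23:31:WU00:FS01:Connecting to 171.64.65.104:8080
20:23:50:WARNING:WU00:FS01:WorkServer connection failed on port 8080 trying 80
20:23:50:WU00:FS01:Connecting to 171.64.65.104:80
20:23:51:WU00:FS01:Trying to send results to collection server
20:23:51:WU00:FS01:Connecting to 171.65.103.160:8080
20:33:07:WU00:FS01:Upload complete
20:33:07:WU00:FS01:Server responded PLEASE_WAIT (464)
"""
for event in upload_trace(sample, "WU00"):
    print(event)
```

On the sample this lists the two attempts against 171.64.65.104, then the attempt against 171.65.103.160, then the PLEASE_WAIT reply.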

Let me know if you need anything else.
Hardware config viewtopic.php?f=66&t=17997&p=277235#p277235
RABishop
Posts: 73
Joined: Thu May 07, 2015 2:42 am

Server 171.65.103.160

Post by RABishop »

I have two jobs on two different computers, and the above server is not accepting the "send" results from either. On the computer I am using to post this, the same GPU is working on a new job that is about 27.6% finished, with 3 hours and 23 minutes to go. By my math, that means the upload has been waiting at this point for about 40 minutes. It's hard to be exact, since the estimate keeps changing. The other system, at last glance, has 5 hours 28 minutes remaining and is only 39% done, which could mean it has been waiting much longer: about 4.25 hours. I won't bother copying the log from the other system, since that is a bunch of work. Both machines are connected to the internet through the same wired network, and both have no trouble pulling up my homepage in Firefox. So it isn't my network.

This is about an hour of the stuff from the log on this machine:

Code:

22:44:53:WU04:FS02:Uploading 37.95MiB to 171.64.65.104
22:44:53:WU04:FS02:Connecting to 171.64.65.104:8080
22:44:54:WARNING:WU04:FS02:WorkServer connection failed on port 8080 trying 80
22:44:54:WU04:FS02:Connecting to 171.64.65.104:80
22:44:55:WARNING:WU04:FS02:Exception: Failed to send results to work server: Failed to connect to 171.64.65.104:80: Connection refused
22:44:55:WU04:FS02:Trying to send results to collection server
22:44:55:WU04:FS02:Uploading 37.95MiB to 171.65.103.160
22:44:55:WU04:FS02:Connecting to 171.65.103.160:8080
22:45:01:WU04:FS02:Upload 9.06%
22:45:07:WU04:FS02:Upload 17.13%
22:45:13:WU04:FS02:Upload 25.03%
22:45:19:WU04:FS02:Upload 33.27%
22:45:25:WU04:FS02:Upload 41.34%
22:45:31:WU04:FS02:Upload 49.41%
22:45:37:WU04:FS02:Upload 57.48%
22:45:43:WU04:FS02:Upload 65.55%
22:45:49:WU04:FS02:Upload 73.62%
22:45:50:WU02:FS00:0xa4:Completed 56000 out of 160000 steps  (35%)
22:45:55:WU04:FS02:Upload 81.69%
22:46:01:WU04:FS02:Upload 89.27%
22:46:07:WU04:FS02:Upload 97.34%
22:46:09:WU04:FS02:Upload complete
22:46:10:WU04:FS02:Server responded PLEASE_WAIT (464)
22:46:10:WARNING:WU04:FS02:Failed to send results, will try again later
22:46:26:WU01:FS02:0x21:Completed 220000 out of 2000000 steps (11%)
22:46:30:WU00:FS03:0x21:Completed 4600000 out of 5000000 steps (92%)
22:47:05:WU03:FS01:0x21:Completed 530000 out of 1000000 steps (53%)
22:47:07:WU02:FS00:0xa4:Completed 57600 out of 160000 steps  (36%)
22:48:23:WU02:FS00:0xa4:Completed 59200 out of 160000 steps  (37%)
22:49:09:WU01:FS02:0x21:Completed 240000 out of 2000000 steps (12%)
22:49:32:WU02:FS00:0xa4:Completed 60800 out of 160000 steps  (38%)
22:50:02:WU00:FS03:0x21:Completed 4650000 out of 5000000 steps (93%)
22:50:12:WU03:FS01:0x21:Completed 540000 out of 1000000 steps (54%)
22:50:41:WU02:FS00:0xa4:Completed 62400 out of 160000 steps  (39%)
22:51:49:WU02:FS00:0xa4:Completed 64000 out of 160000 steps  (40%)
22:51:59:WU01:FS02:0x21:Completed 260000 out of 2000000 steps (13%)
22:52:58:WU02:FS00:0xa4:Completed 65600 out of 160000 steps  (41%)
22:53:16:WU03:FS01:0x21:Completed 550000 out of 1000000 steps (55%)
22:53:32:WU00:FS03:0x21:Completed 4700000 out of 5000000 steps (94%)
22:54:08:WU02:FS00:0xa4:Completed 67200 out of 160000 steps  (42%)
22:54:43:WU01:FS02:0x21:Completed 280000 out of 2000000 steps (14%)
22:55:17:WU02:FS00:0xa4:Completed 68800 out of 160000 steps  (43%)
22:56:26:WU02:FS00:0xa4:Completed 70400 out of 160000 steps  (44%)
22:56:30:WU03:FS01:0x21:Completed 560000 out of 1000000 steps (56%)
22:57:02:WU00:FS03:0x21:Completed 4750000 out of 5000000 steps (95%)
22:57:26:WU01:FS02:0x21:Completed 300000 out of 2000000 steps (15%)
22:57:35:WU02:FS00:0xa4:Completed 72000 out of 160000 steps  (45%)
22:58:44:WU02:FS00:0xa4:Completed 73600 out of 160000 steps  (46%)
22:59:35:WU03:FS01:0x21:Completed 570000 out of 1000000 steps (57%)
22:59:53:WU02:FS00:0xa4:Completed 75200 out of 160000 steps  (47%)
23:00:10:WU01:FS02:0x21:Completed 320000 out of 2000000 steps (16%)
23:00:33:WU00:FS03:0x21:Completed 4800000 out of 5000000 steps (96%)
23:01:02:WU02:FS00:0xa4:Completed 76800 out of 160000 steps  (48%)
23:02:11:WU02:FS00:0xa4:Completed 78400 out of 160000 steps  (49%)
23:02:40:WU03:FS01:0x21:Completed 580000 out of 1000000 steps (58%)
23:02:49:WU04:FS02:Sending unit results: id:04 state:SEND error:NO_ERROR project:9209 run:41 clone:7 gen:14 core:0x21 unit:0x0000002e664f2dd056fb27a1d1bc4ae7
23:02:49:WU04:FS02:Uploading 37.95MiB to 171.64.65.104
23:02:49:WU04:FS02:Connecting to 171.64.65.104:8080
23:02:50:WARNING:WU04:FS02:WorkServer connection failed on port 8080 trying 80
23:02:50:WU04:FS02:Connecting to 171.64.65.104:80
23:02:52:WARNING:WU04:FS02:Exception: Failed to send results to work server: Failed to connect to 171.64.65.104:80: Connection refused
23:02:52:WU04:FS02:Trying to send results to collection server
23:02:52:WU04:FS02:Uploading 37.95MiB to 171.65.103.160
23:02:52:WU04:FS02:Connecting to 171.65.103.160:8080
23:02:54:WU01:FS02:0x21:Completed 340000 out of 2000000 steps (17%)
23:02:58:WU04:FS02:Upload 9.06%
23:03:04:WU04:FS02:Upload 17.13%
23:03:10:WU04:FS02:Upload 25.20%
23:03:16:WU04:FS02:Upload 33.27%
23:03:20:WU02:FS00:0xa4:Completed 80000 out of 160000 steps  (50%)
23:03:22:WU04:FS02:Upload 41.50%
23:03:28:WU04:FS02:Upload 49.58%
23:03:34:WU04:FS02:Upload 57.65%
23:03:40:WU04:FS02:Upload 65.72%
23:03:46:WU04:FS02:Upload 73.79%
23:03:52:WU04:FS02:Upload 82.02%
23:03:58:WU04:FS02:Upload 90.09%
23:04:03:WU00:FS03:0x21:Completed 4850000 out of 5000000 steps (97%)
23:04:04:WU04:FS02:Upload 98.16%
23:04:06:WU04:FS02:Upload complete
23:04:06:WU04:FS02:Server responded PLEASE_WAIT (464)
23:04:06:WARNING:WU04:FS02:Failed to send results, will try again later
23:04:30:WU02:FS00:0xa4:Completed 81600 out of 160000 steps  (51%)
23:05:37:WU01:FS02:0x21:Completed 360000 out of 2000000 steps (18%)
23:05:38:WU02:FS00:0xa4:Completed 83200 out of 160000 steps  (52%)
23:05:44:WU03:FS01:0x21:Completed 590000 out of 1000000 steps (59%)
23:06:48:WU02:FS00:0xa4:Completed 84800 out of 160000 steps  (53%)
23:07:35:WU00:FS03:0x21:Completed 4900000 out of 5000000 steps (98%)
23:08:03:WU02:FS00:0xa4:Completed 86400 out of 160000 steps  (54%)
23:08:28:WU01:FS02:0x21:Completed 380000 out of 2000000 steps (19%)
23:08:49:WU03:FS01:0x21:Completed 600000 out of 1000000 steps (60%)
23:09:18:WU02:FS00:0xa4:Completed 88000 out of 160000 steps  (55%)
23:10:30:WU02:FS00:0xa4:Completed 89600 out of 160000 steps  (56%)
23:11:04:WU00:FS03:0x21:Completed 4950000 out of 5000000 steps (99%)
23:11:12:WU01:FS02:0x21:Completed 400000 out of 2000000 steps (20%)
23:11:43:WU02:FS00:0xa4:Completed 91200 out of 160000 steps  (57%)
23:12:04:WU03:FS01:0x21:Completed 610000 out of 1000000 steps (61%)
23:12:53:WU02:FS00:0xa4:Completed 92800 out of 160000 steps  (58%)
23:13:55:WU01:FS02:0x21:Completed 420000 out of 2000000 steps (21%)
23:14:06:WU02:FS00:0xa4:Completed 94400 out of 160000 steps  (59%)
23:14:33:WU00:FS03:0x21:Completed 5000000 out of 5000000 steps (100%)
23:14:34:WU05:FS03:Connecting to 171.67.108.45:80
23:14:34:WU05:FS03:Assigned to work server 171.64.65.92
23:14:34:WU05:FS03:Requesting new work unit for slot 03: RUNNING gpu:2:GM200 [GeForce GTX 980 Ti] from 171.64.65.92
23:14:34:WU05:FS03:Connecting to 171.64.65.92:8080
23:14:35:WU05:FS03:Downloading 2.85MiB
23:14:35:WU00:FS03:0x21:Saving result file logfile_01.txt
23:14:35:WU00:FS03:0x21:Saving result file checkpointState.xml
23:14:35:WU05:FS03:Download complete
23:14:36:WU05:FS03:Received Unit: id:05 state:DOWNLOAD error:NO_ERROR project:9162 run:163 clone:0 gen:352 core:0x18 unit:0x00000194ab40415c56748147b6c09d56
23:14:37:WU00:FS03:0x21:Saving result file checkpt.crc
23:14:37:WU00:FS03:0x21:Saving result file log.txt
23:14:37:WU00:FS03:0x21:Saving result file positions.xtc
23:14:38:WU00:FS03:FahCore returned: FINISHED_UNIT (100 = 0x64)
23:14:38:WU00:FS03:Sending unit results: id:00 state:SEND error:NO_ERROR project:11423 run:5 clone:40 gen:8 core:0x21 unit:0x0000000f8ca304f1571a748ccdcc2acf
23:14:38:WU00:FS03:Uploading 9.35MiB to 140.163.4.241
23:14:38:WU00:FS03:Connecting to 140.163.4.241:8080
23:14:38:WU05:FS03:Starting
23:14:38:WU05:FS03:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/web.stanford.edu/~pande/Linux/AMD64/NVIDIA/Fermi/Core_18.fah/FahCore_18 -dir 05 -suffix 01 -version 704 -lifeline 1812 -checkpoint 30 -gpu 2 -gpu-vendor nvidia
23:14:38:WU05:FS03:Started FahCore on PID 13340
23:14:38:WU05:FS03:Core PID:13344
23:14:38:WU05:FS03:FahCore 0x18 started
23:14:39:WU05:FS03:0x18:*********************** Log Started 2016-07-17T23:14:38Z ***********************
23:14:39:WU05:FS03:0x18:Project: 9162 (Run 163, Clone 0, Gen 352)
23:14:39:WU05:FS03:0x18:Unit: 0x00000194ab40415c56748147b6c09d56
23:14:39:WU05:FS03:0x18:CPU: 0x00000000000000000000000000000000
23:14:39:WU05:FS03:0x18:Machine: 3
23:14:39:WU05:FS03:0x18:Reading tar file core.xml
23:14:39:WU05:FS03:0x18:Reading tar file system.xml
23:14:39:WU05:FS03:0x18:Reading tar file integrator.xml
23:14:39:WU05:FS03:0x18:Reading tar file state.xml
23:14:39:WU05:FS03:0x18:Digital signatures verified
23:14:39:WU05:FS03:0x18:Folding@home GPU core18
23:14:39:WU05:FS03:0x18:Version 0.0.4
23:14:44:WU00:FS03:Upload 39.45%
23:14:50:WU00:FS03:Upload 76.22%
23:14:51:WU05:FS03:0x18:Completed 0 out of 2500000 steps (0%)
23:14:51:WU05:FS03:0x18:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
23:15:04:WU00:FS03:Upload complete
23:15:04:WU00:FS03:Server responded WORK_ACK (400)
23:15:04:WU00:FS03:Final credit estimate, 68114.00 points
23:15:04:WU00:FS03:Cleaning up
23:15:11:WU03:FS01:0x21:Completed 620000 out of 1000000 steps (62%)
23:15:17:WU02:FS00:0xa4:Completed 96000 out of 160000 steps  (60%)
23:16:29:WU05:FS03:0x18:Completed 25000 out of 2500000 steps (1%)
23:16:31:WU02:FS00:0xa4:Completed 97600 out of 160000 steps  (61%)
23:16:39:WU01:FS02:0x21:Completed 440000 out of 2000000 steps (22%)
23:17:41:WU02:FS00:0xa4:Completed 99200 out of 160000 steps  (62%)
23:18:03:WU05:FS03:0x18:Completed 50000 out of 2500000 steps (2%)
23:18:16:WU03:FS01:0x21:Completed 630000 out of 1000000 steps (63%)
23:18:53:WU02:FS00:0xa4:Completed 100800 out of 160000 steps  (63%)
23:19:22:WU01:FS02:0x21:Completed 460000 out of 2000000 steps (23%)
23:19:37:WU05:FS03:0x18:Completed 75000 out of 2500000 steps (3%)
23:20:02:WU02:FS00:0xa4:Completed 102400 out of 160000 steps  (64%)
23:21:11:WU02:FS00:0xa4:Completed 104000 out of 160000 steps  (65%)
23:21:12:WU05:FS03:0x18:Completed 100000 out of 2500000 steps (4%)
23:21:21:WU03:FS01:0x21:Completed 640000 out of 1000000 steps (64%)
23:22:06:WU01:FS02:0x21:Completed 480000 out of 2000000 steps (24%)
23:22:22:WU02:FS00:0xa4:Completed 105600 out of 160000 steps  (66%)
23:22:50:WU05:FS03:0x18:Completed 125000 out of 2500000 steps (5%)
23:23:31:WU02:FS00:0xa4:Completed 107200 out of 160000 steps  (67%)
23:24:24:WU05:FS03:0x18:Completed 150000 out of 2500000 steps (6%)
23:24:25:WU03:FS01:0x21:Completed 650000 out of 1000000 steps (65%)
23:24:41:WU02:FS00:0xa4:Completed 108800 out of 160000 steps  (68%)
23:24:50:WU01:FS02:0x21:Completed 500000 out of 2000000 steps (25%)
23:25:54:WU02:FS00:0xa4:Completed 110400 out of 160000 steps  (69%)
23:25:58:WU05:FS03:0x18:Completed 175000 out of 2500000 steps (7%)
23:27:08:WU02:FS00:0xa4:Completed 112000 out of 160000 steps  (70%)
23:27:32:WU05:FS03:0x18:Completed 200000 out of 2500000 steps (8%)
23:27:40:WU03:FS01:0x21:Completed 660000 out of 1000000 steps (66%)
23:27:40:WU01:FS02:0x21:Completed 520000 out of 2000000 steps (26%)
23:28:20:WU02:FS00:0xa4:Completed 113600 out of 160000 steps  (71%)
23:29:10:WU05:FS03:0x18:Completed 225000 out of 2500000 steps (9%)
23:29:30:WU02:FS00:0xa4:Completed 115200 out of 160000 steps  (72%)
23:30:24:WU01:FS02:0x21:Completed 540000 out of 2000000 steps (27%)
23:30:41:WU02:FS00:0xa4:Completed 116800 out of 160000 steps  (73%)
23:30:45:WU05:FS03:0x18:Completed 250000 out of 2500000 steps (10%)
23:30:45:WU03:FS01:0x21:Completed 670000 out of 1000000 steps (67%)
23:31:52:WU04:FS02:Sending unit results: id:04 state:SEND error:NO_ERROR project:9209 run:41 clone:7 gen:14 core:0x21 unit:0x0000002e664f2dd056fb27a1d1bc4ae7
23:31:52:WU04:FS02:Uploading 37.95MiB to 171.64.65.104
23:31:52:WU04:FS02:Connecting to 171.64.65.104:8080
23:31:53:WARNING:WU04:FS02:WorkServer connection failed on port 8080 trying 80
23:31:53:WU04:FS02:Connecting to 171.64.65.104:80
23:31:53:WU02:FS00:0xa4:Completed 118400 out of 160000 steps  (74%)
23:31:54:WARNING:WU04:FS02:Exception: Failed to send results to work server: Failed to connect to 171.64.65.104:80: Connection refused
23:31:54:WU04:FS02:Trying to send results to collection server
23:31:54:WU04:FS02:Uploading 37.95MiB to 171.65.103.160
23:31:54:WU04:FS02:Connecting to 171.65.103.160:8080
23:32:00:WU04:FS02:Upload 3.13%
23:32:06:WU04:FS02:Upload 8.73%
23:32:12:WU04:FS02:Upload 16.96%
23:32:18:WU04:FS02:Upload 24.71%
23:32:19:WU05:FS03:0x18:Completed 275000 out of 2500000 steps (11%)
23:32:24:WU04:FS02:Upload 33.11%
23:32:30:WU04:FS02:Upload 40.85%
23:32:36:WU04:FS02:Upload 49.25%
23:32:42:WU04:FS02:Upload 56.99%
23:32:48:WU04:FS02:Upload 65.22%
23:32:54:WU04:FS02:Upload 72.96%
23:33:00:WU04:FS02:Upload 81.36%
23:33:06:WU04:FS02:Upload 89.10%
23:33:06:WU02:FS00:0xa4:Completed 120000 out of 160000 steps  (75%)
23:33:07:WU01:FS02:0x21:Completed 560000 out of 2000000 steps (28%)
23:33:12:WU04:FS02:Upload 97.34%
23:33:15:WU04:FS02:Upload complete
23:33:15:WU04:FS02:Server responded PLEASE_WAIT (464)
23:33:15:WARNING:WU04:FS02:Failed to send results, will try again later
23:33:50:WU03:FS01:0x21:Completed 680000 out of 1000000 steps (68%)
23:33:53:WU05:FS03:0x18:Completed 300000 out of 2500000 steps (12%)
23:34:17:WU02:FS00:0xa4:Completed 121600 out of 160000 steps  (76%)
23:35:27:WU02:FS00:0xa4:Completed 123200 out of 160000 steps  (77%)
23:35:31:WU05:FS03:0x18:Completed 325000 out of 2500000 steps (13%)
23:35:51:WU01:FS02:0x21:Completed 580000 out of 2000000 steps (29%)
23:36:37:WU02:FS00:0xa4:Completed 124800 out of 160000 steps  (78%)
23:36:55:WU03:FS01:0x21:Completed 690000 out of 1000000 steps (69%)
23:37:06:WU05:FS03:0x18:Completed 350000 out of 2500000 steps (14%)
23:37:47:WU02:FS00:0xa4:Completed 126400 out of 160000 steps  (79%)
23:38:35:WU01:FS02:0x21:Completed 600000 out of 2000000 steps (30%)
23:38:39:WU05:FS03:0x18:Completed 375000 out of 2500000 steps (15%)
23:38:57:WU02:FS00:0xa4:Completed 128000 out of 160000 steps  (80%)
23:39:59:WU03:FS01:0x21:Completed 700000 out of 1000000 steps (70%)
23:40:10:WU02:FS00:0xa4:Completed 129600 out of 160000 steps  (81%)
23:40:13:WU05:FS03:0x18:Completed 400000 out of 2500000 steps (16%)
23:41:19:WU01:FS02:0x21:Completed 620000 out of 2000000 steps (31%)
23:41:21:WU02:FS00:0xa4:Completed 131200 out of 160000 steps  (82%)
23:41:51:WU05:FS03:0x18:Completed 425000 out of 2500000 steps (17%)
23:42:34:WU02:FS00:0xa4:Completed 132800 out of 160000 steps  (83%)
23:43:16:WU03:FS01:0x21:Completed 710000 out of 1000000 steps (71%)
23:43:25:WU05:FS03:0x18:Completed 450000 out of 2500000 steps (18%)
23:43:46:WU02:FS00:0xa4:Completed 134400 out of 160000 steps  (84%)
23:44:09:WU01:FS02:0x21:Completed 640000 out of 2000000 steps (32%)
23:44:59:WU02:FS00:0xa4:Completed 136000 out of 160000 steps  (85%)
23:44:59:WU05:FS03:0x18:Completed 475000 out of 2500000 steps (19%)
23:46:11:WU02:FS00:0xa4:Completed 137600 out of 160000 steps  (86%)
I notice the log first names 171.64.65.104, which is the work server involved. It appears to be trying to send the results back to the work server instead of the collection server, which is the 171.65.103.160 address listed above. Is that right? Shouldn't it go back to the collection server? I just checked, and the same thing is happening on the other system: the work server is 171.64.65.104, with collection server 171.65.103.160.

Mod edit: added Code tags to log, and merged with existing topic on WS 171.64.65.104 - j
Joe_H
Site Admin
Posts: 7868
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Failed to connect to 171.64.65.104:80

Post by Joe_H »

All WUs are returned by preference to the WS that assigned them. Only if the return to the WS fails is the CS used. Many projects do not have a designated CS; they only return to the WS. Those will show a 0.0.0.0 address in the CS field.
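As a rough illustration of that return order, here is a minimal Python sketch. It is not the client's actual code: the function names and the `try_upload` callable are made up for the example, and only the WS-first rule and the 0.0.0.0 convention come from the description above.

```python
NO_COLLECTION_SERVER = "0.0.0.0"


def return_targets(work_server: str, collection_server: str) -> list:
    """Order in which results are offered: WS first, CS only as a fallback."""
    targets = [work_server]
    # A CS field of 0.0.0.0 means the project has no designated CS.
    if collection_server and collection_server != NO_COLLECTION_SERVER:
        targets.append(collection_server)
    return targets


def send_results(work_server, collection_server, try_upload):
    """Return the first server that accepts the upload, or None if all fail.

    try_upload is a hypothetical callable(host) -> bool standing in for
    the real upload step.
    """
    for host in return_targets(work_server, collection_server):
        if try_upload(host):
            return host
    return None


print(return_targets("171.64.65.104", "171.65.103.160"))
print(return_targets("171.64.65.104", "0.0.0.0"))
```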

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
7im
Posts: 10189
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona
Contact:

Re: Failed to connect to 171.64.65.104:80

Post by 7im »

Is that server message PLEASE WAIT a new type of message?
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
PS3EdOlkkola
Posts: 184
Joined: Tue Aug 26, 2014 9:48 pm
Hardware configuration: 10 SMP folding slots on Intel Phi "Knights Landing" system, configured as 24 CPUs/slot
9 AMD GPU folding slots
31 Nvidia GPU folding slots
50 total folding slots
Average PPD/slot = 459,500
Location: Dallas, TX

Re: Failed to connect to 171.64.65.104:80

Post by PS3EdOlkkola »

I now have 6 rigs unable to connect to the proper collection server. The collection server each of my rigs is supposed to connect to has a Status of "full" and a Connect condition of "reject". All of these are big work units, worth between 150,000 and 175,000 points each. Jadeshi, who is responsible for the full servers, needs to get their act together and free up some space to accept work units. Unless he/she does it soon, these processed work units will time out and the whole project will be set back, not to mention the more than a million points in one day that get flushed.

Seriously, is it really that hard to monitor your own servers to ensure they are configured correctly and maintained? Yes, I'm getting annoyed, and rightly so.
Hardware config viewtopic.php?f=66&t=17997&p=277235#p277235
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Failed to connect to 171.64.65.104:80

Post by bruce »

I'd be annoyed, too, but Stanford has never had very good coverage on weekends. I don't know how many problems I've seen that were fixed on a Monday morning. I do see a lot of servers which have a status of either DOWN or Reject.

The Collection Servers can only recover a certain percentage of WUs associated with a Work Server that has failed (and that often results in a PLEASE WAIT message). My hunch is that the problem with the CS will go away once the WS is back on-line.
Simplex0
Posts: 69
Joined: Sun Oct 06, 2013 10:35 am

Re: Failed to connect to 171.64.65.104:80

Post by Simplex0 »

Same problem here

"

Code:

08:03:22:WU00:FS02:Uploading 37.94MiB to 171.64.65.104
08:03:22:WU00:FS02:Connecting to 171.64.65.104:8080
08:03:24:WARNING:WU00:FS02:WorkServer connection failed on port 8080 trying 80
08:03:24:WU00:FS02:Connecting to 171.64.65.104:80
08:03:32:WU00:FS02:Upload 2.31%
08:03:38:WU00:FS02:Upload 4.12%
08:03:44:WU00:FS02:Upload 5.93%
08:03:50:WU00:FS02:Upload 7.91%
08:03:56:WU00:FS02:Upload 9.72%
08:04:02:WU00:FS02:Upload 11.53%
08:04:08:WU00:FS02:Upload 13.34%
08:04:14:WU00:FS02:Upload 15.15%
08:04:20:WU00:FS02:Upload 17.13%
08:04:26:WU00:FS02:Upload 18.94%
08:04:26:WU04:FS01:0x21:Completed 12800 out of 640000 steps (2%)
08:04:32:WU00:FS02:Upload 20.76%
08:04:38:WU00:FS02:Upload 22.57%
08:04:44:WU00:FS02:Upload 24.38%
08:04:50:WU00:FS02:Upload 26.36%
08:04:56:WU00:FS02:Upload 27.84%
08:05:02:WU00:FS02:Upload 29.65%
08:05:08:WU00:FS02:Upload 31.46%
08:05:08:WU01:FS02:0x21:Completed 1400000 out of 2500000 steps (56%)
08:05:14:WU00:FS02:Upload 33.27%
08:05:20:WU00:FS02:Upload 35.25%
08:05:26:WU00:FS02:Upload 37.06%
08:05:32:WU00:FS02:Upload 38.87%
08:05:38:WU00:FS02:Upload 40.69%
08:05:44:WU00:FS02:Upload 42.50%
08:05:50:WU00:FS02:Upload 44.48%
08:05:56:WU00:FS02:Upload 46.12%
08:06:02:WU00:FS02:Upload 48.10%
08:06:08:WU00:FS02:Upload 49.91%
08:06:14:WU00:FS02:Upload 51.72%
08:06:20:WU00:FS02:Upload 53.54%
08:06:26:WU00:FS02:Upload 55.51%
08:06:32:WU00:FS02:Upload 57.32%
08:06:38:WU00:FS02:Upload 59.14%
08:06:44:WU00:FS02:Upload 60.95%
08:06:48:WU04:FS01:0x21:Completed 19200 out of 640000 steps (3%)
08:06:50:WU00:FS02:Upload 62.92%
08:06:56:WU00:FS02:Upload 64.74%
08:07:02:WU00:FS02:Upload 66.55%
08:07:08:WU00:FS02:Upload 68.36%
08:07:14:WU00:FS02:Upload 70.17%
08:07:21:WU00:FS02:Upload 72.15%
08:07:27:WU00:FS02:Upload 74.29%
08:07:33:WU00:FS02:Upload 76.10%
08:07:38:WU01:FS02:0x21:Completed 1425000 out of 2500000 steps (57%)
08:07:39:WU00:FS02:Upload 77.91%
08:07:45:WU00:FS02:Upload 79.73%
08:07:51:WU00:FS02:Upload 81.54%
08:07:57:WU00:FS02:Upload 83.35%
08:08:03:WU00:FS02:Upload 85.33%
08:08:09:WU00:FS02:Upload 87.14%
08:08:15:WU00:FS02:Upload 88.95%
08:08:21:WU00:FS02:Upload 90.76%
08:08:27:WU00:FS02:Upload 92.57%
08:08:33:WU00:FS02:Upload 94.55%
08:08:39:WU00:FS02:Upload 96.36%
08:08:45:WU00:FS02:Upload 98.01%
08:08:51:WU00:FS02:Upload 99.82%
08:08:52:WU00:FS02:Upload complete
08:08:52:WU00:FS02:Server responded PLEASE_WAIT (464)
08:08:52:WARNING:WU00:FS02:Failed to send results, will try again later
"

Mod edit: added Code tags to log file
tofuwombat
Posts: 19
Joined: Mon Nov 22, 2010 4:06 pm

Re: Failed to connect to 171.64.65.104:80

Post by tofuwombat »

bruce wrote:I'd be annoyed, too, but Stanford has never had very good coverage on weekends. I don't know how many problems I've seen that were fixed on a Monday morning. I do see a lot of servers which have a status of either DOWN or Reject.

The Collection Servers can only recover a certain percentage of WUs associated with a Work Server that has failed (and that often results in a PLEASE WAIT message). My hunch is that the problem with the CS will go away once the WS is back on-line.
Having similar "PLEASE WAIT" issue.

Thanks for this insight.

I will wait.
Simplex0
Posts: 69
Joined: Sun Oct 06, 2013 10:35 am

Re: Failed to connect to 171.64.65.104:80

Post by Simplex0 »

Does this mean that the server's hard drives are full?
"
Mon Jul 18 09:00:25 PDT 2016 171.64.65.104 vspg14b jadeshi GPU full Reject 0.00 0 0 50541 9913 -1563 0 0 0 - - - - - 0 0 - - 1 - 0 0 WL; WL; 10000, 10000 7.0, 7.0 - 49, 49 64, 64 - - 2, 1 B, B 8080G, 8080G
"

http://fah-web.stanford.edu/pybeta/logs ... 4.log.html
Joe_H
Site Admin
Posts: 7868
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Failed to connect to 171.64.65.104:80

Post by Joe_H »

Maybe, or it might be another problem with the WS. The "full" in that status line refers to whether the server will both assign and receive WUs; the "Reject" means that it was not accepting connections at that time but should have been. Currently the status is "standby" and "Not accept".
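For anyone who wants to pick those two fields out of a pasted status line, here is a minimal Python sketch. The field names are assumptions based on how the columns have been described in this topic (Status, Connect, and the person responsible), not an official layout, and only the leading fields are handled.

```python
def parse_status_prefix(line: str) -> dict:
    """Split the leading fields of a server status line into named parts."""
    tokens = line.split()
    if len(tokens) < 12:
        raise ValueError("status line is shorter than expected")
    return {
        # The first six tokens form the timestamp, e.g. "Mon Jul 18 09:00:25 PDT 2016".
        "timestamp": " ".join(tokens[:6]),
        "address": tokens[6],
        "name": tokens[7],     # assumed: server name
        "owner": tokens[8],    # assumed: person responsible for the server
        "type": tokens[9],     # e.g. GPU
        "status": tokens[10],  # e.g. full, standby
        "connect": tokens[11], # e.g. Reject
    }


sample = (
    "Mon Jul 18 09:00:25 PDT 2016 171.64.65.104 vspg14b jadeshi "
    "GPU full Reject 0.00 0 0 50541 9913 -1563"
)
fields = parse_status_prefix(sample)
print(fields["status"], fields["connect"])
```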

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3