Send Errors - 155.247.164.213 & .214

Moderators: Site Moderators, FAHC Science Team

sswilson
Posts: 90
Joined: Mon Dec 17, 2007 12:34 am
Hardware configuration: ASUS Crosshair IV Formula / AMD 1090T / 4X2 Gig GSkill Pi PC3-12800 / Corsair TX750W PSU / Sparkle GTX275 Plus / CoolerMaster Cosmos S / MCP655 WC Pump / MCR320 Rad / 6X Yate Loons / PA120.1 / 2X Scythe Ultra Kaze / Enzotech Luna WB / Dell Ultrasharp 2209WA

Gigabyte P35-DQ6 / Q6600 / 2X 1G 1066 Firestix / "Baked" XFX GTX 280 (RIP again :( ) / MSI GTS 450 Cyclone OC /PC P&C 750W Silencer / MCR220-QP-Res / DD DDCPX-Pro / Apogee GT / Highspeed PC Tech Station / Samsung 931BF / BenQ Q9T4
Location: Moncton, New Brunswick, Canada

Re: Send Errors - 155.247.164.213 & .214

Post by sswilson »

alxbelu wrote:
sswilson wrote:My .214 unit cleared... haven't gone back through the log yet to see if it uploaded or timed out.....

Edit: Pretty sure I tracked it down in the log. It took a total of 36 minutes to upload the results in 0.15-0.50% increments, which suggests the server was being absolutely hammered with uploads. :)

It took a total of 276 individual log entries to complete.

High praise to the techs who finally got this resolved. :)
What was the PRCG? I checked the one you mentioned in this thread (11758 (0, 2135, 0)), which seems to still not have been successfully uploaded: https://apps.foldingathome.org/wu#proje ... 2135&gen=0
I'll have a closer look after supper (and maybe upload the whole log file). It wasn't logged the way it normally would be: the log stopped listing attempts to connect to .214 and then referred only to the WU for several attempts before it seemed to upload.

I don't think there's a way for me to grab the actual WU now that it's no longer showing on the client?
alxbelu
Posts: 109
Joined: Sat Mar 14, 2020 6:28 pm

Re: Send Errors - 155.247.164.213 & .214

Post by alxbelu »

AFAIK the log should read something like this:

19:44:09:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:11762 run:0 clone:9927 gen:12 core:0x22 unit:0x0000001080fccb0a5e7113ba81c3f381
19:44:09:WU00:FS01:Uploading 33.02MiB to 128.252.203.10
19:44:09:WU00:FS01:Connecting to 128.252.203.10:8080
19:44:33:WU00:FS01:Upload 1.14%
19:44:40:WU00:FS01:Upload 29.72%
19:44:46:WU00:FS01:Upload complete
19:44:46:WU00:FS01:Server responded WORK_ACK (400)
In this case the PRCG (Project, Run, Clone, Gen) for folding slot FS01 and work unit WU00 is 11762 (0, 9927, 12).
The "WORK_ACK" line carries the same FS01 and WU00 tags, so I know which specific project was successfully uploaded to which server.
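For anyone who wants to pull this out of a long log without scrolling, here is a minimal Python sketch. It assumes the line layout matches the excerpt above; the regular expressions and the function name `acknowledged_prcgs` are my own and not part of any F@H tool.

```python
import re

# Matches the "Sending unit results" line as it appears in the excerpt above.
SEND_RE = re.compile(
    r"^\d{2}:\d{2}:\d{2}:(?P<wu>WU\d+):(?P<fs>FS\d+):Sending unit results:"
    r".*?project:(?P<project>\d+) run:(?P<run>\d+)"
    r" clone:(?P<clone>\d+) gen:(?P<gen>\d+)"
)
# Matches the acknowledgement line for the same WU/FS pair.
ACK_RE = re.compile(
    r"^\d{2}:\d{2}:\d{2}:(?P<wu>WU\d+):(?P<fs>FS\d+):Server responded WORK_ACK"
)


def acknowledged_prcgs(lines):
    """Return a list of (wu, fs, (project, run, clone, gen)) for each WORK_ACK."""
    pending = {}  # (wu, fs) -> PRCG from the most recent send attempt
    acked = []
    for line in lines:
        line = line.strip()
        m = SEND_RE.match(line)
        if m:
            prcg = tuple(int(m[k]) for k in ("project", "run", "clone", "gen"))
            pending[(m["wu"], m["fs"])] = prcg
            continue
        m = ACK_RE.match(line)
        if m and (m["wu"], m["fs"]) in pending:
            acked.append((m["wu"], m["fs"], pending.pop((m["wu"], m["fs"]))))
    return acked
```

Feed it the lines of your log file (for example `open("log.txt").readlines()`, where the file name is just a placeholder) and it lists only the units the server actually acknowledged.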
Official F@H Twitter (frequently updated): https://twitter.com/foldingathome
Official F@H Facebook: https://www.facebook.com/Foldinghome-136059519794607/

(I'm not affiliated with the F@H Team, just promoting these channels for official updates)
sswilson
Posts: 90
Joined: Mon Dec 17, 2007 12:34 am
Hardware configuration: ASUS Crosshair IV Formula / AMD 1090T / 4X2 Gig GSkill Pi PC3-12800 / Corsair TX750W PSU / Sparkle GTX275 Plus / CoolerMaster Cosmos S / MCP655 WC Pump / MCR320 Rad / 6X Yate Loons / PA120.1 / 2X Scythe Ultra Kaze / Enzotech Luna WB / Dell Ultrasharp 2209WA

Gigabyte P35-DQ6 / Q6600 / 2X 1G 1066 Firestix / "Baked" XFX GTX 280 (RIP again :( ) / MSI GTS 450 Cyclone OC /PC P&C 750W Silencer / MCR220-QP-Res / DD DDCPX-Pro / Apogee GT / Highspeed PC Tech Station / Samsung 931BF / BenQ Q9T4
Location: Moncton, New Brunswick, Canada

Re: Send Errors - 155.247.164.213 & .214

Post by sswilson »

15:15:20:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:11758 run:0 clone:2135 gen:0 core:0x22 unit:0x000000019bf7a4d55e6d7715c140c08d
15:15:20:WU01:FS01:Uploading 55.24MiB to 155.247.164.213
15:15:20:WU01:FS01:Connecting to 155.247.164.213:8080
15:15:20:WARNING:WU01:FS01:Exception: Failed to send results to work server: Transfer failed
15:15:20:WU01:FS01:Trying to send results to collection server
15:15:20:WU01:FS01:Uploading 55.24MiB to 155.247.164.214
15:15:20:WU01:FS01:Connecting to 155.247.164.214:8080
15:15:20:ERROR:WU01:FS01:Exception: Transfer failed
If I'm reading this right, this was one of the early attempts to upload it in this log (kinda confused as to why it's listing .213 first and then switching to .214 before it claims the transfer failed).

17:15:43:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:11758 run:0 clone:2135 gen:0 core:0x22 unit:0x000000019bf7a4d55e6d7715c140c08d
17:15:43:WU01:FS01:Uploading 55.24MiB to 155.247.164.213
17:15:43:WU01:FS01:Connecting to 155.247.164.213:8080
17:15:43:WARNING:WU01:FS01:Exception: Failed to send results to work server: Transfer failed
17:15:43:WU01:FS01:Trying to send results to collection server
17:15:43:WU01:FS01:Uploading 55.24MiB to 155.247.164.214
17:15:43:WU01:FS01:Connecting to 155.247.164.214:8080
17:15:43:ERROR:WU01:FS01:Exception: Transfer failed
This is the last reference to uploading to .214 or .213 that I can find in the log.

Oops.... yeah found this....

18:16:00:WARNING:WU01:FS01:Past final deadline 2020-03-24T18:15:59Z, dumping
18:16:00:WU01:FS01:Cleaning up
I guess that's where it dumped it, and I was looking at the wrong WU (one that took forever to upload), thinking it was the orphaned one.

Either way... it's gone. :)
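In case it saves someone else the same confusion, here is a rough Python sketch that reports the last significant event per WU in a log, so a dumped unit is not mistaken for an uploaded one. The message prefixes are taken from the excerpts in this thread; the function name `wu_outcomes` and the outcome labels are made up for illustration. Queue ids such as WU01 may be reused for later units, so treat the result as "what happened last", not as a full history.

```python
import re

# time, optional WARNING:/ERROR: level, WU id, FS id, then the message text.
LINE_RE = re.compile(
    r"^\d{2}:\d{2}:\d{2}:(?:(?:WARNING|ERROR):)?"
    r"(?P<wu>WU\d+):(?P<fs>FS\d+):(?P<msg>.*)$"
)


def wu_outcomes(lines):
    """Map (wu, fs) to the last significant event seen for that pair."""
    outcomes = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        key = (m["wu"], m["fs"])
        msg = m["msg"]
        if msg.startswith("Server responded WORK_ACK"):
            outcomes[key] = "uploaded"
        elif msg.startswith("Past final deadline"):
            outcomes[key] = "dumped"
        elif "Transfer failed" in msg:
            outcomes[key] = "failing"
    return outcomes
```

Run against the three excerpts above, the only entry for WU01/FS01 ends up as "dumped", which matches what I eventually found by hand.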
evilgizmo2352
Posts: 1
Joined: Wed Mar 25, 2020 9:16 pm

Re: Send Errors - 155.247.164.213 & .214

Post by evilgizmo2352 »

Did a little digging and found that both 155.247.164.213 and .214 are set to ASSIGN, or so it says on the F@H Server list page. I have no idea how to let them know. If anybody knows how to contact voelz at Temple edu, please do.
EHoops
Posts: 1
Joined: Wed Mar 25, 2020 10:49 pm

Re: Send Errors - 155.247.164.213 & .214

Post by EHoops »

I am also having this issue. Is there anything to do besides keep retrying the upload? It also seems like this is affecting multiple people. Is there a less busy time for sending? I could pause until then.
_r2w_ben
Posts: 285
Joined: Wed Apr 23, 2008 3:11 pm

Re: Send Errors - 155.247.164.213 & .214

Post by _r2w_ben »

davidcoton wrote:Unreturned WUs should not be reassigned until they expire. Faulty WUs that are returned are reassigned, up to (IIRC) three failures.
If there is clear evidence that unreturned WUs are being reassigned before they expire, we need to collect enough info to determine the extent/pattern of this problem and then get the team further involved.
Work units can be reassigned once the timeout is reached, in order to keep the project moving forward. If the first assigned machine finishes after the timeout but before the expiry, it will still be awarded points. The second assigned machine will also be awarded points upon completion.

Whichever machine returns the result first triggers the creation of the next work unit, with the same Project, Run, and Clone but with Gen + 1.
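To make the two deadlines concrete, here is a small Python sketch of the rules as I described them. It is only a model of my description, not the actual assignment or work server logic; the function names and labels are invented, and treating a return at exactly the deadline as still inside the window is an assumption.

```python
from datetime import datetime


def classify_return(returned, timeout, expiry):
    """Say where a returned result falls relative to the two deadlines."""
    if returned <= timeout:
        # Returned in time; the WU should not have been reassigned yet.
        return "before timeout"
    if returned <= expiry:
        # Still credited, but the WU may already have gone to a second machine.
        return "after timeout, before expiry"
    return "after expiry"


def next_prcg(prcg):
    """First returned result leads to the same Project/Run/Clone with Gen + 1."""
    project, run, clone, gen = prcg
    return (project, run, clone, gen + 1)
```

For example, `next_prcg((11758, 0, 2135, 0))` gives `(11758, 0, 2135, 1)`, and `classify_return` takes three `datetime` values (return time, timeout, expiry) read from wherever you have them.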
AEM
Posts: 6
Joined: Sun Mar 15, 2020 5:57 pm

Re: Send Errors - 155.247.164.213 & .214

Post by AEM »

I just saw mine disappear. I think it finally got sent.
Darth_Peter_dualxeon
Posts: 51
Joined: Fri Mar 20, 2020 3:13 am
Hardware configuration: EVGA SR-2 motherboard
2x Xeon x5670 CPU
64 GB ECC DDR3
Nvidia RTX 2070

Re: Send Errors - 155.247.164.213 & .214

Post by Darth_Peter_dualxeon »

Yes, it was finally fixed. My two work units finally uploaded today at 06:45 (in the FAHClient's time zone).
qk7b
Posts: 4
Joined: Sat Mar 21, 2020 12:12 pm
Hardware configuration: CPU: Intel (R) Core(TM) i5-9400F CPU @ 2.90GHz
CPU ID: GenuineIntel Family 6 Model 158 Stepping 10
Memory: 15.93GiB
GPU: nVidia GeForce GTX 1660 TI
Location: Lille, France

Re: Send Errors - 155.247.164.213 & .214

Post by qk7b »

qk7b wrote:Same error here, just finished a WU for 11758 and got exactly the same output.
155.247.164.214 seems to be in real pain, good luck guys.

As alxbelu said earlier, it looks like more than one of us is on this specific PRCG (sorry, I don't quite understand how that works yet).

Mine is PRCG 11758 (0, 526, 0)

It looks like there are already 2 Faulty WUs completed, assigned on 2020-03-23, whereas mine was assigned on 2020-03-20.
At this point I don't know whether I should let it go or clear it in some way.

The expiration date is on 2020-03-28 anyway, so I can still wait till this date.


EDIT: Added info on PRCG
Same for me here, everything was sent overnight!

Great work!
sc00p
Posts: 4
Joined: Sat Jan 26, 2008 9:04 pm

Re: Send Errors - 155.247.164.213 & .214

Post by sc00p »

Oh, I've had one WU stuck on my client for a while now. I checked the WU status for PRCG 11758 (0,2822,0) and it says "Not found".

Should I conclude that this specific project has been omitted, or what?
scerbera
Posts: 34
Joined: Thu Mar 12, 2020 11:38 pm

Re: Send Errors - 155.247.164.213 & .214

Post by scerbera »

Mine are being sent now, great news!
vangli
Posts: 12
Joined: Thu Mar 19, 2020 10:35 am

Re: Send Errors - 155.247.164.213 & .214

Post by vangli »

My three are also uploading. I see that the software version on the servers has changed from 9.5.6 to 9.6.
Regards
Bent Vangli, Oslo, Norway
pachydermus
Posts: 17
Joined: Tue Mar 24, 2020 11:06 am

Re: Send Errors - 155.247.164.213 & .214

Post by pachydermus »

Uploaded here. Now, instead, my machines have been waiting 12 hours for a new work unit to process. Fantastic. :x
sc00p
Posts: 4
Joined: Sat Jan 26, 2008 9:04 pm

Re: Send Errors - 155.247.164.213 & .214

Post by sc00p »

vangli wrote:I see that the software version on the servers has changed from 9.5.6 to 9.6.
Nice find there^ :)


And my WU has gone too it seems...
alxbelu
Posts: 109
Joined: Sat Mar 14, 2020 6:28 pm

Re: Send Errors - 155.247.164.213 & .214

Post by alxbelu »

sc00p wrote:Oh, I've had one WU stuck on my client for a while now. I checked the WU status for PRCG 11758 (0,2822,0) and it says "Not found".

Should I conclude that this specific project has been omitted, or what?
Not sure, but it looks the same for the one WU I had left: https://apps.foldingathome.org/wu#proje ... 1460&gen=0

07:17:58:WU03:FS01:Sending unit results: id:03 state:SEND error:NO_ERROR project:11758 run:0 clone:1460 gen:0 core:0x22 unit:0x000000069bf7a4d55e6d77136df445dd
07:17:58:WU03:FS01:Uploading 55.24MiB to 155.247.164.213
07:17:58:WU03:FS01:Connecting to 155.247.164.213:8080
07:18:20:WARNING:WU03:FS01:WorkServer connection failed on port 8080 trying 80
07:18:20:WU03:FS01:Connecting to 155.247.164.213:80
07:18:41:WARNING:WU03:FS01:Exception: Failed to send results to work server: Failed to connect to 155.247.164.213:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
07:18:41:WU03:FS01:Trying to send results to collection server
07:18:41:WU03:FS01:Uploading 55.24MiB to 155.247.164.214
07:18:41:WU03:FS01:Connecting to 155.247.164.214:8080
07:18:47:WU03:FS01:Upload 4.98%
07:18:53:WU03:FS01:Upload 10.41%
07:18:59:WU03:FS01:Upload 15.95%
07:19:05:WU03:FS01:Upload 21.38%
07:19:11:WU03:FS01:Upload 26.70%
07:19:17:WU03:FS01:Upload 32.13%
07:19:23:WU03:FS01:Upload 37.45%
07:19:29:WU03:FS01:Upload 42.54%
07:19:35:WU03:FS01:Upload 47.86%
07:19:41:WU03:FS01:Upload 52.73%
07:19:47:WU03:FS01:Upload 58.04%
07:19:53:WU03:FS01:Upload 63.36%
07:19:59:WU03:FS01:Upload 68.79%
07:20:05:WU03:FS01:Upload 74.11%
07:20:11:WU03:FS01:Upload 79.54%
07:20:17:WU03:FS01:Upload 84.97%
07:20:23:WU03:FS01:Upload 90.29%
07:20:29:WU03:FS01:Upload 95.72%
07:20:34:WU03:FS01:Upload complete
07:20:34:WU03:FS01:Server responded WORK_ACK (400)
07:20:34:WU03:FS01:Final credit estimate, 16615.00 points
I guess there could be a delay, but other WUs I've submitted since then show up correctly (e.g. PRCG 11759 (0,790,16)). Perhaps they allowed submissions that did not match the WS assignment db, so they can handle them manually at some point? In any case, I'm happy that it seems to have been mostly resolved :)
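If you want to compare upload speeds before and after the fix, here is a rough Python sketch that measures each completed upload from its last "Uploading ... to ..." line to "Upload complete". The line formats are taken from the log above; the function name `upload_stats` is my own. Log timestamps carry no date, so the sketch assumes every upload finishes within 24 hours.

```python
import re

UPLOADING_RE = re.compile(
    r"^(\d{2}):(\d{2}):(\d{2}):(WU\d+):(FS\d+):Uploading ([\d.]+)MiB to (\S+)"
)
COMPLETE_RE = re.compile(
    r"^(\d{2}):(\d{2}):(\d{2}):(WU\d+):(FS\d+):Upload complete"
)


def _seconds(h, m, s):
    """Convert HH, MM, SS strings to seconds since midnight."""
    return int(h) * 3600 + int(m) * 60 + int(s)


def upload_stats(lines):
    """Return [(wu, host, elapsed_seconds, mib_per_second)] per completed upload."""
    started = {}  # (wu, fs) -> (start seconds, size in MiB, host)
    results = []
    for line in lines:
        line = line.strip()
        m = UPLOADING_RE.match(line)
        if m:
            h, mi, s, wu, fs, size, host = m.groups()
            # A retry to another server overwrites the earlier start time.
            started[(wu, fs)] = (_seconds(h, mi, s), float(size), host)
            continue
        m = COMPLETE_RE.match(line)
        if m:
            h, mi, s, wu, fs = m.groups()
            if (wu, fs) in started:
                t0, size, host = started.pop((wu, fs))
                # Modulo handles an upload that crosses midnight.
                elapsed = (_seconds(h, mi, s) - t0) % 86400
                rate = size / elapsed if elapsed else None
                results.append((wu, host, elapsed, rate))
    return results
```

For the log above this measures the transfer to .214 only, since the attempt against .213 never got past connecting.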