Send Errors - 155.247.164.213 & .214

Re: Send Errors - 155.247.164.213 & .214

Postby sswilson » Wed Mar 25, 2020 8:51 pm

alxbelu wrote:
sswilson wrote:My .214 unit cleared... haven't gone back through the log yet to see if it uploaded or timed out.....

edit: Pretty sure I tracked it down in the log.... it took a total of 36 minutes to upload the results in .15 - .50% increments which would suggest the server was being absolutely hammered with uploads..... :)

Took up a total of 276 individual log entries to complete.

High praise to the techs who finally got this resolved. :)


What was the PRCG? I checked the one you mentioned in this thread (11758 (0, 2135, 0)), which seems to still not have been successfully uploaded: https://apps.foldingathome.org/wu#proje ... 2135&gen=0


I'll have a closer look after supper (and maybe upload the whole log file).... it didn't seem to be logged in the same fashion as it normally would (stopped listing the attempt to connect to .214 and then only referred to the WU for several attempts before it seemed to upload).

I don't think there's a way for me to grab the actual WU now that it's no longer showing on the client?
sswilson
 
Posts: 90
Joined: Mon Dec 17, 2007 1:34 am
Location: Moncton, New Brunswick, Canada

Re: Send Errors - 155.247.164.213 & .214

Postby alxbelu » Wed Mar 25, 2020 9:01 pm

AFAIK the log should read something like this:
Code:
19:44:09:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:11762 run:0 clone:9927 gen:12 core:0x22 unit:0x0000001080fccb0a5e7113ba81c3f381
19:44:09:WU00:FS01:Uploading 33.02MiB to 128.252.203.10
19:44:09:WU00:FS01:Connecting to 128.252.203.10:8080
19:44:33:WU00:FS01:Upload 1.14%
19:44:40:WU00:FS01:Upload 29.72%
19:44:46:WU00:FS01:Upload complete
19:44:46:WU00:FS01:Server responded WORK_ACK (400)


In this case, the PRCG (project, run, clone, gen) for folding slot FS01 and work unit WU00 is 11762 (0, 9927, 12).
The WORK_ACK line carries the same WU00 and FS01 tags, so I know which specific PRCG was successfully uploaded and to which server.
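
If you'd rather not pick the PRCG out by eye, here is a minimal Python sketch that pulls it from a "Sending unit results" line. It assumes the log layout matches the excerpt above; the names `SEND_RE` and `parse_prcg` are made up for illustration and are not part of FAHClient.

```python
import re

# Assumed layout, taken from the log excerpt above:
# HH:MM:SS:WUxx:FSxx:Sending unit results: ... project:P run:R clone:C gen:G ...
SEND_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):(?P<wu>WU\d+):(?P<fs>FS\d+):"
    r"Sending unit results:.*?"
    r"project:(?P<project>\d+) run:(?P<run>\d+) "
    r"clone:(?P<clone>\d+) gen:(?P<gen>\d+)"
)


def parse_prcg(line):
    """Return (wu, fs, (project, run, clone, gen)), or None if the line does not match."""
    m = SEND_RE.match(line)
    if m is None:
        return None
    prcg = tuple(int(m.group(k)) for k in ("project", "run", "clone", "gen"))
    return m.group("wu"), m.group("fs"), prcg


line = (
    "19:44:09:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR "
    "project:11762 run:0 clone:9927 gen:12 core:0x22 "
    "unit:0x0000001080fccb0a5e7113ba81c3f381"
)
print(parse_prcg(line))  # ('WU00', 'FS01', (11762, 0, 9927, 12))
```

Any other log line (for example "Upload complete") simply returns None, so you can run this over a whole log and keep only the hits.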
Official F@H Twitter (frequently updated): https://twitter.com/foldingathome
Official F@H Facebook: https://www.facebook.com/Foldinghome-136059519794607/

(I'm not affiliated with the F@H Team, just promoting these channels for official updates)
alxbelu
 
Posts: 109
Joined: Sat Mar 14, 2020 7:28 pm

Re: Send Errors - 155.247.164.213 & .214

Postby sswilson » Wed Mar 25, 2020 9:45 pm

Code:
15:15:20:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:11758 run:0 clone:2135 gen:0 core:0x22 unit:0x000000019bf7a4d55e6d7715c140c08d
15:15:20:WU01:FS01:Uploading 55.24MiB to 155.247.164.213
15:15:20:WU01:FS01:Connecting to 155.247.164.213:8080
15:15:20:WARNING:WU01:FS01:Exception: Failed to send results to work server: Transfer failed
15:15:20:WU01:FS01:Trying to send results to collection server
15:15:20:WU01:FS01:Uploading 55.24MiB to 155.247.164.214
15:15:20:WU01:FS01:Connecting to 155.247.164.214:8080
15:15:20:ERROR:WU01:FS01:Exception: Transfer failed


If I'm reading this right, this was one of the early attempts to upload it in this log (kinda confused as to why it's listing .213 first and then switching to .214 before it claims the transfer failed).

Code:
17:15:43:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:11758 run:0 clone:2135 gen:0 core:0x22 unit:0x000000019bf7a4d55e6d7715c140c08d
17:15:43:WU01:FS01:Uploading 55.24MiB to 155.247.164.213
17:15:43:WU01:FS01:Connecting to 155.247.164.213:8080
17:15:43:WARNING:WU01:FS01:Exception: Failed to send results to work server: Transfer failed
17:15:43:WU01:FS01:Trying to send results to collection server
17:15:43:WU01:FS01:Uploading 55.24MiB to 155.247.164.214
17:15:43:WU01:FS01:Connecting to 155.247.164.214:8080
17:15:43:ERROR:WU01:FS01:Exception: Transfer failed


This is the last reference to uploading to .214 or .213 that I'm finding in the log.

Oops.... yeah found this....

Code:
18:16:00:WARNING:WU01:FS01:Past final deadline 2020-03-24T18:15:59Z, dumping
18:16:00:WU01:FS01:Cleaning up


I guess that's where it dumped it, and I was looking at the wrong WU (one that took forever to upload), thinking it was the orphaned one.

Either way... it's gone. :)
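
For anyone else trying to work out which WU actually uploaded and which one got dumped, here is a rough Python sketch that scans log lines and records the last outcome seen per WU/slot tag. It only assumes the line layout visible in the excerpts in this thread; `LINE_RE` and `wu_outcomes` are hypothetical names, and since the client reuses WU numbers, a long log should be scanned in pieces.

```python
import re

# Assumed layout from the excerpts in this thread:
# HH:MM:SS:[WARNING:|ERROR:]WUxx:FSxx:message
LINE_RE = re.compile(
    r"^\d{2}:\d{2}:\d{2}:(?:(?:WARNING|ERROR):)?"
    r"(?P<wu>WU\d+):(?P<fs>FS\d+):(?P<msg>.*)$"
)


def wu_outcomes(lines):
    """Map each (WU, FS) tag to 'uploaded', 'dumped', or 'pending'."""
    outcomes = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # not a per-WU line
        key = (m.group("wu"), m.group("fs"))
        msg = m.group("msg")
        if "Server responded WORK_ACK" in msg:
            outcomes[key] = "uploaded"
        elif msg.startswith("Past final deadline"):
            outcomes[key] = "dumped"
        else:
            # Seen, but no final outcome yet; never overwrites a final one.
            outcomes.setdefault(key, "pending")
    return outcomes


sample = [
    "15:15:20:WU01:FS01:Uploading 55.24MiB to 155.247.164.213",
    "15:15:20:WARNING:WU01:FS01:Exception: Failed to send results to work server: Transfer failed",
    "18:16:00:WARNING:WU01:FS01:Past final deadline 2020-03-24T18:15:59Z, dumping",
    "18:16:00:WU01:FS01:Cleaning up",
    "19:44:46:WU00:FS01:Server responded WORK_ACK (400)",
]
print(wu_outcomes(sample))
```

With the sample lines taken from this thread, WU01 comes out as dumped and WU00 as uploaded, which matches what the logs above show.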
sswilson
 
Posts: 90
Joined: Mon Dec 17, 2007 1:34 am
Location: Moncton, New Brunswick, Canada

Re: Send Errors - 155.247.164.213 & .214

Postby evilgizmo2352 » Wed Mar 25, 2020 10:30 pm

Did a little digging and found that both 155.247.164.213 and .214 are set to ASSIGN, or so it says on the F@H server list page. I have no idea how to let them know. If anybody knows how to contact voelz at Temple edu, please do.
evilgizmo2352
 
Posts: 1
Joined: Wed Mar 25, 2020 10:16 pm

Re: Send Errors - 155.247.164.213 & .214

Postby EHoops » Wed Mar 25, 2020 11:52 pm

I am also having this issue. Is there anything to do besides keep retrying the upload? It also seems like this is affecting multiple people. Is there a less busy time for sending? I could pause until then.
EHoops
 
Posts: 1
Joined: Wed Mar 25, 2020 11:49 pm

Re: Send Errors - 155.247.164.213 & .214

Postby _r2w_ben » Thu Mar 26, 2020 12:38 am

davidcoton wrote:Unreturned WUs should not be reassigned until they expire. Faulty WUs that are returned are reassigned, up to (IIRC) three failures.
If there is clear evidence that unreturned WUs are being reassigned before they expire, we need to collect enough info to determine the extent/pattern of this problem and then get the team further involved.

Work units can be reassigned when the timeout is reached in order to keep the project moving forward. If the first assigned machine finishes after the timeout, but before the expiry, then it will still be rewarded points. The second assigned machine will also be rewarded points upon completion.

Whichever machine returns its result first triggers the creation of the next work unit, which has the same PRC but Gen + 1.
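
To make the Gen + 1 step concrete, here is a tiny Python sketch. It is only an illustration of the numbering described above, not how the work server is actually implemented; the `PRCG` class and `next_gen` method are hypothetical names.

```python
from typing import NamedTuple


class PRCG(NamedTuple):
    """Project, run, clone, gen: the four numbers that identify a work unit."""
    project: int
    run: int
    clone: int
    gen: int

    def next_gen(self) -> "PRCG":
        # Same project, run and clone; only the generation advances.
        return self._replace(gen=self.gen + 1)


returned = PRCG(project=11758, run=0, clone=2135, gen=0)
print(returned.next_gen())  # PRCG(project=11758, run=0, clone=2135, gen=1)
```

So for a returned 11758 (0, 2135, 0), the follow-on work unit would be 11758 (0, 2135, 1).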
_r2w_ben
 
Posts: 277
Joined: Wed Apr 23, 2008 4:11 pm

Re: Send Errors - 155.247.164.213 & .214

Postby AEM » Thu Mar 26, 2020 8:12 am

I just saw mine disappear. I think it finally got sent.
AEM
 
Posts: 6
Joined: Sun Mar 15, 2020 6:57 pm

Re: Send Errors - 155.247.164.213 & .214

Postby Darth_Peter_dualxeon » Thu Mar 26, 2020 9:01 am

Yes, it was finally fixed. My two work units finally uploaded today at 06:45 (in FAHClient's timezone).
Darth_Peter_dualxeon
 
Posts: 46
Joined: Fri Mar 20, 2020 4:13 am

Re: Send Errors - 155.247.164.213 & .214

Postby qk7b » Thu Mar 26, 2020 9:13 am

qk7b wrote:Same error here, just finished a WU for 11758 and got exact the same output
155.247.164.214 seems in real pain, good luck guys.

As said earlier by alxbelu, looks like we're more than one on this specific PRCG (I don't quite understand how it works for now sorry)

Mine is PRCG 11758 (0, 526, 0)

Looks like there are already 2 Faulty WU completed, assigned on 2020-03-23 where mine was on 2020-03-20 .
At this point I don't know if I should let it go, or if I should clear it in some way.

The expiration date is on 2020-03-28 anyway, so I can still wait till this date.


EDIT: Added info on PRCG


Same for me here, everything was sent during the night!

Great work!
qk7b
 
Posts: 4
Joined: Sat Mar 21, 2020 1:12 pm
Location: Lille, France

Re: Send Errors - 155.247.164.213 & .214

Postby sc00p » Thu Mar 26, 2020 9:27 am

Oh, I've had one WU stuck on my client for a while now... I checked the WU status for PRCG 11758 (0,2822,0) and it says "Not found".

Should I conclude that the specific project has been omitted, or what?
sc00p
 
Posts: 4
Joined: Sat Jan 26, 2008 10:04 pm

Re: Send Errors - 155.247.164.213 & .214

Postby scerbera » Thu Mar 26, 2020 10:29 am

Mine are being sent now, great news!
scerbera
 
Posts: 34
Joined: Fri Mar 13, 2020 12:38 am

Re: Send Errors - 155.247.164.213 & .214

Postby vangli » Thu Mar 26, 2020 11:12 am

My three are also uploading. I see that the software version on the servers has changed from 9.5.6 to 9.6.
Regards
Bent Vangli, Oslo, Norway
vangli
 
Posts: 12
Joined: Thu Mar 19, 2020 11:35 am

Re: Send Errors - 155.247.164.213 & .214

Postby pachydermus » Thu Mar 26, 2020 12:31 pm

Uploaded here. Now my machines have instead been waiting 12 hours for a new work unit to process. Fantastic. :x
pachydermus
 
Posts: 17
Joined: Tue Mar 24, 2020 12:06 pm

Re: Send Errors - 155.247.164.213 & .214

Postby sc00p » Thu Mar 26, 2020 1:44 pm

vangli wrote:See that software version on servers has changed from 9.5.6 to 9.6.

Nice find there^ :)


And my WU has gone too it seems...
sc00p
 
Posts: 4
Joined: Sat Jan 26, 2008 10:04 pm

Re: Send Errors - 155.247.164.213 & .214

Postby alxbelu » Thu Mar 26, 2020 6:07 pm

sc00p wrote:Oh I got one WU stuck on my client now for a while... checked the WU status PRCG 11758 (0,2822,0) = "Not found".

Should I come to a conclusion the specific project has been omitted or what?


Not sure, but it looks the same for the one WU I had left: https://apps.foldingathome.org/wu#proje ... 1460&gen=0

Code:
07:17:58:WU03:FS01:Sending unit results: id:03 state:SEND error:NO_ERROR project:11758 run:0 clone:1460 gen:0 core:0x22 unit:0x000000069bf7a4d55e6d77136df445dd
07:17:58:WU03:FS01:Uploading 55.24MiB to 155.247.164.213
07:17:58:WU03:FS01:Connecting to 155.247.164.213:8080
07:18:20:WARNING:WU03:FS01:WorkServer connection failed on port 8080 trying 80
07:18:20:WU03:FS01:Connecting to 155.247.164.213:80
07:18:41:WARNING:WU03:FS01:Exception: Failed to send results to work server: Failed to connect to 155.247.164.213:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
07:18:41:WU03:FS01:Trying to send results to collection server
07:18:41:WU03:FS01:Uploading 55.24MiB to 155.247.164.214
07:18:41:WU03:FS01:Connecting to 155.247.164.214:8080
07:18:47:WU03:FS01:Upload 4.98%
07:18:53:WU03:FS01:Upload 10.41%
07:18:59:WU03:FS01:Upload 15.95%
07:19:05:WU03:FS01:Upload 21.38%
07:19:11:WU03:FS01:Upload 26.70%
07:19:17:WU03:FS01:Upload 32.13%
07:19:23:WU03:FS01:Upload 37.45%
07:19:29:WU03:FS01:Upload 42.54%
07:19:35:WU03:FS01:Upload 47.86%
07:19:41:WU03:FS01:Upload 52.73%
07:19:47:WU03:FS01:Upload 58.04%
07:19:53:WU03:FS01:Upload 63.36%
07:19:59:WU03:FS01:Upload 68.79%
07:20:05:WU03:FS01:Upload 74.11%
07:20:11:WU03:FS01:Upload 79.54%
07:20:17:WU03:FS01:Upload 84.97%
07:20:23:WU03:FS01:Upload 90.29%
07:20:29:WU03:FS01:Upload 95.72%
07:20:34:WU03:FS01:Upload complete
07:20:34:WU03:FS01:Server responded WORK_ACK (400)
07:20:34:WU03:FS01:Final credit estimate, 16615.00 points


I guess there could be a delay, but other WUs I've submitted since show up correctly (e.g. PRCG 11759 (0,790,16)). Perhaps they allowed submissions that did not match the WS assignment db, so they can be handled manually at some point? In any case, I'm happy that it seems to have been mostly resolved :)
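
As a side note, the timestamps in the log above are enough for a rough upload-rate estimate, which may help when comparing against the very slow uploads reported earlier in the thread. This is a minimal Python sketch: `upload_rate_mib_s` is a made-up helper, and it assumes both timestamps fall on the same day.

```python
from datetime import datetime


def upload_rate_mib_s(size_mib, start, end):
    """Average MiB/s between two HH:MM:SS log timestamps on the same day."""
    fmt = "%H:%M:%S"
    elapsed = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    seconds = elapsed.total_seconds()
    if seconds <= 0:
        raise ValueError("end must be later than start (same-day timestamps assumed)")
    return size_mib / seconds


# 55.24MiB, from "Connecting to 155.247.164.214:8080" to "Upload complete"
rate = upload_rate_mib_s(55.24, "07:18:41", "07:20:34")
print(f"{rate:.2f} MiB/s")  # 0.49 MiB/s
```

That works out to 113 seconds for the whole transfer, so the collection server was clearly responsive again by that point.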
alxbelu
 
Posts: 109
Joined: Sat Mar 14, 2020 7:28 pm
