failure to upload 206.223.170.146

Posted: Fri Nov 20, 2020 7:19 pm
by verdeva
Things have been going great for the past couple of weeks, but this just cropped up today:

18:47:17:WU01:FS00:0xa7:Completed 242500 out of 250000 steps (97%)
18:48:04:WU01:FS00:0xa7:Completed 245000 out of 250000 steps (98%)
18:48:50:WU01:FS00:0xa7:Completed 247500 out of 250000 steps (99%)
18:49:37:WU01:FS00:0xa7:Completed 250000 out of 250000 steps (100%)
18:49:38:WU01:FS00:0xa7:Saving result file ../logfile_01.txt
18:49:38:WU01:FS00:0xa7:Saving result file frame208.trr
18:49:38:WU01:FS00:0xa7:Saving result file frame208.xtc
18:49:38:WU01:FS00:0xa7:Saving result file md.log
18:49:38:WU01:FS00:0xa7:Saving result file science.log
18:49:38:WU01:FS00:0xa7:Folding@home Core Shutdown: FINISHED_UNIT
18:49:38:WU01:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
18:49:38:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:14255 run:0 clone:2047 gen:208 core:0xa7 unit:0x000000f3cedfaa9200000000000007ff
18:49:38:WU01:FS00:Uploading 2.85MiB to 206.223.170.146
18:49:38:WU01:FS00:Connecting to 206.223.170.146:8080
18:51:48:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
18:51:48:WU01:FS00:Connecting to 206.223.170.146:80
18:53:59:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to connect to 206.223.170.146:80: Connection timed out
18:53:59:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:14255 run:0 clone:2047 gen:208 core:0xa7 unit:0x000000f3cedfaa9200000000000007ff
18:53:59:WU01:FS00:Uploading 2.85MiB to 206.223.170.146
18:53:59:WU01:FS00:Connecting to 206.223.170.146:8080
18:56:10:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
18:56:10:WU01:FS00:Connecting to 206.223.170.146:80
18:58:21:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to connect to 206.223.170.146:80: Connection timed out
18:58:21:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:14255 run:0 clone:2047 gen:208 core:0xa7 unit:0x000000f3cedfaa9200000000000007ff
18:58:21:WU01:FS00:Uploading 2.85MiB to 206.223.170.146
18:58:21:WU01:FS00:Connecting to 206.223.170.146:8080
19:00:32:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
19:00:32:WU01:FS00:Connecting to 206.223.170.146:80
19:02:43:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to connect to 206.223.170.146:80: Connection timed out
19:02:44:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:14255 run:0 clone:2047 gen:208 core:0xa7 unit:0x000000f3cedfaa9200000000000007ff
19:02:44:WU01:FS00:Uploading 2.85MiB to 206.223.170.146
19:02:44:WU01:FS00:Connecting to 206.223.170.146:8080
19:04:54:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
19:04:54:WU01:FS00:Connecting to 206.223.170.146:80
19:07:05:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to connect to 206.223.170.146:80: Connection timed out
19:07:06:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:14255 run:0 clone:2047 gen:208 core:0xa7 unit:0x000000f3cedfaa9200000000000007ff
19:07:06:WU01:FS00:Uploading 2.85MiB to 206.223.170.146
19:07:06:WU01:FS00:Connecting to 206.223.170.146:8080
19:09:16:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
19:09:16:WU01:FS00:Connecting to 206.223.170.146:80
19:11:28:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to connect to 206.223.170.146:80: Connection timed out
19:11:28:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:14255 run:0 clone:2047 gen:208 core:0xa7 unit:0x000000f3cedfaa9200000000000007ff
19:11:28:WU01:FS00:Uploading 2.85MiB to 206.223.170.146
19:11:28:WU01:FS00:Connecting to 206.223.170.146:8080
19:13:39:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
19:13:39:WU01:FS00:Connecting to 206.223.170.146:80
19:15:50:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to connect to 206.223.170.146:80: Connection timed out
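
For anyone who wants to count how often the client has given up on a server, here is a minimal Python sketch that scans log text for the "Failed to send results" warning shown above. The function name summarize_failed_uploads and the regular expression are my own illustration, written against the line format in this log only; other client versions may word the message differently.

```python
import re

# Matches the client's "Failed to send results" warning as it appears in the
# log above, capturing the timestamp, work unit, slot, host, and port.
FAILED_SEND = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):WARNING:(?P<wu>WU\d+):(?P<slot>FS\d+):"
    r"Exception: Failed to send results to work server: "
    r"Failed to connect to (?P<host>[\d.]+):(?P<port>\d+)"
)


def summarize_failed_uploads(log_text):
    """Return {(wu, host): [timestamps]} for every failed upload attempt."""
    failures = {}
    for line in log_text.splitlines():
        match = FAILED_SEND.match(line.strip())
        if match:
            key = (match["wu"], match["host"])
            failures.setdefault(key, []).append(match["time"])
    return failures


# Four lines copied from the log above: two port fallbacks, two final failures.
sample = """\
18:51:48:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
18:53:59:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to connect to 206.223.170.146:80: Connection timed out
18:56:10:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
18:58:21:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to connect to 206.223.170.146:80: Connection timed out
"""

for (wu, host), times in summarize_failed_uploads(sample).items():
    print(f"{wu} -> {host}: {len(times)} failed attempts, last at {times[-1]}")
```

Only the final "Failed to send results" line of each cycle is counted, so the port 8080 fallback warning does not inflate the number of attempts.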

Re: failure to upload 206.223.170.146

Posted: Fri Nov 20, 2020 7:24 pm
by Neil-B
That server is currently showing as down on the server status page. The client will periodically retry uploading the WU results until the server is up again and the upload can be received.

Re: failure to upload 206.223.170.146

Posted: Sat Nov 21, 2020 2:37 am
by Badsinger
It's been 24 attempts now to upload for me. Out of curiosity, if a second server fails while the GPU continues to fold, will the client save both tasks, or is one lost?

Re: failure to upload 206.223.170.146

Posted: Sat Nov 21, 2020 9:49 am
by Neil-B
It will save both WUs. The client will keep retrying until either the upload succeeds or the expiration deadline is reached, at which point it will dump the WU. Hopefully the server will be up and receiving before then, but that depends on why the server is down, and it is the weekend, so fixes tend to be slower.
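
The retry-until-expiration behaviour described above can be sketched roughly as follows. This is a simulation under stated assumptions, not the client's actual code: retry_until_expired and try_upload are hypothetical names, the fixed five-minute wait is made up (the real client's retry interval is not given in this thread), and the clock is simulated rather than slept on.

```python
from datetime import datetime, timedelta


def retry_until_expired(try_upload, expires_at, now, wait=timedelta(minutes=5)):
    """Simulate retrying an upload until it succeeds or the WU expires.

    try_upload is a hypothetical callable that takes the current time and
    returns True on success. Returns ("uploaded", time) or ("dumped", time).
    """
    while now < expires_at:
        if try_upload(now):
            return "uploaded", now
        now += wait  # simulated clock; a real client would sleep here
    return "dumped", expires_at


finished = datetime(2020, 11, 20, 18, 49)
server_back = datetime(2020, 11, 22, 12, 0)  # made-up recovery time


def server_is_up(t):
    # Stand-in for a real upload attempt: succeeds once the server is back.
    return t >= server_back


# A WU with a late expiration survives the outage; one with an early
# expiration is dumped before the server returns.
late = retry_until_expired(server_is_up, datetime(2020, 11, 27, 18, 17), finished)
early = retry_until_expired(server_is_up, datetime(2020, 11, 21, 18, 26), finished)
print(late[0], early[0])
```

The two expiration times are the ones quoted by posters in this thread; the recovery time is invented purely to show both outcomes.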

Re: failure to upload 206.223.170.146

Posted: Sat Nov 21, 2020 2:49 pm
by psaam0001
I also have a WU needing to be returned as completed, but I'm not going to sweat it.

Stay calm... Keep folding!!!

Paul

Re: failure to upload 206.223.170.146

Posted: Sat Nov 21, 2020 2:54 pm
by Neil-B
I have seen a post on Discord which confirms the server is under maintenance. It is simply a matter of waiting; there is nothing you can do to get it to upload until the server is back up.

Re: failure to upload 206.223.170.146

Posted: Sat Nov 21, 2020 3:29 pm
by MoelTryfan
They'd better get a move on. My WU expires at 1826 today.

Re: failure to upload 206.223.170.146

Posted: Sat Nov 21, 2020 5:22 pm
by Gnomuz
Same here with a WU (Project: 14254 (Run 0, Clone 2459, Gen 172)) finished yesterday at 11/20/2020 19:11. F@H has been retrying to upload the results for more than 23 hours now.
Folding keeps on running and uploading to other servers, so let's be patient. But I admit it's easier for me as the expiration is on 11/27/2020 18:17, which gives them more time to fix the issue than MoelTryfan :wink:

Re: failure to upload 206.223.170.146

Posted: Sat Nov 21, 2020 5:51 pm
by ViTe
I have the same issue, and my WU is pretty close to timing out. I think having a different collection server should be mandatory.

Re: failure to upload 206.223.170.146

Posted: Sat Nov 21, 2020 6:19 pm
by Neil-B
It doesn't always work that way, and it isn't always possible or actually the best thing for the science. The client doesn't dump the WU when it reaches the timeout; it dumps it at the expiration deadline. A CS can sometimes cause more issues than benefits.
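
To make the timeout/expiration distinction concrete, here is a small Python sketch. It encodes only what is said above (the client dumps the WU at the expiration deadline, not at the timeout); the function name wu_status and the example timeout are hypothetical, and the expiration is the one Gnomuz quoted.

```python
from datetime import datetime


def wu_status(now, timeout, expiration):
    """Classify a finished-but-unsent WU relative to its two deadlines."""
    if now >= expiration:
        return "expired: client dumps the WU"
    if now >= timeout:
        return "past timeout: still kept and retried"
    return "within timeout: kept and retried"


timeout = datetime(2020, 11, 22, 18, 17)     # made-up value for illustration
expiration = datetime(2020, 11, 27, 18, 17)  # expiration quoted in this thread

for now in (datetime(2020, 11, 21), datetime(2020, 11, 24), datetime(2020, 11, 28)):
    print(now.date(), "->", wu_status(now, timeout, expiration))
```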

Re: failure to upload 206.223.170.146

Posted: Sat Nov 21, 2020 8:33 pm
by PFM
Any ETA on this?
Uploads are failing for me too on this server.

Re: failure to upload 206.223.170.146

Posted: Sat Nov 21, 2020 8:52 pm
by Neil-B
Unfortunately FAH doesn't tend to do ETAs. Rest assured they will get it up as quickly as they can, but it is the weekend, and that does slow things down, especially with the current pandemic. The researchers will be even more gutted than the folders, but sometimes these things just happen.

Re: failure to upload 206.223.170.146

Posted: Sat Nov 21, 2020 10:33 pm
by tmccarty729
Stuck sending to 170.146 also.

Re: failure to upload 206.223.170.146

Posted: Sun Nov 22, 2020 9:39 am
by Neil-B
Server still under maintenance

Re: failure to upload 206.223.170.146

Posted: Sun Nov 22, 2020 3:35 pm
by MoelTryfan
Still down at 1535 UTC