Re: 155.247.166.220 downloads stalled

Posted: Wed Jul 01, 2020 4:51 pm
by Ichbin3
And again ...

16:32:10:WU01:FS00:Connecting to 155.247.166.220:8080
16:32:10:WU01:FS00:Downloading 3.75MiB
16:32:17:WU01:FS00:Download 5.01%
16:32:26:WU01:FS00:Download 6.68%
16:32:47:WU01:FS00:Download 8.34%
16:33:05:WU01:FS00:Download 11.68%
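
For anyone who wants to put a number on how slow this is, here is a minimal sketch that estimates throughput from the progress lines in the log above. The function names (`parse_progress`, `average_rate_kib_s`, `eta_minutes`) are made up for illustration and are not part of FAHClient. The sketch assumes the `HH:MM:SS:...:Download N%` line layout shown in the log and timestamps that do not cross midnight.

```python
from datetime import datetime

# Progress lines copied from the log above; the total size comes from
# the "Downloading 3.75MiB" line.
LOG = """\
16:32:17:WU01:FS00:Download 5.01%
16:32:26:WU01:FS00:Download 6.68%
16:32:47:WU01:FS00:Download 8.34%
16:33:05:WU01:FS00:Download 11.68%
"""
TOTAL_MIB = 3.75


def parse_progress(log):
    """Return (seconds since midnight, percent) for each 'Download N%' line."""
    samples = []
    for line in log.splitlines():
        if "Download " not in line or not line.endswith("%"):
            continue  # skip anything that is not a progress line
        stamp = datetime.strptime(line[:8], "%H:%M:%S")
        seconds = stamp.hour * 3600 + stamp.minute * 60 + stamp.second
        percent = float(line.rsplit(" ", 1)[1].rstrip("%"))
        samples.append((seconds, percent))
    return samples


def average_rate_kib_s(samples, total_mib):
    """Average throughput between the first and last sample, in KiB/s."""
    (t0, p0), (t1, p1) = samples[0], samples[-1]
    return (p1 - p0) / 100 * total_mib * 1024 / (t1 - t0)


def eta_minutes(samples):
    """Minutes left if the download continued at the same average pace."""
    (t0, p0), (t1, p1) = samples[0], samples[-1]
    percent_per_second = (p1 - p0) / (t1 - t0)
    return (100 - p1) / percent_per_second / 60


samples = parse_progress(LOG)
rate = average_rate_kib_s(samples, TOTAL_MIB)
eta = eta_minutes(samples)
print(f"{rate:.1f} KiB/s, about {eta:.0f} min remaining at this pace")
```

For these four lines the average works out to roughly 5 KiB/s. The estimate only describes the window covered by the samples; a download that stalls completely afterwards will of course never reach the projected finish.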

Re: 155.247.166.220 downloads stalled

Posted: Wed Jul 01, 2020 4:53 pm
by parkut
Stalled download, almost 3 hours now. Linux Client Version: 7.6.13, and Core Version: 0.0.11

13:43:07:WU02:FS02:Connecting to assign1.foldingathome.org:80
13:43:07:WU02:FS02:Assigned to work server 155.247.166.220
13:43:07:WU02:FS02:Requesting new work unit for slot 02: READY gpu:0:GM206 [GeForce GTX 960] 2308 from 155.247.166.220
13:43:07:WU02:FS02:Connecting to 155.247.166.220:8080
13:43:08:WU02:FS02:Downloading 1.39MiB

Re: 155.247.166.220 downloads stalled

Posted: Wed Jul 01, 2020 5:04 pm
by _r2w_ben
On Windows, you can use TCPView to kill the stalled download. Find FAHClient.exe in the list, right-click the connection that has vav4.ocis.temple.edu as the Remote Address, and then click Close Connection.

Re: 155.247.166.220 downloads stalled

Posted: Wed Jul 01, 2020 7:12 pm
by Ichbin3
I'm running Linux.
Btw - the next stalled download happened.
3 MB would have needed 1 hour to download.
I'm starting to hate that server.

Re: 155.247.166.220 downloads stalled

Posted: Wed Jul 01, 2020 7:40 pm
by HaloJones
Just applied a block on that IP to my firewall. In theory, the clients that are assigned to that server will fail to connect rather than connect but fail to download.
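
A minimal sketch of what such a block could look like on a Linux box, assuming `iptables` is the firewall in use (the post does not say which firewall was actually used). The helper `build_block_command` is hypothetical; it only builds the command and does not run it, since applying the rule needs root.

```python
import ipaddress


def build_block_command(server_ip, action="REJECT"):
    """Return an iptables argv list that blocks outbound traffic to server_ip."""
    ip = ipaddress.ip_address(server_ip)  # raises ValueError on a malformed address
    if ip.version != 4:
        raise ValueError("this sketch only handles IPv4; ip6tables is needed for IPv6")
    if action not in ("REJECT", "DROP"):
        raise ValueError("action must be REJECT or DROP")
    # Append a rule to the OUTPUT chain matching the destination address.
    return ["iptables", "-A", "OUTPUT", "-d", str(ip), "-j", action]


command = build_block_command("155.247.166.220")
print(" ".join(command))  # prints: iptables -A OUTPUT -d 155.247.166.220 -j REJECT
```

REJECT is the default here because it makes the connection attempt fail right away, which matches the goal of failing to connect instead of hanging mid-download. DROP would silently discard the packets, so the client would have to wait for its own connection timeout.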

Re: 155.247.166.220 downloads stalled

Posted: Wed Jul 01, 2020 8:55 pm
by scott@bjorn3d
These servers are killing me doing work. Why don't they ever fix them?

Re: 155.247.166.220 downloads stalled

Posted: Thu Jul 02, 2020 7:34 am
by HaloJones
The server needs a frequently scheduled reboot. It's a simple cron job.

Re: 155.247.166.220 downloads stalled

Posted: Thu Jul 02, 2020 8:36 am
by Sparkly
And another 3, so I am seriously considering blocking this server in my firewall permanently.

Re: 155.247.166.220 downloads stalled

Posted: Thu Jul 02, 2020 9:10 am
by Ichbin3
I've done that now, too.

Re: 155.247.166.220 downloads stalled

Posted: Thu Jul 02, 2020 8:52 pm
by rickoic
I have been having a small problem with hung downloads from this server over the past week: 1-3 times a day a reboot was required. I had been living with it as a minor problem. This morning I woke to find that 3 of 4 PCs required a reboot, with a total of 4 GPUs hung. I just checked my PCs again and had 1 hung, so I rebooted. It got the 220 server again and hung while downloading 3.75 MiB. It sat for a few minutes with no progress, so I rebooted again, and again I was assigned to 220. It got to 1.8% downloaded this time and then hung. That made a total of 6 reboots, and the final reboot caused the other 3 GPUs on the board to throw their WUs and download new ones. But at least the problem child got sent to another server and is working again.

Re: 155.247.166.220 downloads stalled

Posted: Thu Jul 02, 2020 9:12 pm
by HaloJones
How is this not being sorted despite all this noise????

Re: 155.247.166.220 downloads stalled

Posted: Thu Jul 02, 2020 9:28 pm
by vvoelz
Sorry for the ongoing problems with vav4. We recently put up more GPU WUs and the connection issue has worsened. I just did a hard reboot of this machine, which we will continue to do every day from now on. It's unclear whether this will actually do the trick, so if you observe ANY amelioration of the problem, please post.

I have said before that we are trying to retire this server, and we still are. Hopefully our new hardware will be installed in 1-2 months.

Re: 155.247.166.220 downloads stalled

Posted: Fri Jul 03, 2020 12:01 am
by rickoic
Just had another failure to download from 220.

Re: 155.247.166.220 downloads stalled

Posted: Fri Jul 03, 2020 1:16 am
by vvoelz
UGH - if we didn't have vital projects being served from vav4 I'd shut it down. (The server code does not let us easily migrate projects from one server to another). There must be some load balancing issues (collection server relays?) that are beyond our control, but perhaps we can set some better parameters. We'll continue to push on this.

Re: 155.247.166.220 downloads stalled

Posted: Fri Jul 03, 2020 8:43 am
by Sparkly
vvoelz wrote: UGH - if we didn't have vital projects being served from vav4 I'd shut it down. (The server code does not let us easily migrate projects from one server to another). There must be some load balancing issues (collection server relays?) that are beyond our control, but perhaps we can set some better parameters. We'll continue to push on this.
I would be surprised if this issue was a load-balancing thing, since it is more likely that it is a TCP packet loss/corruption/retransmission thing.

It could be as simple as a security setting in your firewall, so if you have older Cisco FWSM equipment at your network borders, you can try turning TCP sequence number randomisation off.

https://community.cisco.com/t5/security ... n_and_SACK