Sporadic issues with 171.67.108.xx servers

Moderators: Site Moderators, PandeGroup

Re: Sporadic issues with 171.67.108.xx servers

Postby foldy » Mon Jun 12, 2017 8:58 pm

Any chance to try a different LAN switch? But it still may be something totally different.
foldy
 
Posts: 911
Joined: Sat Dec 01, 2012 3:43 pm

Re: Sporadic issues with 171.67.108.xx servers

Postby ComputerGenie » Mon Jun 12, 2017 9:20 pm

foldy wrote:Any chance to try a different LAN switch? But it still may be something totally different.

For entirely unrelated reasons (the 8-port that was there was needed elsewhere), the 4-port that they share was just replaced on Saturday (and my issue started on Wednesday). And I'm nearly certain that there are 0 issues with the 24-port that the Win 7 box is on.
I'm at my wits' end over this, mostly because of everything else that is connected and working fine on the same network. :(
ComputerGenie
 
Posts: 242
Joined: Mon Dec 12, 2016 4:06 am

Re: Sporadic issues with 171.67.108.xx servers

Postby bruce » Tue Jun 13, 2017 12:02 am

Does your router (or Proxy) use a smart protocol that drops "unnecessary" packets? I've seen cases where that caused errors.

I'm particularly interested in any one thing that's between each of your machines and the internet so the idea that it might be a flaky switch was a good one.
bruce
 
Posts: 21278
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Sporadic issues with 171.67.108.xx servers

Postby _r2w_ben » Tue Jun 13, 2017 1:06 am

Do you by any chance have multiple ISPs with load balancing between the connections?
_r2w_ben
 
Posts: 115
Joined: Wed Apr 23, 2008 3:11 pm

Re: Sporadic issues with 171.67.108.xx servers

Postby ComputerGenie » Tue Jun 13, 2017 1:21 am

bruce wrote:Does your router (or Proxy) use a smart protocol that drops "unnecessary" packets? I've seen cases where that caused errors.

I'm particularly interested in any one thing that's between each of your machines and the internet so the idea that it might be a flaky switch was a good one.
The routers are just C7s (the issue remains with "DoS Protection" on or off; as far as I know, that's the only filter of that type) and we use base-model TP-Link switches (whose "bad packets" counters all read 0).

_r2w_ben wrote:Do you by any chance have multiple ISPs with load balancing between the connections?
Sadly, 1 ISP (and 1 connection per) is all we can get here.
ComputerGenie
 
Posts: 242
Joined: Mon Dec 12, 2016 4:06 am

Re: Sporadic issues with 171.67.108.xx servers

Postby foldy » Tue Jun 13, 2017 10:25 am

If we cannot find the cause of the stuck download problem, a workaround could be a script that checks every minute whether a GPU slot is idle, then waits 20 minutes to see whether it starts running again (the maximum download time is 15 minutes, I think, before the server closes the connection). If it is still not running, the FahClient is restarted.

I'm not sure about the server timeout of a download, is it 15min? If we restart too early, a large but healthy download might never finish within our timeout.

A similar workaround could be implemented in the FahClient itself except that it would only cancel the download and restart the slot.
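
The watchdog idea above could be sketched roughly like this in Python. This is only a sketch under assumptions: `slot_is_idle` and `restart_client` are hypothetical hooks that the caller has to supply (nothing here talks to FahClient itself), and the 20 minute grace period assumes the server timeout really is 15 minutes.

```python
import time
from typing import Callable, Optional

CHECK_INTERVAL_S = 60      # poll once a minute, as suggested above
GRACE_PERIOD_S = 20 * 60   # 20 min, longer than the assumed 15 min server timeout


def update_idle_since(idle_since: Optional[float], slot_idle: bool,
                      now: float) -> Optional[float]:
    """Remember when the slot first went idle; forget it once it runs again."""
    if not slot_idle:
        return None
    return now if idle_since is None else idle_since


def should_restart(idle_since: Optional[float], now: float,
                   grace_s: float = GRACE_PERIOD_S) -> bool:
    """True once the slot has been idle for the whole grace period."""
    return idle_since is not None and now - idle_since >= grace_s


def watchdog(slot_is_idle: Callable[[], bool],
             restart_client: Callable[[], None],
             clock: Callable[[], float] = time.monotonic,
             sleep: Callable[[float], None] = time.sleep,
             max_checks: Optional[int] = None) -> int:
    """Poll the slot and restart the client if it stays idle too long.

    slot_is_idle and restart_client are hypothetical hooks, not FahClient APIs.
    Returns the number of restarts that were triggered.
    """
    idle_since: Optional[float] = None
    restarts = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        now = clock()
        idle_since = update_idle_since(idle_since, slot_is_idle(), now)
        if should_restart(idle_since, now):
            restart_client()
            restarts += 1
            idle_since = None  # start a fresh grace period after the restart
        checks += 1
        sleep(CHECK_INTERVAL_S)
    return restarts
```

Keeping the clock and sleep injectable means the timing logic can be checked with a fake clock before it is pointed at a real client. How to detect an idle slot and how to restart the client are left open on purpose.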
foldy
 
Posts: 911
Joined: Sat Dec 01, 2012 3:43 pm

Re: Sporadic issues with 171.67.108.xx servers

Postby bruce » Tue Jun 13, 2017 9:43 pm

foldy wrote:I'm not sure about the server timeout of a download, is it 15min?


The default server timeout is, in fact, 15 minutes. If an upload or download cannot be completed by the timeout, the connection is dropped.

In the example posted on the previous page, we see
Code:
11:36:24:WU01:FS01:Connecting to 171.67.108.102:8080
11:36:34:WU01:FS01:Downloading 7.06MiB
11:36:40:WU01:FS01:Download 94.77%
11:36:40:WU01:FS01:Download complete

This took 6 seconds on his connection, so the timeout is not a factor here. Ideally, the protocol recognizes that the download was completed ("Download complete") and the connection closes naturally. Otherwise, the connection remains "active" for another 14m54s before being forcibly dropped.

There are some cases where the server timeout is adjusted (increased). The goal is to let the data package be downloaded or uploaded over especially slow donor connections without unnecessary interruptions.
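
For anyone who wants to check the arithmetic on their own logs, here is a small Python sketch. It assumes the log lines look exactly like the excerpt above, and it does not handle a download that runs past midnight, since the log lines carry no date.

```python
from datetime import datetime, timedelta

SERVER_TIMEOUT = timedelta(minutes=15)  # default timeout quoted above

LOG = """\
11:36:24:WU01:FS01:Connecting to 171.67.108.102:8080
11:36:34:WU01:FS01:Downloading 7.06MiB
11:36:40:WU01:FS01:Download 94.77%
11:36:40:WU01:FS01:Download complete
"""


def stamp(line):
    """Parse the leading HH:MM:SS of a log line (the log has no date)."""
    return datetime.strptime(line[:8], "%H:%M:%S")


def download_duration(log_text):
    """Time from the 'Downloading' line to the 'Download complete' line."""
    start = end = None
    for line in log_text.splitlines():
        if ":Downloading " in line:
            start = stamp(line)
        elif line.endswith("Download complete") and start is not None:
            end = stamp(line)
            break
    if start is None or end is None:
        return None  # the download never finished in this excerpt
    return end - start


elapsed = download_duration(LOG)
remaining = SERVER_TIMEOUT - elapsed
print("download took", elapsed, "- time left before the timeout:", remaining)
```

On the excerpt above this gives 6 seconds elapsed and 14m54s left, which matches the figures in the post. A return value of `None` would point to a download that started but never logged "Download complete".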
bruce
 
Posts: 21278
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Sporadic issues with 171.67.108.xx servers

Postby ComputerGenie » Fri Jun 16, 2017 3:46 pm

Well, as inexplicably as it started, it stopped. Not a single problem in the last 48 hours.
Thanks to all that tried to help. Good folding to all...
ComputerGenie
 
Posts: 242
Joined: Mon Dec 12, 2016 4:06 am

Re: Sporadic issues with 171.67.108.xx servers

Postby ComputerGenie » Wed Jun 21, 2017 9:17 pm

foldy wrote:If we cannot find the cause of the stuck download problem, a workaround could be a script that checks every minute whether a GPU slot is idle, then waits 20 minutes to see whether it starts running again (the maximum download time is 15 minutes, I think, before the server closes the connection). If it is still not running, the FahClient is restarted...

Well, it looks like I'm going to have to get off my lazy butt and write that, because I'm having it all over again. :cry:
ComputerGenie
 
Posts: 242
Joined: Mon Dec 12, 2016 4:06 am

Re: Sporadic issues with 171.67.108.xx servers

Postby bruce » Thu Jun 22, 2017 10:29 pm

It would be helpful to identify SPECIFIC servers that are having this problem. I suspect that the problems are not shared equally across the various 171.67.108.xx servers. If possible, specific error message(s) associated with each server would help, too.

This exchange from the previous page lacks that sort of information:
ComputerGenie wrote:
rwh202 wrote:That log shows a failed download from 171.67.108.157:8080 and a successful one from 171.67.108.102:8080

Have you ever had success from .157 or a failure from another server that has sometimes worked e.g. .102...
It has happened with more than one (ironically, .102 is usually the most afflicted; however, that could just be because it's also the one most designated).
bruce
 
Posts: 21278
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Sporadic issues with 171.67.108.xx servers

Postby bruce » Thu Jun 22, 2017 10:49 pm

ComputerGenie wrote:Is there a possible chance that it's a time sync issue (either in the software or on my end)?
I'm looking at the most recent WU...
on the "Status" tab, it says "Assigned: 2017-06-12T19:02:46Z"
in the log is: "19:02:22:WU00:FS00:0x21:Completed 187500 out of 6250000 steps (3%)"
Meaning that 24 seconds before status claims it was assigned, it was 3% done. :shock:

The "Assigned" time (and, eventually, the Upload-Complete time) is based on the server's clock. (That way, adjusting your local clock does not alter the deadlines or the bonus calculation.) The timestamps in your log are based on your local clock.

Apparently your local clock was not aligned with the server clock, but that shouldn't cause any significant problems.
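
To put a number on the mismatch, the two timestamps from the quoted post can be compared directly. A rough Python sketch, assuming the log timestamp is in UTC and falls on the same date as the "Assigned" stamp (the log line itself carries no date):

```python
from datetime import datetime, timezone

# Both values are copied from the quoted post.
assigned_text = "2017-06-12T19:02:46Z"   # server clock ("Status" tab)
log_text = "19:02:22"                    # local clock (log line at 3%)

# Replace the trailing "Z" so that older Python versions can parse it too.
assigned = datetime.fromisoformat(assigned_text.replace("Z", "+00:00"))

# Assumption: same date as the assignment, and the log is in UTC.
log_time = datetime.strptime(log_text, "%H:%M:%S").time()
local = datetime.combine(assigned.date(), log_time, tzinfo=timezone.utc)

offset = assigned - local
print("local clock is behind the server by at least",
      offset.total_seconds(), "seconds")
```

This gives the 24 seconds mentioned in the quote. Since the WU had already reached 3% when that log line was written, the real offset is presumably somewhat larger than that.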
bruce
 
Posts: 21278
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Sporadic issues with 171.67.108.xx servers

Postby bruce » Thu Jun 22, 2017 10:54 pm

bruce wrote:1) please reset verbosity to the default value.
2) The key piece of information in that log is GPUs: 0. Why do you need to restart?


Did I miss your answers somewhere?
bruce
 
Posts: 21278
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Sporadic issues with 171.67.108.xx servers

Postby ComputerGenie » Fri Jun 23, 2017 12:02 am

bruce wrote:
bruce wrote:1) please reset verbosity to the default value.
2) The key piece of information in that log is GPUs: 0. Why do you need to restart?


Did I miss your answers somewhere?

That 0 was due to a clearing of files. Basically, in FAHClient's mind it was a first run. Subsequent boots showed the correct number of GPUs.
As for why I rebooted and dumped the files: there was not much left to try, and I was willing to try anything.

Seems "Cancer" is the answer for me; Parkinson's, Alzheimer's, and "Any" all lead to the no download issue (and Huntington's was "hit and miss" as to if it would work right or not). I'm not sure why it is that way, but after this much time messing with it and this many hours of no science being done (vs the Parkinson's that I actually want and the reason I got started folding), any science filed is better than nothing.
ComputerGenie
 
Posts: 242
Joined: Mon Dec 12, 2016 4:06 am
