
WS server for GPU down?

Posted: Tue May 19, 2020 3:20 pm
by Reuzenkakatoe
Good day,

I'm aware of the recent server overload issues but I would still like to make sure that my system is fully operational.

I have moved my (previously working with F@H) GPU card to another machine. My new machine has one CPU slot and one GPU slot.
The CPU slot is working fine but the GPU fails to contact the F@H server. Running F@H 7.6.13. The relevant part of the logfile:

14:47:44:WU01:FS01:Connecting to assign1.foldingathome.org:80
14:47:45:WU01:FS01:Assigned to work server 192.0.2.1
14:47:45:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:GF108 [GeForce GT 630] 311 from 192.0.2.1
14:47:45:WU01:FS01:Connecting to 192.0.2.1:8080
14:48:06:WARNING:WU01:FS01:WorkServer connection failed on port 8080 trying 80
14:48:06:WU01:FS01:Connecting to 192.0.2.1:80
14:48:27:ERROR:WU01:FS01:Exception: Failed to connect to 192.0.2.1:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.

I have configured the Windows 10 firewall to allow F@H through - and also tried with the firewall disabled.
Below a Trace route:

C:\Windows\system32>tracert 192.0.2.1

Tracing route to 192.0.2.1 over a maximum of 30 hops

  1    <1 ms    <1 ms    <1 ms  192.168.0.253    # <- my home router
  2     8 ms     6 ms     5 ms  10.255.168.1     # <- my ISP
  3     8 ms     6 ms     7 ms  mnd-rc0001-cr101-ae95-0.core.as9143.net [213.51.175.217]
  4     *        *        *     Request timed out.
  5     *        *        *     Request timed out.
  6     *        *        *     Request timed out.
  7     *        *        *     Request timed out.

This has been going on for several days now, so I would just like a quick confirmation: is the F@H WS really down or is my system to blame?
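
One rough way to separate "server down" from "my system" is to repeat the client's connection attempt by hand from the same machine. The sketch below is only an illustration: it assumes the 8080-then-80 order seen in the log above, and the function name and the `connect` parameter are made up for the example (the latter only so the logic can be exercised without a network).

```python
import socket


def first_reachable_port(host, ports=(8080, 80), timeout=5.0,
                         connect=socket.create_connection):
    """Try each port in order and return the first one that accepts a TCP connection.

    Returns None if every attempt fails or times out.
    The port order mirrors the log above; it is not taken from F@H documentation.
    """
    for port in ports:
        try:
            conn = connect((host, port), timeout=timeout)
        except OSError:
            # Covers refused connections, timeouts and name-resolution errors.
            continue
        conn.close()
        return port
    return None


if __name__ == "__main__":
    # Address taken from the log above.
    print(first_reachable_port("192.0.2.1"))
```

A result of `None` only says that no TCP connection could be opened from this machine; on its own it does not say why.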

Thanks very much in advance,

Re: WS server for GPU down?

Posted: Tue May 19, 2020 3:23 pm
by Neil-B
Can you post your log, including the top 200 lines, so we can see the configuration … That message indicates there may be an issue, since 192.0.2.1 is where the AS (assignment server) sends requests when the GPU is not capable of folding for some reason or other.

For help on posting logs please see viewtopic.php?f=61&t=26036

Your card may be right on the borderline of being able to fold - there have been a few issues recently where cards have "dropped off" the list where they shouldn't … I'll link a post explaining shortly.

Different card … but might be the same type of issue … viewtopic.php?f=80&t=34525&start=15#p334242.

Was the swap from one machine to the other "straight away" or was there some time lag (weeks/months) between this?

Re: WS server for GPU down?

Posted: Tue May 19, 2020 3:48 pm
by JimboPalmer
14:47:45:WU01:FS01:Assigned to work server 192.0.2.1
14:47:45:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:GF108 [GeForce GT 630] 311 from 192.0.2.1

192.0.2.1 is where the software sends you if your hardware won't fold.
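
To check a longer log for the same symptom, something like the following could be used. It is a minimal sketch, assuming the log line format quoted in this thread and treating 192.0.2.1 as the placeholder address described above; the function name is invented for the example.

```python
import re

# Address this thread describes as the destination for hardware that won't fold.
PLACEHOLDER_WS = "192.0.2.1"

# Matches lines such as "14:47:45:WU01:FS01:Assigned to work server 192.0.2.1"
ASSIGN_RE = re.compile(
    r"^\d{2}:\d{2}:\d{2}:WU\d+:(FS\d+):Assigned to work server (\S+)$"
)


def slots_sent_to_placeholder(log_lines):
    """Return the folding slots (e.g. 'FS01') that were assigned to the placeholder address."""
    slots = []
    for line in log_lines:
        match = ASSIGN_RE.match(line.strip())
        if match and match.group(2) == PLACEHOLDER_WS:
            slots.append(match.group(1))
    return slots
```

Any slot this returns was never pointed at a real work server, so the connection failures that follow in the log are expected.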

https://www.techpowerup.com/gpu-specs/g ... t-630.c816

Your card meets the minimum specifications to begin folding: it supports OpenCL 1.1 and double-precision floating-point math. (F@H actually tests with OpenCL 1.2.)

With only 96 threads, I doubt it can finish folding before the deadline.

Re: WS server for GPU down?

Posted: Tue May 19, 2020 4:15 pm
by Neil-B
Those threads are double speed, so this card is slightly quicker than my K420 (according to various comparison sites, as far as one can trust these), which has shaved in below the deadline on the few WUs I have run on it … You may find that at the moment, with a large pool of GPU folders with fast kit, most WUs that you fold with this (if we can get it working for you) will have been reissued at timeout and returned by a quicker machine well before yours finishes the WU … It will be worth using the WU Status app to check this.

Re: WS server for GPU down?

Posted: Tue May 19, 2020 6:32 pm
by Reuzenkakatoe
Problem solved! All hints together made me decide to put the GPU card back into its original machine.
It's working again. The working rig is an AMD A10-5800K on a fast Gigabyte motherboard with 32 Gigs of very fast RAM.
The non-working rig was a rather ancient Intel Q6600 on a prehistoric Asus P5 main board with only 4 Gigs of DDR2. Also a quad-core, but way slower than the A10-Gigabyte combination.
The F@H server probably correctly rejected this slow old piece of junk. I guess a GPU relies on support from a decent CPU on a fast system bus.

Next time I'll try to do the obvious instead of calling for help. I apologize for taking up you people's valuable time.

Great, so many very helpful answers in such a short time. You're all great people and I will endeavor to support Folding@Home.
You all helped me out in a tremendous way. Thanks so much to all of you!

Re: WS server for GPU down?

Posted: Tue May 19, 2020 6:42 pm
by bruce
Reuzenkakatoe wrote:I guess a GPU relies on support from a decent CPU on a fast system
Actually, the definition of a decent CPU is pretty broad. Each GPU will generally need one CPU thread dedicated to it. (It can be a thread on a pretty slow CPU if that's what you have in your system.)

Re: WS server for GPU down?

Posted: Tue May 19, 2020 6:50 pm
by HugoNotte
I have tried folding on a GT 630M, which has the same specs as the desktop version apart from being 150 MHz slower. But I don't think that would make a big difference. It's really not worth it, since it exceeded the Timeout on every WU. It did manage to finish before Deadline, but I feel GPUs that regularly exceed Timeout put additional strain on the server, since the same WU then gets sent out again.

Re: WS server for GPU down?

Posted: Tue May 19, 2020 7:06 pm
by Neil-B
At the moment, while the available GPU resource is greater than the GPU WU supply (for the most part), timed-out WUs do get reissued fairly soon after Timeout … Once the GPU WU pool grows to meet demand (or the GPU resource shrinks), timed-out WUs only get reissued once they reach the head of the queue. So under "normal" circumstances, as long as the GPU folds WUs within Deadline, it would probably be worth doing so; under current circumstances there is a fair chance that a reissued WU will complete well within the original Deadline.

Re: WS server for GPU down?

Posted: Thu May 21, 2020 8:06 am
by Reuzenkakatoe
HugoNotte wrote:I have tried folding on a GT 630M, which has the same specs as the desktop version apart from being 150 MHz slower. But I don't think that would make a big difference. It's really not worth it, since it exceeded the Timeout on every WU. It did manage to finish before Deadline, but I feel GPUs that regularly exceed Timeout put additional strain on the server, since the same WU then gets sent out again.
My GT 630 also often exceeds the timeout. Timeouts are often 24 hours, which is rather tight. However, my understanding was that finishing before the expiration date is still useful to the F@H project. Is this true? If not, I'll turn off the GPU altogether so that the running CPU threads get some more breathing space.
Interesting stuff, this.

Re: WS server for GPU down?

Posted: Thu May 21, 2020 8:49 am
by Neil-B
Under normal circumstances even after timeout the WU would be in a queueing system and might not be reissued for days - so missing timeout may well still mean you return the WU before anyone else, even right the way up to expiration.

At the moment there is a fair chance that a WU will be reissued fairly shortly after timeout, so if you miss timeout by more than the time a fast GPU needs to fold the WU, that GPU may well get there first.

Since the reissued WU might end up on another slower GPU, for the most part, if the WU you are processing will clearly complete before expiration, then letting it do so seems fine to me ... Some people may advise that your GPU is too slow, but really, if it can complete within expiration then it isn't, imo (and by definition in FAH's opinion).

Re: WS server for GPU down?

Posted: Thu May 21, 2020 6:53 pm
by Reuzenkakatoe
Thanks Neil for your clear answer. Much appreciated.

Re: WS server for GPU down?

Posted: Thu May 21, 2020 7:04 pm
by bruce
Suppose your GPU doesn't finish the WU before that Timeout clock expires and the WU does get duplicated. You've already got a 1 day (or whatever) head-start on completing the WU. As has already been stated, there's no way to predict which person will complete it first and who will complete it second, so it's certainly worth continuing to work on the WU. As far as science is concerned, the first return of the result will generate the next Gen in that trajectory and science will move on. FAH gives both persons credit for finishing the WU as long as it's before the Final Deadline.

Everyone's deadlines are established at the time the WU is assigned so yours don't coincide with the other person's deadlines.
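
To make that concrete, here is a small worked example. The 1-day Timeout and 3-day Final Deadline are hypothetical numbers chosen for illustration, as is the assumption that the WU is reissued the moment the first Timeout passes; the function name is invented.

```python
from datetime import datetime, timedelta


def deadlines_for(assigned_at, timeout, expiration):
    """Return (timeout_at, expires_at) measured from this donor's own assignment time."""
    return assigned_at + timeout, assigned_at + expiration


# Hypothetical project settings, not taken from any real WU.
TIMEOUT = timedelta(days=1)
EXPIRATION = timedelta(days=3)

# First donor receives the WU.
first_assigned = datetime(2020, 5, 19, 15, 0)
first_timeout, first_expiry = deadlines_for(first_assigned, TIMEOUT, EXPIRATION)

# Assume the WU is reissued to a second donor exactly when the first Timeout passes.
second_assigned = first_timeout
second_timeout, second_expiry = deadlines_for(second_assigned, TIMEOUT, EXPIRATION)

print("first donor expires: ", first_expiry)   # 2020-05-22 15:00:00
print("second donor expires:", second_expiry)  # 2020-05-23 15:00:00
```

With these numbers the first donor still has two days left on their own clock when the copy goes out, which is the head-start mentioned above.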

Re: WS server for GPU down?

Posted: Thu May 21, 2020 7:22 pm
by PantherX
bruce wrote:...FAH gives both persons credit for finishing the WU as long as it's before the Final Deadline.
Just to expand what bruce said, here's the points overview:
Before the Timeout: Base credits + Bonus points
After Timeout and before Expiration: Base credits
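
The same overview as a short sketch. The function name is invented, treating a return exactly at the Timeout as "before" is an assumption, and the "none" case after Expiration is inferred from bruce's remark about the Final Deadline rather than stated in the list above.

```python
def credit_for_return(elapsed_hours, timeout_hours, expiration_hours):
    """Classify the credit for a WU returned `elapsed_hours` after it was assigned."""
    if timeout_hours > expiration_hours:
        raise ValueError("Timeout cannot be later than Expiration")
    if elapsed_hours <= timeout_hours:
        # Boundary handling here is an assumption.
        return "base + bonus"
    if elapsed_hours <= expiration_hours:
        return "base"
    # Inferred: credit is only described for returns before the Final Deadline.
    return "none"
```

For example, with a 24-hour Timeout and a hypothetical 72-hour Expiration, a WU returned after 30 hours would fall in the "base" category.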

Re: WS server for GPU down?

Posted: Thu May 21, 2020 8:46 pm
by Joe_H
Reuzenkakatoe wrote:
HugoNotte wrote:I have tried folding on a GT 630M, which has the same specs as the desktop version apart from being 150 MHz slower. But I don't think that would make a big difference. It's really not worth it, since it exceeded the Timeout on every WU. It did manage to finish before Deadline, but I feel GPUs that regularly exceed Timeout put additional strain on the server, since the same WU then gets sent out again.
My GT 630 also often exceeds the timeout. Timeouts are often 24 hours, which is rather tight. However, my understanding was that finishing before the expiration date is still useful to the F@H project. Is this true? If not, I'll turn off the GPU altogether so that the running CPU threads get some more breathing space.
Interesting stuff, this.
Besides the other answers given, it is hard to put out a hard and fast rule for the GT 630 and some of the other cards nVidia has branded on the low end. They have used the GT 630 designation for at least 4 different desktop cards. Two of them were based on the Fermi GPU chips, basically rebranded GT 400 series cards. The other two were based on Kepler chips and have more shader cores than the Fermi based cards.

Re: WS server for GPU down?

Posted: Mon Jun 01, 2020 9:25 pm
by Boluker
Hi folks! Our poor work servers are straining under the load from all of you amazing folks contributing your spare computing cycles. We're actively working to spin up more servers on our end, but please bear with us: it may take another day or two before we can fully scale up to handle the load.