WS server for GPU down?


Reuzenkakatoe
Posts: 6
Joined: Wed Apr 01, 2020 11:41 am

Post by Reuzenkakatoe »

Good day,

I'm aware of the recent server overload issues but I would still like to make sure that my system is fully operational.

I have moved my (previously working with F@H) GPU card to another machine. My new machine has one CPU slot and one GPU slot.
The CPU slot is working fine but the GPU fails to contact the F@H server. Running F@H 7.6.13. The relevant part of the logfile:

14:47:44:WU01:FS01:Connecting to assign1.foldingathome.org:80
14:47:45:WU01:FS01:Assigned to work server 192.0.2.1
14:47:45:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:GF108 [GeForce GT 630] 311 from 192.0.2.1
14:47:45:WU01:FS01:Connecting to 192.0.2.1:8080
14:48:06:WARNING:WU01:FS01:WorkServer connection failed on port 8080 trying 80
14:48:06:WU01:FS01:Connecting to 192.0.2.1:80
14:48:27:ERROR:WU01:FS01:Exception: Failed to connect to 192.0.2.1:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.

I have configured the Windows 10 firewall to allow F@H through - and also tried with the firewall disabled.
Below is a traceroute:

C:\Windows\system32>tracert 192.0.2.1

Tracing route to 192.0.2.1 over a maximum of 30 hops

1 <1 ms <1 ms <1 ms 192.168.0.253 # <- my home router
2 8 ms 6 ms 5 ms 10.255.168.1 # <- my ISP
3 8 ms 6 ms 7 ms mnd-rc0001-cr101-ae95-0.core.as9143.net [213.51.175.217]
4 * * * Request timed out.
5 * * * Request timed out.
6 * * * Request timed out.
7 * * * Request timed out.

This has been going on for several days now, so I would just like a quick confirmation: is the F@H WS really down or is my system to blame?

Thanks very much in advance,
Neil-B
Posts: 2027
Joined: Sun Mar 22, 2020 5:52 pm
Hardware configuration: 1: 2x Xeon E5-2697v3@2.60GHz, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent 20H2, Quadro K420 1GB, FAH 7.6.21
2: Xeon E3-1505Mv5@2.80GHz, 32GB DDR4, NVME, Win10 Pro 20H2, Quadro M1000M 2GB, FAH 7.6.21 (actually have two of these)
3: i7-960@3.20GHz, 12GB DDR3, SSD, Win10 Pro 20H2, GTX 750Ti 2GB, GTX 1080Ti 11GB, FAH 7.6.21
Location: UK

Re: WS server for GPU down?

Post by Neil-B »

Can you post your log, including the top 200 lines, so we can see the configuration? That message indicates there may be an issue, since 192.0.2.1 is where the AS (assignment server) sends requests when the GPU is not capable of folding for some reason or other.

For help on posting logs please see viewtopic.php?f=61&t=26036

Your card may be right on the borderline of being able to fold - there have been a few issues recently where cards have "dropped off" the list where they shouldn't … I'll link a post explaining shortly.

Different card … but might be the same type of issue … viewtopic.php?f=80&t=34525&start=15#p334242.

Was the swap from one machine to the other "straight away" or was there some time lag (weeks/months) between this?
2x Xeon E5-2697v3, 512GB DDR4 LRDIMM, SSD Raid, W10-Ent, Quadro K420
Xeon E3-1505Mv5, 32GB DDR4, NVME, W10-Pro, Quadro M1000M
i7-960, 12GB DDR3, SSD, W10-Pro, GTX1080Ti
i9-10850K, 64GB DDR4, NVME, W11-Pro, RTX3070

(Green/Bold = Active)
JimboPalmer
Posts: 2573
Joined: Mon Feb 16, 2009 4:12 am
Location: Greenwood MS USA

Re: WS server for GPU down?

Post by JimboPalmer »

14:47:45:WU01:FS01:Assigned to work server 192.0.2.1
14:47:45:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:GF108 [GeForce GT 630] 311 from 192.0.2.1

192.0.2.1 is where the software sends you if your hardware won't fold
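
As a rough illustration of the point above, a client log can be scanned for assignments to that placeholder address. This is only a sketch: the line format is inferred from the log excerpt in this thread, `find_placeholder_assignments` is a made-up helper and not part of the F@H client, and the check relies on 192.0.2.0/24 being the TEST-NET-1 documentation range (RFC 5737), which is not expected to be reachable on the public internet.

```python
import ipaddress
import re

# RFC 5737 reserves 192.0.2.0/24 (TEST-NET-1) for documentation,
# so a "work server" in this range cannot actually be contacted.
TEST_NET_1 = ipaddress.ip_network("192.0.2.0/24")

# Line format assumed from the log excerpt posted in this thread.
ASSIGN_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):(WU\d+):(FS\d+):Assigned to work server (\S+)$"
)


def find_placeholder_assignments(log_text):
    """Return (time, slot, address) for every assignment to a TEST-NET-1 address."""
    hits = []
    for line in log_text.splitlines():
        match = ASSIGN_RE.match(line.strip())
        if not match:
            continue
        time, _work_unit, slot, addr = match.groups()
        try:
            ip = ipaddress.ip_address(addr)
        except ValueError:
            continue  # not an IP literal (e.g. a hostname); skip it
        if ip in TEST_NET_1:
            hits.append((time, slot, addr))
    return hits


SAMPLE_LOG = """\
14:47:44:WU01:FS01:Connecting to assign1.foldingathome.org:80
14:47:45:WU01:FS01:Assigned to work server 192.0.2.1
14:47:45:WU01:FS01:Connecting to 192.0.2.1:8080
"""

if __name__ == "__main__":
    for time, slot, addr in find_placeholder_assignments(SAMPLE_LOG):
        print(f"{time} {slot}: assigned to placeholder address {addr}")
```

If this reports a hit for a GPU slot, the connection timeouts and the dead traceroute are a symptom of the assignment, not of a firewall or routing fault on the local network.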

https://www.techpowerup.com/gpu-specs/g ... t-630.c816

Your card meets the minimum specifications to begin folding: it supports OpenCL 1.1 and double precision floating point math. (F@H actually tests with OpenCL 1.2.)

With only 96 threads, I doubt it can finish folding before the deadline.
Tsar of all the Rushers
I tried to remain childlike, all I achieved was childish.
A friend to those who want no friends
Neil-B
Posts: 2027
Joined: Sun Mar 22, 2020 5:52 pm
Location: UK

Re: WS server for GPU down?

Post by Neil-B »

Those threads are double speed, so this card is slightly quicker than my K420 (according to various comparison sites - as far as one can trust these), which has shaved in below the deadline on the few WUs I have run on it. You may find that at the moment, with a large pool of GPU folders with fast kit, most WUs that you fold with this (if we can get it working for you) will have been reissued at timeout and returned by a quicker machine well before yours finishes the WU. It will be worth using the WU Status app to check this.
Reuzenkakatoe
Posts: 6
Joined: Wed Apr 01, 2020 11:41 am

Re: WS server for GPU down?

Post by Reuzenkakatoe »

Problem solved! All hints together made me decide to put the GPU card back into its original machine.
It's working again. The working rig is an AMD K10-5800K on a fast Gigabyte motherboard with 32 Gigs of very fast RAM.
The non-working rig was a rather ancient Intel Q6600 on a prehistoric Asus P5 main board with only 4 Gigs of DDR2. It is also a quad-core, but way slower than the K10-Gigabyte combination.
The F@H server probably correctly rejected this slow old piece of junk. I guess a GPU relies on support from a decent CPU on a fast system bus.

Next time I'll try to do the obvious instead of calling for help. I apologize for taking up your valuable time.

Great, so many very helpful answers in such a short time. You're all great people and I will endeavor to support Folding@Home.
You all helped me out in a tremendous way. Thanks so much to all of you!
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: WS server for GPU down?

Post by bruce »

Reuzenkakatoe wrote:I guess a GPU relies on support from a decent CPU on a fast system
Actually, the definition of a decent CPU is pretty broad. Each GPU will generally need one CPU thread which will be dedicated to it. (It can be a thread on a pretty slow CPU if that's what you have in your system.)
HugoNotte
Posts: 70
Joined: Tue Apr 07, 2020 7:09 pm

Re: WS server for GPU down?

Post by HugoNotte »

I have tried folding on a GT 630M, which has the same specs as the desktop version apart from being 150 MHz slower. But I don't think that would make a big difference. It's really not worth it, since it exceeded the Timeout on every WU. It did manage to finish before Deadline, but I feel GPUs that regularly exceed Timeout put additional strain on the server, since the same WU then gets sent out again.
Neil-B
Posts: 2027
Joined: Sun Mar 22, 2020 5:52 pm
Location: UK

Re: WS server for GPU down?

Post by Neil-B »

At the moment, where the GPU resource available is greater than the GPU WU availability (for the most part), timed-out WUs do get reissued fairly soon after Timeout. Once the GPU WU pool grows to meet demand (or the GPU resource shrinks), timed-out WUs only get reissued once they get to the head of the queue. So under "normal" circumstances, as long as the GPU folds WUs within Deadline, it would probably be worth doing so. Under current circumstances there is a fair chance that a reissued WU will complete well within the original Deadline.
Reuzenkakatoe
Posts: 6
Joined: Wed Apr 01, 2020 11:41 am

Re: WS server for GPU down?

Post by Reuzenkakatoe »

HugoNotte wrote:I have tried folding on a GT 630M, which has the same specs as the desktop version apart from being 150 MHz slower. But I don't think that would make a big difference. It's really not worth it, since it exceeded the Timeout on every WU. It did manage to finish before Deadline, but I feel GPUs that regularly exceed Timeout put additional strain on the server, since the same WU then gets sent out again.
My GT 630 also often exceeds the timeout. Timeouts are often 24 hours, which is rather tight. However, my understanding was that finishing before the expiration date is still useful to the F@H project. Is this true? If not, I'll turn off the GPU altogether so that the running CPU threads get some more breathing space.
Interesting stuff, this.
Neil-B
Posts: 2027
Joined: Sun Mar 22, 2020 5:52 pm
Location: UK

Re: WS server for GPU down?

Post by Neil-B »

Under normal circumstances even after timeout the WU would be in a queueing system and might not be reissued for days - so missing timeout may well still mean you return the WU before anyone else, even right the way up to expiration.

At the moment there is a fair chance that a WU will be reissued fairly shortly after timeout, so if you miss timeout by more than the time it takes a fast GPU to fold the WU, that GPU may well get there first.

Since the reissued WU might end up on another slower GPU, for the most part, if the WU you are processing will clearly complete before expiration, then letting it do so seems fine to me. Some people may advise that your GPU is too slow, but really, if it can complete within expiration then it isn't, imo (and, by definition, in FAH's opinion).
Reuzenkakatoe
Posts: 6
Joined: Wed Apr 01, 2020 11:41 am

Re: WS server for GPU down?

Post by Reuzenkakatoe »

Thanks Neil for your clear answer. Much appreciated.
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: WS server for GPU down?

Post by bruce »

Suppose your GPU doesn't finish the WU before that Timeout clock expires and the WU does get duplicated. You've already got a 1 day (or whatever) head-start on completing the WU. As has already been stated, there's no way to predict which person will complete it first and who will complete it second, so it's certainly worth continuing to work on the WU. As far as science is concerned, the first return of the result will generate the next Gen in that trajectory and science will move on. FAH gives both persons credit for finishing the WU as long as it's before the Final Deadline.

Everyone's deadlines are established at the time the WU is assigned so yours don't coincide with the other person's deadlines.
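
To make the head-start argument concrete, here is a small sketch of how the two sets of deadlines line up. The 1-day Timeout matches the figure mentioned in this thread; the 3-day Expiration, the dates, and the `deadlines` helper are purely illustrative assumptions, since real values vary by project.

```python
from datetime import datetime, timedelta


def deadlines(assigned_at, timeout, expiration):
    """Both clocks start at the moment the WU is assigned to this donor."""
    return assigned_at + timeout, assigned_at + expiration


# Hypothetical project settings (assumed for illustration only).
TIMEOUT = timedelta(days=1)
EXPIRATION = timedelta(days=3)

# First donor (the slow GPU) receives the WU.
first_assigned = datetime(2020, 6, 1, 12, 0)
first_timeout, first_expiry = deadlines(first_assigned, TIMEOUT, EXPIRATION)

# Worst case for the first donor: the WU is reissued the instant it times out.
second_assigned = first_timeout
second_timeout, second_expiry = deadlines(second_assigned, TIMEOUT, EXPIRATION)

# The first donor has already been working for this long when the copy goes out.
head_start = second_assigned - first_assigned

if __name__ == "__main__":
    print("first donor :", first_timeout, "/", first_expiry)
    print("second donor:", second_timeout, "/", second_expiry)
    print("head start  :", head_start)
```

Each donor's Timeout and Expiration are offset by their own assignment time, so the second copy's deadlines sit later than the first donor's by however long the reissue took.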
PantherX
Site Moderator
Posts: 7020
Joined: Wed Dec 23, 2009 9:33 am
Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
§
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB

Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
Location: Land Of The Long White Cloud

Re: WS server for GPU down?

Post by PantherX »

bruce wrote:...FAH gives both persons credit for finishing the WU as long as it's before the Final Deadline.
Just to expand what bruce said, here's the points overview:
Before the Timeout: Base credits + Bonus points
After Timeout and before Expiration: Base credits
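
The overview above can be written as a tiny classifier. This is a sketch only: `credit_for_return` is a hypothetical helper, the "none" case after Expiration is inferred from bruce's remark about the Final Deadline, and how a return landing exactly on a boundary is treated is a guess.

```python
from datetime import datetime, timedelta


def credit_for_return(assigned_at, returned_at, timeout, expiration):
    """Classify the credit for a WU based on how long after assignment it came back."""
    elapsed = returned_at - assigned_at
    if elapsed < timeout:
        return "base + bonus"  # returned before the Timeout
    if elapsed < expiration:
        return "base"          # after Timeout, before Expiration
    return "none"              # assumed: nothing once the WU has expired


if __name__ == "__main__":
    assigned = datetime(2020, 6, 1, 12, 0)
    one_day, three_days = timedelta(days=1), timedelta(days=3)  # illustrative values
    for hours in (20, 30, 80):
        returned = assigned + timedelta(hours=hours)
        print(hours, "h ->", credit_for_return(assigned, returned, one_day, three_days))
```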
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time

Joe_H
Site Admin
Posts: 7856
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: WS server for GPU down?

Post by Joe_H »

Reuzenkakatoe wrote:
HugoNotte wrote:I have tried folding on a GT 630M, which has the same specs as the desktop version apart from being 150 MHz slower. But I don't think that would make a big difference. It's really not worth it, since it exceeded the Timeout on every WU. It did manage to finish before Deadline, but I feel GPUs that regularly exceed Timeout put additional strain on the server, since the same WU then gets sent out again.
My GT 630 also often exceeds the timeout. Timeouts are often 24 hours, which is rather tight. However, my understanding was that finishing before the expiration date is still useful to the F@H project. Is this true? If not, I'll turn off the GPU altogether so that the running CPU threads get some more breathing space.
Interesting stuff, this.
Besides the other answers given, it is hard to put out a hard and fast rule for the GT 630 and some of the other cards nVidia has branded on the low end. They have used the GT 630 designation for at least 4 different desktop cards. Two of them were based on the Fermi GPU chips, basically rebranded GT 400 series cards. The other two were based on Kepler chips and have more shader cores than the Fermi based cards.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Boluker
Posts: 1
Joined: Mon Jun 01, 2020 9:22 pm

Re: WS server for GPU down?

Post by Boluker »

Hi folks! Our poor work servers are straining under the load from all of you amazing folks contributing your spare computing cycles. We're actively working to spin up more servers on our end, but please bear with us: it may take another day or two before we can fully scale up to handle the load.