128.252.203.10 problem or WU?

Moderators: Site Moderators, FAHC Science Team

PantherX
Site Moderator
Posts: 7020
Joined: Wed Dec 23, 2009 9:33 am
Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
§
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB

Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
Location: Land Of The Long White Cloud
Contact:

Re: 128.252.203.10 problem or WU?

Post by PantherX »

Neil-B wrote:...I have never worked out if CS is set when project is created, when WU is issued, or continually when infrastructure is adjusted...
The CS is set on a Project level and is optional. If the project started with no CS and then it was later added, it will only take effect on the new WUs, not the old ones. That's my understanding based on observation which may or may not have changed given the various tweaks and optimizations done to the infrastructure over the last month or so.
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time

Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues
Neil-B
Posts: 2027
Joined: Sun Mar 22, 2020 5:52 pm
Hardware configuration: 1: 2x Xeon E5-2697v3@2.60GHz, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent 20H2, Quadro K420 1GB, FAH 7.6.21
2: Xeon E3-1505Mv5@2.80GHz, 32GB DDR4, NVME, Win10 Pro 20H2, Quadro M1000M 2GB, FAH 7.6.21 (actually have two of these)
3: i7-960@3.20GHz, 12GB DDR3, SSD, Win10 Pro 20H2, GTX 750Ti 2GB, GTX 1080Ti 11GB, FAH 7.6.21
Location: UK

Re: 128.252.203.10 problem or WU?

Post by Neil-B »

Ta for that ... makes sense from a number of things I've seen :)
2x Xeon E5-2697v3, 512GB DDR4 LRDIMM, SSD Raid, W10-Ent, Quadro K420
Xeon E3-1505Mv5, 32GB DDR4, NVME, W10-Pro, Quadro M1000M
i7-960, 12GB DDR3, SSD, W10-Pro, GTX1080Ti
i9-10850K, 64GB DDR4, NVME, W11-Pro, RTX3070

(Green/Bold = Active)
esfishox
Posts: 5
Joined: Fri Mar 27, 2020 4:16 am
Hardware configuration: Gigabyte Windforce RTX 4080 on Ubuntu 22.04
Gigabyte WC RTX 3080 LHR on Windows 11

Re: 128.252.203.10 problem or WU?

Post by esfishox »

I saw my WU uploaded after nearly three days of mostly trying.
https://apps.foldingathome.org/wu#proje ... 163&gen=59
GDF
Posts: 8
Joined: Mon May 04, 2020 7:53 pm

Re: 128.252.203.10 problem or WU?

Post by GDF »

The server seems to be rebooting every 20 minutes or so. It often goes 5-10 minutes without updating the last contact timestamp. It has been doing so since I started watching it two days ago. That doesn't sound normal.
level6
Posts: 13
Joined: Tue May 05, 2020 2:35 am
Hardware configuration: See the current list, here: https://www.leper.org/FAH/level6/client_stats.html
Location: Dallas, Texas, USA
Contact:

Re: 128.252.203.10 problem or WU?

Post by level6 »

Hello. We are a new team that has been going for 2 weeks and 2 days.

I have been having the same problem with this troubled server for more than a day:

02:30:40:WU03:FS00:0xa7:Completed 80000 out of 250000 steps (32%)
02:32:40:WU00:FS01:0x22:Completed 390000 out of 1000000 steps (39%)
02:32:50:WU01:FS01:Upload 1.63%
02:32:56:WU01:FS01:Upload 7.05%
02:33:03:WU01:FS01:Upload 11.92%
02:33:03:WARNING:WU01:FS01:Exception: Failed to send results to work server: Transfer failed
02:33:03:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:11760 run:0 clone:2274 gen:19 core:0x22 unit:0x0000002680fccb0a5e6d7ce977531da8
02:33:04:WU01:FS01:Uploading 23.06MiB to 128.252.203.10
02:33:04:WU01:FS01:Connecting to 128.252.203.10:8080
02:33:10:WU01:FS01:Upload 6.23%
02:33:16:WU01:FS01:Upload 13.55%
02:33:22:WU01:FS01:Upload 20.60%
02:33:28:WU01:FS01:Upload 27.91%
02:33:29:WARNING:WU01:FS01:Exception: Failed to send results to work server: Transfer failed
02:34:23:WU03:FS00:0xa7:Completed 82500 out of 250000 steps (33%)
02:37:19:WU00:FS01:0x22:Completed 400000 out of 1000000 steps (40%)
02:38:25:WU03:FS00:0xa7:Completed 85000 out of 250000 steps (34%)
02:39:55:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:11760 run:0 clone:2274 gen:19 core:0x22 unit:0x0000002680fccb0a5e6d7ce977531da8
02:39:55:WU01:FS01:Uploading 23.06MiB to 128.252.203.10
02:39:55:WU01:FS01:Connecting to 128.252.203.10:8080
02:40:16:WARNING:WU01:FS01:WorkServer connection failed on port 8080 trying 80
02:40:16:WU01:FS01:Connecting to 128.252.203.10:80
02:40:19:WARNING:WU01:FS01:Exception: Failed to send results to work server: Failed to connect to 128.252.203.10:80: No connection could be made because the target machine actively refused it.
I see from https://apps.foldingathome.org/serverstats that 155.247.164.213 runs the same version and works on the same project types. Is there any way to force F@H to use another server? Would that even work? Do they only expect results that were assigned? I could set up a proxy to try to force it, maybe?
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: 128.252.203.10 problem or WU?

Post by bruce »

:( I'm frustrated, too. I've repeatedly reported problems with that server to the people who can fix it. They get it running and before long, it fails again. :x
level6

Re: 128.252.203.10 problem or WU?

Post by level6 »

I was working on trying to set up a web proxy, thinking I was going to lose this WU anyway, and that required a reboot. It failed once more, and then:

05:37:19:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:11760 run:0 clone:2274 gen:19 core:0x22 unit:0x0000002680fccb0a5e6d7ce977531da8
05:37:19:WU01:FS01:Uploading 23.06MiB to 128.252.203.10
05:37:19:WU01:FS01:Connecting to 128.252.203.10:8080
05:37:20:WU03:FS00:0xa7:Completed 200000 out of 250000 steps (80%)
05:37:25:WU01:FS01:Upload 7.59%
05:37:31:WU01:FS01:Upload 14.63%
05:37:37:WU01:FS01:Upload 20.87%
05:37:43:WU01:FS01:Upload 27.91%
05:37:49:WU01:FS01:Upload 34.96%
05:37:55:WU01:FS01:Upload 42.00%
05:38:01:WU01:FS01:Upload 49.05%
05:38:07:WU01:FS01:Upload 56.10%
05:38:13:WU01:FS01:Upload 62.87%
05:38:19:WU01:FS01:Upload 66.93%
05:38:25:WU01:FS01:Upload 71.54%
05:38:31:WU01:FS01:Upload 78.59%
05:38:37:WU01:FS01:Upload 85.63%
05:38:43:WU01:FS01:Upload 92.68%
05:38:49:WU01:FS01:Upload 99.72%
05:38:55:WU01:FS01:Upload complete
05:38:55:WU01:FS01:Server responded WORK_ACK (400)
05:38:55:WU01:FS01:Final credit estimate, 27322.00 points
05:38:55:WU01:FS01:Cleaning up
This happens to be a Windows 10 box running FAH 7.6.9, btw. Man, those points are pitiful, but it's arbitrary. It's the WUs that count, right.

We're comin' for ya, Corona.
level6

Re: 128.252.203.10 problem or WU?

Post by level6 »

bruce wrote::( I'm frustrated, too. I've repeatedly reported problems with that server to the people who can fix it. They get it running and before long, it fails again. :x
It was probably your reporting that did it. I see a 16-minute uptime on it, now. Thanks, man!

These guys need some real CS help. I'm sure it's easier to make that judgement never having seen the complexities involved behind that curtain. Still,... I hope they have people dedicated to this sort of thing, and it's not stealing from the bio-physicists' time. I've seen the crazy specs required to run a server. They aren't normal machines, for sure.
Oussebon
Posts: 5
Joined: Mon Mar 16, 2020 3:10 pm

Re: 128.252.203.10 problem or WU?

Post by Oussebon »

One of our machines is still wrestling with sending a WU to this server. Mostly it says Transfer failed, but will periodically start uploading, then get interrupted.

For instance:

*********************** Log Started 2020-05-05T14:55:53Z ***********************
14:55:53:WU00:FS02:Sending unit results: id:00 state:SEND error:NO_ERROR project:11764 run:0 clone:5195 gen:51 core:0x22 unit:0x0000005d80fccb0a5e71130f4744690a
14:55:53:WU00:FS02:Uploading 55.24MiB to 128.252.203.10
14:55:53:WU00:FS02:Connecting to 128.252.203.10:8080
14:55:59:WU00:FS02:Upload 0.45%
14:56:46:WU00:FS02:Upload 0.57%
14:56:53:WU00:FS02:Upload 0.79%
14:56:59:WU00:FS02:Upload 1.24%
14:57:05:WU00:FS02:Upload 1.81%
14:57:12:WU00:FS02:Upload 2.26%
14:57:18:WU00:FS02:Upload 2.38%
14:57:27:WU00:FS02:Upload 2.49%
14:57:33:WU00:FS02:Upload 2.83%
14:57:39:WU00:FS02:Upload 3.28%
14:57:47:WU00:FS02:Upload 3.62%
14:57:53:WU00:FS02:Upload 3.96%
14:58:07:WU00:FS02:Upload 4.07%
14:58:13:WU00:FS02:Upload 4.75%
14:58:20:WU00:FS02:Upload 5.32%
14:58:26:WU00:FS02:Upload 6.00%
14:58:32:WU00:FS02:Upload 6.68%
14:58:39:WU00:FS02:Upload 7.24%
14:58:45:WU00:FS02:Upload 7.81%
14:58:54:WU00:FS02:Upload 8.37%
14:59:00:WU00:FS02:Upload 8.71%
14:59:06:WU00:FS02:Upload 9.28%
14:59:12:WU00:FS02:Upload 9.96%
14:59:21:WU00:FS02:Upload 10.41%
14:59:29:WU00:FS02:Upload 11.09%
14:59:35:WU00:FS02:Upload 11.77%
14:59:43:WU00:FS02:Upload 12.45%
14:59:50:WU00:FS02:Upload 12.56%
14:59:56:WU00:FS02:Upload 12.78%
15:00:02:WU00:FS02:Upload 13.01%
15:00:08:WU00:FS02:Upload 13.35%
15:00:14:WU00:FS02:Upload 13.69%
15:00:20:WU00:FS02:Upload 14.26%
15:00:26:WU00:FS02:Upload 14.82%
15:00:32:WU00:FS02:Upload 15.50%
15:00:38:WU00:FS02:Upload 16.07%
15:00:44:WU00:FS02:Upload 16.63%
15:00:51:WU00:FS02:Upload 17.20%
15:00:57:WU00:FS02:Upload 17.99%
15:01:04:WU00:FS02:Upload 18.55%
15:01:10:WU00:FS02:Upload 19.23%
15:01:17:WU00:FS02:Upload 19.69%
15:01:23:WU00:FS02:Upload 20.14%
15:01:29:WU00:FS02:Upload 20.93%
15:01:36:WU00:FS02:Upload 21.38%
15:01:42:WU00:FS02:Upload 21.84%
15:01:48:WU00:FS02:Upload 22.74%
15:01:58:WU00:FS02:Upload 23.53%
15:02:04:WU00:FS02:Upload 24.44%
15:02:10:WU00:FS02:Upload 25.34%
15:02:16:WU00:FS02:Upload 26.25%
15:02:22:WU00:FS02:Upload 27.27%
15:02:28:WU00:FS02:Upload 28.28%
15:02:34:WU00:FS02:Upload 29.19%
15:02:47:WU00:FS02:Upload 29.87%
15:02:53:WU00:FS02:Upload 30.32%
15:03:01:WU00:FS02:Upload 31.11%
15:03:07:WU00:FS02:Upload 32.13%
15:03:13:WU00:FS02:Upload 33.04%
15:04:50:WU00:FS02:Upload 33.83%
15:04:50:WARNING:WU00:FS02:Exception: Failed to send results to work server: Transfer failed
15:04:51:WU00:FS02:Sending unit results: id:00 state:SEND error:NO_ERROR project:11764 run:0 clone:5195 gen:51 core:0x22 unit:0x0000005d80fccb0a5e71130f4744690a
15:04:51:WU00:FS02:Uploading 55.24MiB to 128.252.203.10
15:04:51:WU00:FS02:Connecting to 128.252.203.10:8080
15:04:54:WARNING:WU00:FS02:WorkServer connection failed on port 8080 trying 80
15:04:54:WU00:FS02:Connecting to 128.252.203.10:80
15:04:58:WARNING:WU00:FS02:Exception: Failed to send results to work server: Failed to connect to 128.252.203.10:80: No connection could be made because the target machine actively refused it.
15:06:28:WU00:FS02:Sending unit results: id:00 state:SEND error:NO_ERROR project:11764 run:0 clone:5195 gen:51 core:0x22 unit:0x0000005d80fccb0a5e71130f4744690a
15:06:28:WU00:FS02:Uploading 55.24MiB to 128.252.203.10
15:06:28:WU00:FS02:Connecting to 128.252.203.10:8080
15:06:30:WARNING:WU00:FS02:WorkServer connection failed on port 8080 trying 80
15:06:30:WU00:FS02:Connecting to 128.252.203.10:80
15:06:33:WARNING:WU00:FS02:Exception: Failed to send results to work server: Failed to connect to 128.252.203.10:80: No connection could be made because the target machine actively refused it.
15:09:05:WU00:FS02:Sending unit results: id:00 state:SEND error:NO_ERROR project:11764 run:0 clone:5195 gen:51 core:0x22 unit:0x0000005d80fccb0a5e71130f4744690a
15:09:05:WU00:FS02:Uploading 55.24MiB to 128.252.203.10
15:09:05:WU00:FS02:Connecting to 128.252.203.10:8080

*********************** Log Started 2020-05-05T15:19:53Z ***********************
15:19:53:WU00:FS02:Sending unit results: id:00 state:SEND error:NO_ERROR project:11764 run:0 clone:5195 gen:51 core:0x22 unit:0x0000005d80fccb0a5e71130f4744690a
15:19:53:WU00:FS02:Uploading 55.24MiB to 128.252.203.10
15:19:53:WU00:FS02:Connecting to 128.252.203.10:8080
15:20:08:WU00:FS02:Upload 0.23%
15:20:08:WARNING:WU00:FS02:Exception: Failed to send results to work server: Transfer failed
15:20:09:WU00:FS02:Sending unit results: id:00 state:SEND error:NO_ERROR project:11764 run:0 clone:5195 gen:51 core:0x22 unit:0x0000005d80fccb0a5e71130f4744690a
15:20:09:WU00:FS02:Uploading 55.24MiB to 128.252.203.10
15:20:09:WU00:FS02:Connecting to 128.252.203.10:8080
15:20:24:WU00:FS02:Upload 0.11%
15:20:56:WU00:FS02:Upload 0.23%
15:20:56:WARNING:WU00:FS02:Exception: Failed to send results to work server: Transfer failed
15:21:09:WU00:FS02:Sending unit results: id:00 state:SEND error:NO_ERROR project:11764 run:0 clone:5195 gen:51 core:0x22 unit:0x0000005d80fccb0a5e71130f4744690a
15:21:09:WU00:FS02:Uploading 55.24MiB to 128.252.203.10
15:21:09:WU00:FS02:Connecting to 128.252.203.10:8080
15:22:19:WU00:FS02:Upload 0.23%
15:22:19:WARNING:WU00:FS02:Exception: Failed to send results to work server: Transfer failed
15:22:46:WU00:FS02:Sending unit results: id:00 state:SEND error:NO_ERROR project:11764 run:0 clone:5195 gen:51 core:0x22 unit:0x0000005d80fccb0a5e71130f4744690a
15:22:46:WU00:FS02:Uploading 55.24MiB to 128.252.203.10
15:22:46:WU00:FS02:Connecting to 128.252.203.10:8080
15:23:07:WARNING:WU00:FS02:WorkServer connection failed on port 8080 trying 80
15:23:07:WU00:FS02:Connecting to 128.252.203.10:80
15:23:28:WARNING:WU00:FS02:Exception: Failed to send results to work server: Failed to connect to 128.252.203.10:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
It's been like this for several days now, with the PC and FAHclient left running throughout the night and for most of the daytime too each day. Restarting the PC and/or FAH client seems to make no difference. The server will apparently start taking the upload and then stop, usually after 0.23%, very occasionally - as above - after a lot more. I really thought today would be the day it would actually take the WU, but it failed at just under 34%. I saw the server was restarted recently, but that was before the attempts in the logs above.

Anything to be done?
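A pattern like the one in the logs above (many attempts, most dying at 0.23%, one reaching almost 34%) is easier to see when each upload attempt is collapsed into one record. The following is only a sketch that assumes the log line shapes pasted in this thread; the names `Attempt` and `summarize_uploads` are made up for illustration and are not part of FAHClient.

```python
import re
from dataclasses import dataclass

# Patterns are based only on the log lines quoted in this thread (an assumption
# about the general FAHClient 7.x log format, not a specification of it).
START = re.compile(r"^\d\d:\d\d:\d\d:(WU\d+):FS\d+:Uploading \S+ to (\S+)")
PROGRESS = re.compile(r"^\d\d:\d\d:\d\d:(WU\d+):FS\d+:Upload (\d+(?:\.\d+)?)%")
DONE = re.compile(r"^\d\d:\d\d:\d\d:(WU\d+):FS\d+:Upload complete")
FAILED = re.compile(
    r"^\d\d:\d\d:\d\d:WARNING:(WU\d+):FS\d+:Exception: Failed to send results"
)


@dataclass
class Attempt:
    """One upload attempt for one work unit (hypothetical helper type)."""
    unit: str
    server: str
    peak_percent: float = 0.0
    outcome: str = "unknown"  # "complete", "failed", or "unknown" if the log ends


def summarize_uploads(lines):
    """Collapse raw log lines into a list of upload attempts, in log order."""
    attempts = []
    current = {}  # unit id -> attempt that is still in progress
    for raw in lines:
        line = raw.strip()
        if m := START.match(line):
            attempt = Attempt(unit=m.group(1), server=m.group(2))
            attempts.append(attempt)
            current[m.group(1)] = attempt
        elif m := PROGRESS.match(line):
            attempt = current.get(m.group(1))
            if attempt is not None:
                attempt.peak_percent = max(attempt.peak_percent, float(m.group(2)))
        elif m := DONE.match(line):
            attempt = current.pop(m.group(1), None)
            if attempt is not None:
                attempt.peak_percent = 100.0
                attempt.outcome = "complete"
        elif m := FAILED.match(line):
            attempt = current.pop(m.group(1), None)
            if attempt is not None:
                attempt.outcome = "failed"
    return attempts
```

Feeding it the contents of a client log (for example `summarize_uploads(open("log.txt"))`, where the file name is whatever your installation uses) gives one entry per attempt, so the number of retries and the furthest point each one reached can be quoted in a report.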
GDF

Re: 128.252.203.10 problem or WU?

Post by GDF »

This is only anecdotal: it worked for me, and it might have been complete coincidence. I paused the slot with the problem, waited for the server to reboot (which you can see on the serverstats page by watching uptime roll back to zero), then restarted the slot. The upload went right through.

It's annoying that there have been days of reports about this and no real response. But I get that there are a lot of moving parts and a federated (or loosely collaborative?) management structure. It would just be nice to know that the problem has been formally reported to someone who can address it.
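The workaround above hinges on spotting the moment uptime rolls back to zero. As a rough sketch, that check can be written as follows, assuming you have already collected uptime readings in seconds by some means. How to read them off the serverstats page is deliberately left out, since the page format is not described in this thread, and the function names and the 120-second threshold are illustrative assumptions.

```python
def count_reboots(uptime_samples):
    """Count how often uptime dropped between consecutive samples.

    uptime_samples: uptime readings in seconds, oldest first. A drop from one
    reading to the next is taken to mean the server restarted in between.
    """
    reboots = 0
    for previous, current in zip(uptime_samples, uptime_samples[1:]):
        if current < previous:
            reboots += 1
    return reboots


def just_rebooted(uptime_samples, fresh_seconds=120):
    """True if the newest sample shows a restart within `fresh_seconds`.

    The threshold is arbitrary; it only models "unpause the slot soon after
    the server comes back".
    """
    if len(uptime_samples) < 2:
        return False
    previous, current = uptime_samples[-2], uptime_samples[-1]
    return current < previous and current <= fresh_seconds
```

With readings taken every five minutes, `count_reboots` over a couple of hours would also put a number on the "rebooting every 20 minutes or so" observation reported earlier in the thread.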
anandhanju
Posts: 526
Joined: Mon Dec 03, 2007 4:33 am
Location: Australia

Re: 128.252.203.10 problem or WU?

Post by anandhanju »

Thanks for your reports. The necessary folks have been notified and they will be looking into this.
PantherX

Re: 128.252.203.10 problem or WU?

Post by PantherX »

GDF wrote:...It's annoying that there have been days of reports about this and no real response. But I get that there are a lot of moving parts and a federated (or loosely collaborative?) management structure. It would just be nice to know that the problem has been formally reported to someone who can address it.
Welcome to the F@H Forum GDF,

I do understand your POV and it negatively impacts all involved, the researchers and the donors. However, considering that there are multiple labs involved (https://foldingathome.org/about/the-fol ... onsortium/) across the globe in various countries dealing with various lock-down policies, even on a "good" day, it would take a bit of time. In a pandemic situation, it is a lot harder, but no one has given up; instead, they have doubled down and are working to improve various aspects to ensure that it is fixed. Sometimes, labs will have to involve their internal IT department, which can also add to the delay if it is a University infrastructure limitation like internet or electricity.
level6

Re: 128.252.203.10 problem or WU?

Post by level6 »

There will always be trouble, somewhere, though. We just need a better way of redirecting the work, so we aren't feeling like our electricity bill was not worth it (I personally don't care, but many others might). Especially when we have enough smarts to determine there is a problematic server, even giving us a way to do this on our end would be a great advancement. Heck, the more complicated and challenging the better for some of us... IF it's possible.

I have seen other gentle suggestions like this go unanswered, as I've lurked around the last couple of weeks. Is it that it's an ignorant concept that we will learn better about wishing for, as we learn more of the details? Or, is it that no one knows whether it is possible? If the latter, then there is hope and we can play. ;)

Is that communication locked into place, once the job begins? If A assigned to me, I'm reporting to B, but B breaks, is there no possibility of a C that can accept the work and pass it on to B later to aggregate? Assuming B and C are similar in every important way (if so, what are those ways?... same projects, same arch job types, same version of F@H?) Is B the only machine who will ever accept the final data for this job? Or, can any similar server accept it, and it's just a matter of luck and there not yet being a mechanism in place to send it to C?
level6

Re: 128.252.203.10 problem or WU?

Post by level6 »

And, if our client knows where B is... then it must have saved that in a file, somewhere, right? Could a file be changed to replace B with C? That seems too simple to work. There are no plain strings of my collection server's IP address in these files. Is it maybe encoded in that client.db sqlite DB?
PantherX

Re: 128.252.203.10 problem or WU?

Post by PantherX »

level6 wrote:There will always be trouble, somewhere, though. We just need a better way of redirecting the work, so we aren't feeling like our electricity bill was not worth it (I personally don't care, but many others might). Especially when we have enough smarts to determine there is a problematic server, even giving us a way to do this on our end would be a great advancement. Heck, the more complicated and challenging the better for some of us... IF it's possible...
Work has happened over the last few weeks, where multiple WS (Work Servers) were spawned, and some included cloud services too. There is still more to come.
level6 wrote:...I have seen other gentle suggestions like this go unanswered, as I've lurked around the last couple of weeks. Is it that it's an ignorant concept that we will learn better about wishing for, as we learn more of the details? Or, is it that no one knows whether it is possible? If the latter, then there is hope and we can play. ;) ...
Sorry, I am not following you 100%. There has been engagement between the F@H Team and other parties across the Forum, Email, Twitter, Discord, etc. AFAIK, troubleshooting details were asked for and Donors responded. Troubleshooting issues in production can be challenging, especially when new features have been deployed and rolling forward is the only way.
level6 wrote:...Is that communication locked into place, once the job begins? If A assigned to me, I'm reporting to B, but B breaks, is there no possibility of a C that can accept the work and pass it on to B later to aggregate? Assuming B and C are similar in every important way (if so, what are those ways?... same projects, same arch job types, same version of F@H?) Is B the only machine who will ever accept the final data for this job? Or, can any similar server accept it, and it's just a matter of luck and there not yet being a mechanism in place to send it to C?
The downloaded WU can only be uploaded to the WS that it came from. Historically, there was the CS (Collection Server), which was optional and up to the researcher to configure or not. A CS only collects WUs. The reason only the originating WS can accept the WU is that WUs are sequential: once one is uploaded, the next one in the sequence is generated. Thus, WUs from a particular WS have to be returned to it. If a WU does go to the CS, the WS has to "pull" it back and then process it to generate the next one in the sequence. If you would like to know a bit more about this, please read this topic, as it provides an overview of the various servers at play and their roles: viewtopic.php?f=18&t=17794
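To make the sequencing argument concrete, here is a toy model of the behaviour described above. It is not F@H server code: the class and function names are invented for illustration, and real servers do much more (deadlines and reassignment, for instance, are ignored here). It only shows why a result parked on a CS does not unblock the next WU until the WS pulls it back.

```python
class WorkServer:
    """Toy WS: issues one generation at a time per (run, clone) trajectory."""

    def __init__(self):
        self.completed_gen = {}   # (run, clone) -> highest gen returned so far
        self.outstanding = set()  # trajectories with a WU currently assigned

    def assign(self, run, clone):
        """Return the next gen to fold, or None while the previous one is out."""
        key = (run, clone)
        if key in self.outstanding:
            return None
        gen = self.completed_gen.get(key, -1) + 1
        self.outstanding.add(key)
        return gen

    def accept(self, run, clone, gen):
        """Take a result; only the expected gen of an assigned WU is accepted."""
        key = (run, clone)
        expected = self.completed_gen.get(key, -1) + 1
        if key not in self.outstanding or gen != expected:
            return False
        self.completed_gen[key] = gen
        self.outstanding.discard(key)
        return True


class CollectionServer:
    """Toy CS: stores results but cannot generate the next WU."""

    def __init__(self):
        self.held = []

    def accept(self, run, clone, gen):
        self.held.append((run, clone, gen))
        return True


def pull(ws, cs):
    """Let the WS pull results back from the CS; returns how many it took."""
    pulled = 0
    remaining = []
    for result in cs.held:
        if ws.accept(*result):
            pulled += 1
        else:
            remaining.append(result)
    cs.held = remaining
    return pulled
```

In this model, a donor who uploads gen 0 to the CS has handed the result off, but `assign` for that run and clone keeps returning `None` until `pull` has run, which is the "WS has to pull it back" step from the post.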