Recent WUs are too large to upload

Moderators: Site Moderators, FAHC Science Team

Markus_Laker
Posts: 20
Joined: Sun Dec 01, 2019 11:36 am

Recent WUs are too large to upload

Post by Markus_Laker »

I have a reasonably powerful PC (12 Threadripper cores, 24 threads), but a pitifully slow Internet connection that I can't upgrade because of my location. A couple of recent WUs have produced 70MB or 90MB of results. I get most of the way through an upload, and then the server times out after an hour, forcing me to start the upload over and over again. It never succeeds, and so I can never upload the results.

Code:

01:12:43:WU00:FS00:Upload 88.91%
01:12:51:WU00:FS00:Upload 89.11%
01:12:57:WU00:FS00:Upload 89.25%
01:13:05:WU00:FS00:Upload 89.45%
01:13:06:WARNING:WU00:FS00:Exception: Failed to send results to work server: Transfer failed
01:13:07:WU00:Sending unit results: id:00 state:SEND error:NO_ERROR project:17219 run:1068 clone:1 gen:0 core:0xa7 unit:0x0000000100000000000043430000042c
01:13:07:WU00:Uploading 91.67MiB to 128.252.203.11
01:13:07:WU00:Connecting to 128.252.203.11:8080
01:13:14:WU00:Upload 0.20%
01:13:20:WU00:Upload 0.34%
I've repeatedly had to discard results manually, which wastes my machine's time and energy and delays the science. And the repeated attempts to upload results waste the small amount of upload bandwidth I do have -- which is badly needed during lockdown.

Is it possible to stop the collection server from timing out after an hour?

Failing that, is it possible to restrict my client so that it won't download WUs that will produce more than 10 or 20MB of results, so that I can actually upload them within the hour permitted to me?

Is it possible to avoid WUs that use Core A7? From reading around, I understand that that's the core that tends to produce large results.

One thing I'm trying now is to delete my one 20-thread CPU slot (yep, another big WU wastefully discarded) and replace it with two 10-thread slots in the hope that each slot will get smaller jobs. I get fewer PPD that way, and I know that F@H prefers to have one monster slot rather than several smaller ones, but I don't know what else to try.

Thanks for any ideas you can come up with,

Markus
ajm
Posts: 754
Joined: Sat Mar 21, 2020 5:22 am
Location: Lucerne, Switzerland

Re: Recent WUs are too large to upload

Post by ajm »

Maybe hotspotting?
And I heard that, in the past, with dial-up modems, there was a way to get only smaller WUs. You had to add an Expert option "max-packet-size" with value "SMALL". But I don't know if this is still in use.
gunnarre
Posts: 567
Joined: Sun May 24, 2020 7:23 pm
Location: Norway

Re: Recent WUs are too large to upload

Post by gunnarre »

There has been an advanced option called "max-packet-size" which could be set to "small", "normal" or "big", but I don't think this option is supported anymore in the newer cores. Might be worth trying?

You can blacklist particular servers by IP address if you want to avoid the servers running A7 core projects. You can do this in your local machine's firewall or your router. Search for "GRO_A7" here
https://apps.foldingathome.org/serverstats
and blacklist the IP addresses in question. A few servers (one at time of writing) have both A8 and A7 projects on them. You probably have to check back on that page periodically to make sure you're not blacklisting GPU or A8 projects in future.
Online: GTX 1660 Super, GTX 1080, GTX 1050 Ti 4G OC, RX580 + occasional CPU folding in the cold.
Offline: Radeon HD 7770, GTX 960, GTX 950
Joe_H
Site Admin
Posts: 7856
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Recent WUs are too large to upload

Post by Joe_H »

The option max-packet-size is still supported; its default value of normal is for return file sizes of up to 25 MB. The main issue we have been running into is researchers who have not set up projects correctly on the servers to require the "big" parameter when a WU's results are that large on return.

The issue is not whether the project uses Core_A7, _A8 or the GPU Core_22, but how big the resulting data file becomes after the processing is completed. I will post a reminder to the person running this project.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
gunnarre
Posts: 567
Joined: Sun May 24, 2020 7:23 pm
Location: Norway

Re: Recent WUs are too large to upload

Post by gunnarre »

So no user intervention needed once that has been fixed - thanks.
Markus_Laker
Posts: 20
Joined: Sun Dec 01, 2019 11:36 am

Re: Recent WUs are too large to upload

Post by Markus_Laker »

25MB in one hour would be achievable, even when there's a video call going on elsewhere in the house. Many thanks for your help, everyone. I appreciate it!
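
As a rough check on the figures in this thread, here is a minimal Python sketch of the average upload rate needed to finish inside the one-hour window reported above. It assumes that "25 MB" means decimal megabytes and that protocol overhead can be ignored; the function name is just for illustration.

```python
def required_kbit_per_s(size_bytes, window_s=3600):
    """Average upload rate (kbit/s) needed to send size_bytes within window_s."""
    return size_bytes * 8 / window_s / 1000

MB = 1000 * 1000    # assumption: "25 MB" means decimal megabytes
MIB = 1024 * 1024   # the client log reports sizes in MiB

normal_limit = required_kbit_per_s(25 * MB)
large_wu = required_kbit_per_s(91.67 * MIB)

print(f"25 MB in an hour needs about {normal_limit:.1f} kbit/s")
print(f"91.67 MiB in an hour needs about {large_wu:.1f} kbit/s")
```

That works out to roughly 56 kbit/s for a 25 MB result and roughly 214 kbit/s for the 91.67MiB one, which is why the smaller limit is comfortable on a slow line and the larger uploads are not.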
Joe_H
Site Admin
Posts: 7856
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Recent WUs are too large to upload

Post by Joe_H »

The setting should be in place now to prevent this project from being assigned to clients with the default setting. For those with faster connections, setting 'max-packet-size' to 'big' should still get you these COVID related WUs.

I checked my logs; the last time I got WUs from this project, 17219, was a week and a half ago as a beta tester. I did not have any problems uploading, though it did take about 20 minutes over my DSL connection. That is something I try to schedule for the early AM when I am asleep as much as possible.
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Recent WUs are too large to upload

Post by bruce »

If you're folding with your CPU (FahCore A7 or A8) there's another option.

(I'm assuming you've changed the POWER setting to FULL.) There is a wide variety of proteins being folded, with a wide range of atom counts.

See https://apps.foldingathome.org/psummary

There are several different reasons for the size of the upload package, but one of them is the simple number of atoms. Setting preferences that give you smaller proteins would be a place to start.

One thing to try would be to divide up your CPU into several independent "slots". I would not recommend ever using a slot with a single CPU, but >=2 might work. Starting from your 12 Threadripper cores (24 threads), you might try 3 or 4 slots of 6 cores each, perhaps leaving a few unused. Slots with fewer CPUs tend to be assigned smaller proteins, and those tend to end up with smaller upload packages.

This is probably not the best way to maximize your PPD, but you need to experiment and see what works best for you.
Markus_Laker
Posts: 20
Joined: Sun Dec 01, 2019 11:36 am

Re: Recent WUs are too large to upload

Post by Markus_Laker »

Thanks, Bruce.

Unfortunately, the problem has recurred with two slots of ten threads each:

Code:

10:03:40:WU04:FS01:Sending unit results: id:04 state:SEND error:NO_ERROR project:17219 run:2652 clone:0 gen:2 core:0xa7 unit:0x00000000000000020000434300000a5c
10:03:40:WU04:FS01:Uploading 83.54MiB to 128.252.203.11

There's no way my puny ADSL connection can upload 83MiB in an hour. I'm going to have to discard that work unit just to unclog my system. Joe_H, could you have another word with the person running this project, please?

Meanwhile, I'll reconfigure my slots again. I'll see what happens with five slots of four threads each.
Neil-B
Posts: 2027
Joined: Sun Mar 22, 2020 5:52 pm
Hardware configuration: 1: 2x Xeon E5-2697v3@2.60GHz, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent 20H2, Quadro K420 1GB, FAH 7.6.21
2: Xeon E3-1505Mv5@2.80GHz, 32GB DDR4, NVME, Win10 Pro 20H2, Quadro M1000M 2GB, FAH 7.6.21 (actually have two of these)
3: i7-960@3.20GHz, 12GB DDR3, SSD, Win10 Pro 20H2, GTX 750Ti 2GB, GTX 1080Ti 11GB, FAH 7.6.21
Location: UK

Re: Recent WUs are too large to upload

Post by Neil-B »

You need to test it of course but there is a danger that having more hopefully smaller WUs running/completing and needing uploading may actually end up needing a greater total upload over time than the single big ones ... You might actually need just to run one or two small slots and not fully utilise your folding power.

It might be quicker, and mean fewer dumped WUs, if you start with, say, a single 4-core slot, let that run, and see if your network capacity can handle it before trying multiple larger ones.
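
To make the trade-off concrete, here is a small Python sketch that compares the average upload load of two slot layouts. Every number in it is hypothetical (result sizes and hours per WU are invented for illustration, not measured on any real project), so treat it as a way to plug in your own observations rather than as a prediction.

```python
def average_upload_kib_s(slots):
    """Average upload rate (KiB/s) needed to keep up with a set of slots.

    slots: list of (result_size_mib, hours_per_wu) tuples, one per slot.
    """
    return sum(size_mib * 1024 / (hours * 3600) for size_mib, hours in slots)

# Hypothetical layouts: replace these with sizes and run times from your own log.
one_big_slot = [(80.0, 8.0)]            # one large WU every 8 hours
five_small_slots = [(20.0, 1.5)] * 5    # five slots, one small WU each every 1.5 hours

print(f"one big slot:     {average_upload_kib_s(one_big_slot):.2f} KiB/s")
print(f"five small slots: {average_upload_kib_s(five_small_slots):.2f} KiB/s")
```

With these made-up inputs the five small slots need several times more sustained upload bandwidth than the single big one, even though each individual upload is smaller and more likely to finish before a timeout.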
2x Xeon E5-2697v3, 512GB DDR4 LRDIMM, SSD Raid, W10-Ent, Quadro K420
Xeon E3-1505Mv5, 32GB DDR4, NVME, W10-Pro, Quadro M1000M
i7-960, 12GB DDR3, SSD, W10-Pro, GTX1080Ti
i9-10850K, 64GB DDR4, NVME, W11-Pro, RTX3070

(Green/Bold = Active)
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Recent WUs are too large to upload

Post by bruce »

Were those uploads "clean", or did they contain a multitude of error reports?

If the former, I presume that all completed WUs are over 80 MB and a change to the WU is called for.
If the latter, what can we do to identify why you're getting so many errors?
If neither of the above, perhaps it's because the slot is set to run on idle and there were lots of cycles of Pause/Resume, which also cause the log files to grow.
Markus_Laker
Posts: 20
Joined: Sun Dec 01, 2019 11:36 am

Re: Recent WUs are too large to upload

Post by Markus_Laker »

And here comes a third one from the same project that I'll need to discard:

Code:

15:50:16:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:17219 run:2899 clone:1 gen:2 core:0xa7 unit:0x00000001000000020000434300000b53
15:50:16:WU02:FS01:Uploading 83.54MiB to 128.252.203.11

Neil-B, that's an interesting point, but this machine's a couple of years old now. I've been folding more or less flat-out since I bought it, and my Internet connection seems to cope in general, until these 80+MiB uploads turn up.

Bruce, I'm not absolutely sure I understand your question. Large uploads always fail, but it's always because the upload server times out after an hour:

Code:

16:49:56:WU02:FS01:Upload 97.63%
16:50:02:WU02:FS01:Upload 97.78%
16:50:10:WU02:FS01:Upload 98.00%
16:50:16:WU02:FS01:Upload 98.15%
16:50:18:WARNING:WU02:FS01:Exception: Failed to send results to work server: Transfer failed
16:50:18:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:17219 run:2899 clone:1 gen:2 core:0xa7 unit:0x00000001000000020000434300000b53
16:50:18:WU02:FS01:Uploading 83.54MiB to 128.252.203.11
16:50:18:WU02:FS01:Connecting to 128.252.203.11:8080
16:50:25:WU02:FS01:Upload 0.22%
16:50:31:WU02:FS01:Upload 0.37%

How frustrating is that?
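
The log lines quoted in this post are enough to estimate the effective upload rate. The Python sketch below assumes that the attempt which started at 15:50:16 is the same one that failed at 16:50:18 after reaching 98.15%, and that both timestamps fall on the same day; the function name is illustrative.

```python
from datetime import datetime

def effective_rate_kib_s(start, end, size_mib, percent_done):
    """Average rate (KiB/s) for an upload that reached percent_done of size_mib.

    start and end are HH:MM:SS strings from the client log.
    Assumes both fall on the same day (no midnight wrap).
    """
    fmt = "%H:%M:%S"
    elapsed_s = (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()
    sent_kib = size_mib * 1024 * percent_done / 100
    return sent_kib / elapsed_s

rate = effective_rate_kib_s("15:50:16", "16:50:18", 83.54, 98.15)
max_mib_in_one_hour = rate * 3600 / 1024

print(f"effective rate: {rate:.1f} KiB/s")
print(f"largest upload that fits in one hour: {max_mib_in_one_hour:.1f} MiB")
```

Under those assumptions the line manages a little over 23 KiB/s, so anything much beyond about 82 MiB cannot finish inside an hour, while the 25 MB limit mentioned earlier in the thread would leave plenty of headroom.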

It'd be nice if the timeout mechanism could be a bit cleverer and not close the connection if the upload was obviously still making progress.
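
For illustration only, here is a minimal Python sketch of the kind of progress-based timeout suggested above: the transfer is aborted only when no new bytes have arrived for a while, rather than after a fixed total duration. This is a hypothetical design, not a description of how the Folding@home servers actually work, and the class and parameter names are invented.

```python
class StallTimeout:
    """Abort a transfer only if it has made no progress for idle_limit_s seconds."""

    def __init__(self, idle_limit_s, start_time):
        self.idle_limit_s = idle_limit_s
        self.last_progress_time = start_time
        self.last_bytes = 0

    def update(self, now, total_bytes_received):
        """Record the latest byte count; return True if the transfer should be aborted."""
        if total_bytes_received > self.last_bytes:
            # New data arrived, so the idle clock restarts.
            self.last_bytes = total_bytes_received
            self.last_progress_time = now
        return (now - self.last_progress_time) > self.idle_limit_s


timer = StallTimeout(idle_limit_s=60, start_time=0)
print(timer.update(now=3700, total_bytes_received=200))  # still progressing after an hour
print(timer.update(now=3761, total_bytes_received=200))  # stalled for 61 seconds
```

The obvious cost of this design is that a very slow client can hold a connection open indefinitely, so a real server might pair it with a minimum acceptable rate or a much longer overall cap.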

I've not interrupted folding at all: the machine has been folding flat-out for the last 20 hours, without interruption, and that's enough time for several WUs per slot, even with only four threads per slot. Plenty of other, smaller uploads have succeeded in that time.

Now, we know that Joe_H asked the project owner to set the "big" parameter on these WUs. I guess it's possible that the project owner has done so, but that the change didn't take effect on WUs that were already queued up, and we're still working through those. Or it's possible that the project owner hasn't quite got round to making the necessary change yet, which would be annoying.
Joe_H
Site Admin
Posts: 7856
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Recent WUs are too large to upload

Post by Joe_H »

I have asked the researcher to check up on this; the setting was supposed to have gone in. Possibly the setting did not take, or it was removed accidentally.
Neil-B
Posts: 2027
Joined: Sun Mar 22, 2020 5:52 pm
Hardware configuration: 1: 2x Xeon E5-2697v3@2.60GHz, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent 20H2, Quadro K420 1GB, FAH 7.6.21
2: Xeon E3-1505Mv5@2.80GHz, 32GB DDR4, NVME, Win10 Pro 20H2, Quadro M1000M 2GB, FAH 7.6.21 (actually have two of these)
3: i7-960@3.20GHz, 12GB DDR3, SSD, Win10 Pro 20H2, GTX 750Ti 2GB, GTX 1080Ti 11GB, FAH 7.6.21
Location: UK

Re: Recent WUs are too large to upload

Post by Neil-B »

I think what bruce was getting at wasn't the upload failing but questioning if during the course of processing these larger upload wus there were any errors that the client/core managed ... When an error occurs mid wu and the core can correct it then the error reports can significantly inflate the size of the final package ... Pausing a wu many times can sometimes have the same effect.

If it is always just this one project (which to be honest has a very large atom count and a very large base credit which means it may well just be naturally big) then it is really waiting for Joe_H's message to hopefully get the project flagged - and yes it may be that there are some pre flagged wus around ... Making the changes may well not be a simple change (it depends on the server admin processes/procedures and availability of researcher) but someone will get to it as soon as they can (if they can).
Neil-B
Posts: 2027
Joined: Sun Mar 22, 2020 5:52 pm
Hardware configuration: 1: 2x Xeon E5-2697v3@2.60GHz, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent 20H2, Quadro K420 1GB, FAH 7.6.21
2: Xeon E3-1505Mv5@2.80GHz, 32GB DDR4, NVME, Win10 Pro 20H2, Quadro M1000M 2GB, FAH 7.6.21 (actually have two of these)
3: i7-960@3.20GHz, 12GB DDR3, SSD, Win10 Pro 20H2, GTX 750Ti 2GB, GTX 1080Ti 11GB, FAH 7.6.21
Location: UK

Re: Recent WUs are too large to upload

Post by Neil-B »

I'd check what size my uploads for that project are, but my server is down for a rebuild at the moment, so I haven't got access to those logs ... If someone else spots this and is running 17219 WUs, then perhaps they could confirm the upload sizes they are seeing? :)
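
For anyone who wants to check their own log, here is a minimal Python sketch that collects the upload sizes reported per project. It assumes the log lines look like the ones quoted in this thread ("Sending unit results: ... project:NNNNN" followed by "Uploading X.XXMiB to ..." for the same WU); the regular expressions and function name are illustrative, and other client versions may format these lines differently.

```python
import re
from collections import defaultdict

# Patterns modelled on the log lines quoted in this thread; the FSnn field is optional.
SEND_RE = re.compile(r"^\d\d:\d\d:\d\d:(WU\d+):(?:FS\d+:)?Sending unit results:.*\bproject:(\d+)\b")
UPLOAD_RE = re.compile(r"^\d\d:\d\d:\d\d:(WU\d+):(?:FS\d+:)?Uploading ([\d.]+)MiB to ")

def upload_sizes_by_project(lines):
    """Return {project: sorted list of distinct upload sizes in MiB}."""
    project_for_wu = {}          # WU id -> project from the latest "Sending" line
    sizes = defaultdict(set)     # a set, so retries of the same upload count once
    for line in lines:
        sent = SEND_RE.match(line)
        if sent:
            project_for_wu[sent.group(1)] = int(sent.group(2))
            continue
        upload = UPLOAD_RE.match(line)
        if upload and upload.group(1) in project_for_wu:
            sizes[project_for_wu[upload.group(1)]].add(float(upload.group(2)))
    return {project: sorted(found) for project, found in sizes.items()}

# Lines copied from the posts above.
sample = [
    "01:13:07:WU00:Sending unit results: id:00 state:SEND error:NO_ERROR project:17219 run:1068 clone:1 gen:0 core:0xa7 unit:0x0000000100000000000043430000042c",
    "01:13:07:WU00:Uploading 91.67MiB to 128.252.203.11",
    "10:03:40:WU04:FS01:Sending unit results: id:04 state:SEND error:NO_ERROR project:17219 run:2652 clone:0 gen:2 core:0xa7 unit:0x00000000000000020000434300000a5c",
    "10:03:40:WU04:FS01:Uploading 83.54MiB to 128.252.203.11",
]

print(upload_sizes_by_project(sample))
```

To run it against a real log, pass the open file instead of `sample`, for example `upload_sizes_by_project(open("log.txt"))`, where the file name is whatever your client uses.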