Folding@Home RAM usage grows until exceeded, then PC crashes

bollix47
Posts: 2942
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: Folding@Home RAM usage grows until exceeded, then PC cra

Post by bollix47 »

I too would like to extend my thanks for updating us on the outcome.
I was concerned that a 2-core Celeron might not be strong enough to drive 8 GPUs, but I don't have a lot of experience with AMD GPUs or Celeron CPUs. Expanding your memory should help, but don't be too surprised if the problem doesn't completely resolve itself. If you had been using Nvidia I would have said there is "no way" that CPU will drive all those GPUs. If you still have problems I would set one GPU to finish, check whether the computer is usable, and repeat until you get to a number of GPUs that works well.

fyi .. One RTX 2060 running on linux will deliver ~1M PPD and that's on your existing system without the 460s or the memory upgrade ... two might work if it was dedicated to folding ... just something to think about.
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Folding@Home RAM usage grows until exceeded, then PC cra

Post by bruce »

MeeLee wrote:...As far as swap is concerned, on 18.10 I run without a swap, but on 18.04 my swap is 2-3GB in size.

If you have 4GB of RAM in Linux (or more), aside from a power loss, or running tens of programs at once, there is no reason for a swap file.
A swap file doesn't use power. As long as you have disk space, it costs nothing to have one ... and if FAH is as memory hungry as it appears to be, which would you rather have, crash your PC or write some info to the swap file? The swap file won't be used unless you need it.
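
For anyone who wants to see whether the swap file is actually being touched on Linux, here is a minimal Python sketch. It assumes the usual /proc/meminfo layout (the SwapTotal, SwapFree and MemAvailable fields, all in kB); the function names are just made up for illustration.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB for most fields)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def swap_used_kb(meminfo):
    """Swap currently in use, computed as SwapTotal - SwapFree."""
    return meminfo.get("SwapTotal", 0) - meminfo.get("SwapFree", 0)


if __name__ == "__main__":
    path = "/proc/meminfo"  # Linux only; skipped quietly elsewhere
    if os.path.exists(path):
        with open(path) as fh:
            info = parse_meminfo(fh.read())
        print(f"swap used:     {swap_used_kb(info) / 1024:.0f} MiB")
        print(f"swap total:    {info.get('SwapTotal', 0) / 1024:.0f} MiB")
        print(f"RAM available: {info.get('MemAvailable', 0) / 1024:.0f} MiB")
```

If swap used stays at zero while FAH is folding, the swap file is costing nothing but disk space.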
MeeLee
Posts: 1375
Joined: Tue Feb 19, 2019 10:16 pm

Re: Folding@Home RAM usage grows until exceeded, then PC cra

Post by MeeLee »

If you really want to run all those GPUs, and if you're currently running a Celeron G processor,
I think you'll benefit greatly from at least going quad-core. 8 GPUs might need more like a 2 GHz quad-core CPU.
They're going for a good price on the second hand market.

Can't help but think that the Celeron is mightily overwhelmed, and that this might be the reason for the high RAM usage (together with the AMD drivers using more RAM).
With a 3 GHz dual-core Celeron, I think about 6 older AMD GPUs are really the max.
Dark_Vera
Posts: 12
Joined: Mon Oct 28, 2019 12:31 pm

Re: Folding@Home RAM usage grows until exceeded, then PC cra

Post by Dark_Vera »

bollix47 wrote:I too would like to extend my thanks for updating us on the outcome.
I was concerned that a 2-core Celeron might not be strong enough to drive 8 GPUs, but I don't have a lot of experience with AMD GPUs or Celeron CPUs. Expanding your memory should help, but don't be too surprised if the problem doesn't completely resolve itself. If you had been using Nvidia I would have said there is "no way" that CPU will drive all those GPUs. If you still have problems I would set one GPU to finish, check whether the computer is usable, and repeat until you get to a number of GPUs that works well.
You might have a point. Allow me to update everyone with what I observed recently.

So I did indeed upgrade my folding rig with a second Samsung 8GB module, bringing my total RAM up to 16GB (from 12) running at 1600 MHz. Unfortunately, while Folding@Home sometimes stays below the 15.6 GB limit, it occasionally climbs to a sustained 17-18GB of RAM usage when running at full tilt. Swapping then begins, and when it does, the swapping appears to push my Celeron to 98%-100% usage again, bringing the system to a crawl and eventually freezing or hard crashing it.

Sadly, my motherboard was originally designed for mining (it's a Yanyu K37 mining board) and the board cannot support more than 16GB of DDR3L RAM, so bumping it up further is impossible.

I would like to note that my Celeron is able to keep up like a champ at 30-40% CPU usage while all 8 RX460s are folding - CPU usage only spikes when RAM usage exceeds 15.6GB and my swap file is activated.

On a positive note, with 16GB of "hard" memory, F@H's RAM usage takes significantly longer to hit the limit of my system - so the end result is, I can rip all 8 AMD GPUs at full blast, then set F@H to "FINISH," and within a day, I can crunch through 8 folding projects BEFORE I hit the RAM limit, thus avoiding hanging/freezes/crashes successfully.

An oddity I noticed though, was that even after setting F@H to "finish" (and all 8 GPUs finish their respective tasks, then go to Idle), F@H RAM usage is still 14-15GB - if WU are being crunched and sent successfully, why is F@H still hogging my system RAM?

To address the above, every 1-2 days I have to reset the PC, then start F@H again on a fresh batch of WUs. Restarting the PC "clears" the ghost memory usage somehow and allows me to fold in peace again.

bollix47 wrote:fyi .. One RTX 2060 running on linux will deliver ~1M PPD and that's on your existing system without the 460s or the memory upgrade ... two might work if it was dedicated to folding ... just something to think about.
I have been strongly considering this lately. Running just 2-3 ultra powerful GPUs will be less of a headache for me to maintain and more space and heat efficient (though winter is here and I am enjoying the heat that 8 AMD RX460s give off in my tiny home). Currently I am trying to sell my RX460s to gamers, then use the proceeds to buy a new EVGA water-cooled Nvidia 2080. I will put this GPU in my HTPC, then use the HTPC as a gaming console and also a winter heat source/aggressive folding rig. The 2080 alone will blow the 8 RX460s out of the water and will need far fewer resources to run, plus it can game like a beast. What do you think?
bollix47
Posts: 2942
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: Folding@Home RAM usage grows until exceeded, then PC cra

Post by bollix47 »

Running just 2-3 ultra powerful GPUs ...
Please remember that nvidia GPUs require one CPU core each to support them. So 3 would not work well on a 2 core CPU.
Other than that I think your plans are a win-win for all concerned. :wink:
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Folding@Home RAM usage grows until exceeded, then PC cra

Post by bruce »

A lot depends on whether you run Linux or Windows. My main machine can boot either Windows or Linux (I'm not at home today, so I can't verify every detail). It has a GTX960 and a GTX750Ti plus an 8-way CPU. I commonly run 3 slots (including 6 virtual CPU cores dedicated to FAH), plus often a browser. Those 3 slots generally push me into the paging range. If I stop one slot, there's no paging.

The monitor runs off of the GTX750Ti. In Windows, the screen lags appreciably, but works fine if I pause the WU that's running on the same GPU as the desktop. If I pause the CPU slot or the other GPU slot, there's still a lag, so it's not paging that's causing the screen lag, it's the limitation of sharing the GTX750Ti. If I pause that slot, the browser works fine.

If I switch Windows to CPU rendering, it doesn't help, which surprises me. I guess that's a paging issue. If I could add a 3rd (slow) GPU and dedicate it to the Windows Desktop, it might work fine. (M/B has no more slots except the 1x PCI and I haven't figured out how to use that yet.)

On Linux, I don't notice the same limitations. Then, too, Linux gets better PPD.
foldy
Posts: 2061
Joined: Sat Dec 01, 2012 3:43 pm
Hardware configuration: Folding@Home Client 7.6.13 (1 GPU slots)
Windows 7 64bit
Intel Core i5 2500k@4Ghz
Nvidia gtx 1080ti driver 441

Re: Folding@Home RAM usage grows until exceeded, then PC cra

Post by foldy »

I don't know if this is really possible, but FAH uses little video RAM on the GPU, so maybe the unused VRAM could serve as faster swap.
https://wiki.archlinux.org/index.php/Swap_on_video_RAM
MeeLee
Posts: 1375
Joined: Tue Feb 19, 2019 10:16 pm

Re: Folding@Home RAM usage grows until exceeded, then PC cra

Post by MeeLee »

When I set the client to 'full' and usage is higher than usual, my desktop also feels sluggish.
I think it's because FAH uses up 95-98% of the GPU's resources, with only a single buffer resource left for displaying the desktop.
Most motherboards only allow booting from the IGP or from the first or second full-size slot. A monitor plugged into a 1x slot (usually slot 3 to 8) might not be recognized as the main monitor, and you might just end up with a functioning PC but a blank screen.
Then again, if you could run a monitor from a PCIe 1x slot, you might as well use that slot for folding too, and you would have to pause it whenever you want to use your PC.

@Foldy: I think it takes more time to load the data from VRAM back into regular RAM, have it processed by the CPU, and send it back into VRAM each time (plus PCIe transactions are the main cause of latency and lower PPD on faster cards).
Also I had to increase my RAM to 8GB, which is the minimum nowadays anyway for most systems.
foldy
Posts: 2061
Joined: Sat Dec 01, 2012 3:43 pm
Hardware configuration: Folding@Home Client 7.6.13 (1 GPU slots)
Windows 7 64bit
Intel Core i5 2500k@4Ghz
Nvidia gtx 1080ti driver 441

Re: Folding@Home RAM usage grows until exceeded, then PC cra

Post by foldy »

I meant VRAM speed is faster than old SSD speed for swapping if unavoidable.
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Folding@Home RAM usage grows until exceeded, then PC cra

Post by bruce »

Fastest: RAM.
Essentially the same: VRAM that doesn't need to swap
Slow: VRAM that does have to swap.

e.g.- allocate space on disk for VRAM. Allocate some memory space for a program that is loaded but idle. Allocate memory space for program(s) that are active, but which would fit in real RAM if you hadn't used some for an inactive program. The inactive program will be moved to disk and the active programs will run at the same speed they would have run had you NOT started the inactive program.

Try the same test without allocating disk for VRAM and Windows will crash.
Dark_Vera
Posts: 12
Joined: Mon Oct 28, 2019 12:31 pm

Re: Folding@Home RAM usage grows until exceeded, then PC cra

Post by Dark_Vera »

bruce wrote: A lot depends on whether you run Linux or Windows. My main machine can boot either Windows or Linux (I'm not at home today, so I can't verify every detail). It has a GTX960 and a GTX750Ti plus an 8-way CPU. I commonly run 3 slots (including 6 virtual CPU cores dedicated to FAH), plus often a browser. Those 3 slots generally push me into the paging range. If I stop one slot, there's no paging.

The monitor runs off of the GTX750Ti. In Windows, the screen lags appreciably, but works fine if I pause the WU that's running on the same GPU as the desktop. If I pause the CPU slot or the other GPU slot, there's still a lag, so it's not paging that's causing the screen lag, it's the limitation of sharing the GTX750Ti. If I pause that slot, the browser works fine.

If I switch Windows to CPU rendering, it doesn't help, which surprises me. I guess that's a paging issue. If I could add a 3rd (slow) GPU and dedicate it to the Windows Desktop, it might work fine. (M/B has no more slots except the 1x PCI and I haven't figured out how to use that yet.)

On Linux, I don't notice the same limitations. Then, too, Linux gets better PPD.
I've learned from researching other forums and threads that F@H (and certain BOINC apps) handle CPU resource allocation differently in Linux than under Windows. The main observation has been that under Windows F@H REQUIRES 1 CPU core per GPU folding slot, yet under Linux a GPU folding slot requires only a fraction of a core (as also shown by my rig discussed earlier in this thread, until paging begins).

Additionally, GPU BOINC apps and Folding have far higher performance under Linux than in Windows, which is even more apparent when the GPUs are bottlenecked with PCIe X1 slots (versus the recommended X4, X8 or X16 speeds). My GPUs, even when running one slot at a time under Windows, would choke, as I'm running a K37 mining board with all slots restricted to PCIe X1 speeds. Under Linux, I get roughly 2.5 times the PPD per GPU.

Overall, Linux somehow squeezes superior performance out of GPUs throttled at the PCIe lane level while also using significantly less CPU when running multiple GPUs - a proven fact seen throughout this thread and also from dozens of other discussions online.
Dark_Vera
Posts: 12
Joined: Mon Oct 28, 2019 12:31 pm

Re: Folding@Home RAM usage grows until exceeded, then PC cra

Post by Dark_Vera »

An oddity I noticed though, was that even after setting F@H to "finish" (and all 8 GPUs finish their respective tasks, then go to Idle), F@H RAM usage is still 14-15GB - if WU are being crunched and sent successfully, why is F@H still hogging my system RAM?

To address the above, every 1-2 days I have to reset the PC, then start F@H again on a fresh batch of WUs. Restarting the PC "clears" the ghost memory usage somehow and allows me to fold in peace again.
Folks, I'm wondering if anyone can comment on this phenomenon I'm seeing here. It's not ending my Folding operations but it's still bizarre and forces a system reboot every day in order to keep things running.

Why is Folding@Home still hogging massive amounts of system RAM even after it finishes its work units and F@H Control goes idle?
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Folding@Home RAM usage grows until exceeded, then PC cra

Post by bruce »

Each slot runs a single WU, using a single FAHCore -- presently either FAHCore_21 or FAHCore_a7. When those FAHCores finish, they are removed from RAM. As far as other programs are concerned, FAH does not page_out memory unless it's needed by something else. I think "hogging" is the wrong word to use.

On Windows (using the Performance Tab of Task Manager) I can see allocated memory change when a FAHCore starts or finishes. FAHCore_A7 will use 100% of your CPUs if you let it, but since it runs at a very low priority, it doesn't interfere with foreground activities. Each FAHCore_21 does use one CPU, but that's the way the Windows drivers are constructed by AMD/NV. The Linux FAHCores+Drivers are decidedly different.

FAHClient runs all the time, using very little CPU resources because it's responsible for uploading results and downloading new WUs and then it goes idle (and may be paged out if something else needs the RAM) until it's needed again. FAHControl and FAHViewer should only be used when you feel you need them.

Windows also allocates some RAM for a disk cache, and it doesn't go away -- its allocation becomes part of the kernel.
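
On Linux, a rough way to check whether the RAM is really held by FAH processes is to add up the resident memory (VmRSS) of every process whose name starts with "fah". The sketch below is only an illustration: the prefix match is an assumption about how the client and core processes are named on your system, and the function names are invented for this example.

```python
import os


def rss_kb_from_status(status_text):
    """Return (process name, VmRSS in kB) parsed from /proc/<pid>/status text."""
    name, rss = "", 0
    for line in status_text.splitlines():
        if line.startswith("Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("VmRSS:"):
            rss = int(line.split()[1])  # second field is the value in kB
    return name, rss


def fah_rss_by_name(proc_root="/proc", prefix="fah"):
    """Sum VmRSS per process name for names starting with `prefix` (case-insensitive)."""
    totals = {}
    if not os.path.isdir(proc_root):
        return totals  # not Linux, or /proc is not mounted
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                name, rss = rss_kb_from_status(fh.read())
        except OSError:
            continue  # process exited or is not readable
        if name.lower().startswith(prefix):
            totals[name] = totals.get(name, 0) + rss
    return totals


if __name__ == "__main__":
    for name, kb in sorted(fah_rss_by_name().items()):
        print(f"{name}: {kb / 1024:.0f} MiB resident")
```

If the totals come out small while the system still reports 14-15GB in use, the memory is being held somewhere other than the FAH processes themselves (a driver or the kernel, for example).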
MeeLee
Posts: 1375
Joined: Tue Feb 19, 2019 10:16 pm

Re: Folding@Home RAM usage grows until exceeded, then PC cra

Post by MeeLee »

Dark_Vera wrote:
bruce wrote: A lot depends on whether you run Linux or Windows. My main machine can boot either Windows or Linux (I'm not at home today, so I can't verify every detail). It has a GTX960 and a GTX750Ti plus an 8-way CPU. I commonly run 3 slots (including 6 virtual CPU cores dedicated to FAH), plus often a browser. Those 3 slots generally push me into the paging range. If I stop one slot, there's no paging.

The monitor runs off of the GTX750Ti. In Windows, the screen lags appreciably, but works fine if I pause the WU that's running on the same GPU as the desktop. If I pause the CPU slot or the other GPU slot, there's still a lag, so it's not paging that's causing the screen lag, it's the limitation of sharing the GTX750Ti. If I pause that slot, the browser works fine.

If I switch Windows to CPU rendering, it doesn't help, which surprises me. I guess that's a paging issue. If I could add a 3rd (slow) GPU and dedicate it to the Windows Desktop, it might work fine. (M/B has no more slots except the 1x PCI and I haven't figured out how to use that yet.)

On Linux, I don't notice the same limitations. Then, too, Linux gets better PPD.
I've learned from researching other forums and threads that F@H (and certain BOINC apps) handle CPU resource allocation differently in Linux than under Windows. The main observation has been that under Windows F@H REQUIRES 1 CPU core per GPU folding slot, yet under Linux a GPU folding slot requires only a fraction of a core (as also shown by my rig discussed earlier in this thread, until paging begins).

Additionally, GPU BOINC apps and Folding have far higher performance under Linux than in Windows, which is even more apparent when the GPUs are bottlenecked with PCIe X1 slots (versus the recommended X4, X8 or X16 speeds). My GPUs, even when running one slot at a time under Windows, would choke, as I'm running a K37 mining board with all slots restricted to PCIe X1 speeds. Under Linux, I get roughly 2.5 times the PPD per GPU.

Overall, Linux somehow squeezes superior performance out of GPUs throttled at the PCIe lane level while also using significantly less CPU when running multiple GPUs - a proven fact seen throughout this thread and also from dozens of other discussions online.
I'd like to correct that.
For full-speed results, both Windows and Linux require one core per GPU if the GPU is Nvidia.
Nvidia and AMD GPUs (should) put about the same load on the CPU when the GPUs are similar in performance.
However, with AMD GPUs the CPU will show only that actual load, while the Nvidia drivers fill the CPU's passive time with idle data.
That being said, if you have a 4 GHz CPU, you could easily share a CPU core between 2x RTX 2060 or 2070 GPUs, since the CPU time spent on idle data can easily be allocated to the second GPU, just like on AMD.
The difference is that each GPU then runs a bit slower, just like AMD with the AMD drivers. The idle data the Nvidia drivers send over the CPU actually helps the GPU reach higher performance.
So if you have more GPUs than CPU cores, and your CPU is fast enough, you can split one CPU core between 2 GPUs (or 3 GPUs on 1 CPU core that supports hyperthreading).
The '1 GPU per CPU core' rule is true for Nvidia drivers, and is true for both Windows and Linux.
Dark_Vera
Posts: 12
Joined: Mon Oct 28, 2019 12:31 pm

Re: Folding@Home RAM usage grows until exceeded, then PC cra

Post by Dark_Vera »

Gentlemen,

I hate to bump this again but I have some interesting and positive developments regarding my original issue, which may help others coming across this problem in the future.

I believe I have finally permanently solved my growing RAM usage and RAM "hogging" problem under Manjaro LXDE Linux (ArchLinux).

As of this writing, all 8 of my GPUs are roaring at full speed (though sadly bottlenecked at the PCIe level by my X1 slot speeds) providing me a steady 780K PPD. My CPU usage hasn't exceeded 32%, which is impressive given that I'm running an old Kaby Lake Celeron, and my RAM usage has not exceeded 3.62 GB, even with all 8 slots filled and running FahCore21. The rig's folding, RAM usage, and CPU usage have been stable for the past 8 hours and no swap paging has occurred.

Here is what I did today that may have provided the above solution, by accident:
1) I physically removed and reinstalled some of my GPUs due to an unrelated maintenance issue
2) I reset my CMOS battery and thus reset the BIOS to default settings due to an unrelated issue
3) Upon booting into my OS, I noticed that Manjaro had a slew of package updates waiting for me (including for my AUR packages, which may have included the ArchLinux OpenCL add-on for the AMD Open Source driver (AMDGPU)). I installed them all.
4) Folding@Home would not run due to a GPU index error; in order to resolve, I completely removed the AUR Folding@Home package and reinstalled it from scratch
5) I refreshed the GPUs.txt file with the latest list from Stanford
6) Folding@Home's new install did not detect my GPUs automatically, so I deleted the default CPU slot, and reassigned each GPU by hand, assigning each GPU slot #s from 0-7
7) Some of the FAH slots didn't fold, so I went back and manually set the OpenCL index values of each GPU slot to match their GPU slot # (0, 1, 2.....7)
8) I changed my cause preference from "All" to just "Cancer"
9) Voila, all slots folding at full power, RAM usage stable at 2.5GB - 3.62GB for 8 hours, CPU usage never exceeding 32%.
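
For anyone repeating steps 6 and 7, the slot-to-OpenCL-index pairing can be sketched as below. This assumes the v7 client's config.xml layout as I understand it (a slot element with a nested opencl-index option); treat the element names as an assumption and compare against your own config.xml before pasting anything. The helper name gpu_slot_xml is made up for this example.

```python
def gpu_slot_xml(num_gpus):
    """Build one GPU slot entry per card, with opencl-index equal to the slot id.

    Assumed layout (v7 client config.xml); check it against your own file.
    """
    lines = []
    for i in range(num_gpus):
        lines.append(f"  <slot id='{i}' type='GPU'>")
        lines.append(f"    <opencl-index v='{i}'/>")
        lines.append("  </slot>")
    return "\n".join(lines)


if __name__ == "__main__":
    # 8 cards, slots 0-7, as on the rig described above
    print(gpu_slot_xml(8))
```

The point is simply that slot 0 gets index 0, slot 1 gets index 1, and so on up to 7, which is what I ended up setting by hand in FAHControl.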

My theory regarding the original problem is that I was running a set of old packages on my relatively new Kernel 5.3, which may have compromised FAH's ability to allocate memory, and/or caused the AMDGPU driver and its AUR OpenCL add-on package to go haywire and leak memory. I think that the problem was at the OS level because, after completing the "phat" Manjaro updates and rebooting, an annoying glitch that was affecting my Ethernet adapter was suddenly gone, without any other explanation. Perhaps the sweeping updates that I performed also affected the AMD driver, its OpenCL functionality and even the FAH package itself...

I am now going to run this rig for the next week non-stop to check for memory leaks and to see whether CPU usage stays relatively low. If anything bad happens I will update you all again, but after 8 hours of monitoring stable operation, I am thinking that the issue is finally closed.
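
A simple way to judge a week-long run like this is to log RAM usage at intervals and fit a straight line through the samples; a clearly positive slope suggests a leak. The sketch below is illustrative only: the sample format (seconds, MB used) and both function names are assumptions, and the samples could come from something like MemTotal minus MemAvailable in /proc/meminfo.

```python
def growth_mb_per_hour(samples):
    """Least-squares slope of (time_seconds, used_mb) samples, in MB per hour."""
    n = len(samples)
    if n < 2:
        return 0.0
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return 0.0  # all samples taken at the same instant
    cov = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    return cov / var_t * 3600.0


def hours_until_limit(current_mb, limit_mb, slope_mb_per_hour):
    """Hours until usage reaches limit_mb at the fitted rate, or None if not growing."""
    if slope_mb_per_hour <= 0:
        return None
    return max(0.0, (limit_mb - current_mb) / slope_mb_per_hour)


if __name__ == "__main__":
    # Hypothetical hourly readings, not measurements from this rig
    readings = [(0, 3000.0), (3600, 3100.0), (7200, 3200.0)]
    slope = growth_mb_per_hour(readings)
    print(f"growth: {slope:.1f} MB/hour")
    print(f"hours until a 15600 MB limit: {hours_until_limit(3200.0, 15600.0, slope)}")
```

A flat or slightly wobbling line over several days would back up the idea that the leak is gone; steady growth would mean the reboot cycle is still needed.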