Con Kolivas patchset reborn, uses 100% CPU on our client

Postby ThunderRd » Fri Sep 04, 2009 5:21 pm

My recent problems with the Real-Time Kernel patchset on the current 2.6.30 kernel http://foldingforum.org/viewtopic.php?f=44&t=11149 have actually paid off. I accidentally ran across the news that Con Kolivas is coding again and has released a new scheduler in these patches for 2.6.30 just in the last few days:

http://ck.kolivas.org/patches/bfs/

His work was in active development until 2.6.22, when he took a two-year break from kernel development after a falling-out with the kernel devs [and Linus]. His return has been good news for a group of dedicated followers.

Last night I built a new kernel for two of my machines and was pleasantly surprised to see that it has none of the problems of the RT-Preempt patch, and more importantly, when it runs the SMP folding client it absolutely pegs the CPU usage at a full 100% on all cores. This is on an Opteron-185 dual core and a Q6600 Quad. This is far and away the most complete CPU usage I have seen from any kernel, bar none.

I am not a Linux guru, but I have done a fair amount of experimentation with kernels and would say that this patchset is worth a try for anyone here wanting to get more than that 90-95% usage that most of us see on fahcore_a2.
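
For anyone who wants a number that doesn't depend on how top rounds things, here is a minimal sketch that computes the busy percentage from two samples of /proc/stat. The helper name cpu_busy_pct is made up for illustration, and it assumes the usual field order (user, nice, system, idle, iowait, ...). Niced folding time lands in the "nice" column, so it counts as busy here.

```shell
#!/bin/sh
# cpu_busy_pct LINE1 LINE2: hypothetical helper, not part of any FAH tool.
# Each argument is one "cpu..." line from /proc/stat; prints the percentage of
# time spent non-idle between the two samples (idle + iowait count as idle).
cpu_busy_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        { total = 0
          # fields 2..9 are user, nice, system, idle, iowait, irq, softirq, steal
          for (i = 2; i <= NF && i <= 9; i++) total += $i
          idle = $5 + $6 }
        NR == 1 { t0 = total; i0 = idle }
        NR == 2 { dt = total - t0; di = idle - i0
                  if (dt > 0) printf "%.1f\n", 100 * (dt - di) / dt
                  else print "0.0" }'
}

# Usage: sample the aggregate "cpu" line 10 seconds apart.
# a=$(grep '^cpu ' /proc/stat); sleep 10; b=$(grep '^cpu ' /proc/stat)
# cpu_busy_pct "$a" "$b"
```

Passing the per-core lines (cpu0, cpu1, ...) instead of the aggregate line gives the same figure per core.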

I'd be curious if anyone else gets similar results.
ASUS Maximus Extreme X38 - QX9650@4.2G - 8G Corsair Dominator DDR3-2000 - GTX470 - Win7 Pro, Driver 305.68 running GPU3 + SMP
ASUS P5Q Pro Turbo P45 - Q6600@3.5G - 4G HyperX DDR2-1066 - GT440 - Gentoo/aptosid, Driver 304.51 running GPU3 [in WINE] + SMP
ThunderRd
 
Posts: 123
Joined: Sun Dec 02, 2007 5:30 am
Location: Nong Khai, Thailand

Re: Con Kolivas patchset reborn, uses 100% CPU on our client

Postby bollix47 » Fri Sep 04, 2009 6:10 pm

ThunderRd wrote:it absolutely pegs the CPU usage at a full 100% on all cores


This might be a result of using the new core 2.10: all my Q6600s are showing 100% CPU usage, and I'm using Ubuntu 8.04, i.e. kernel 2.6.24. Prior to 2.10, CPU usage always averaged around 90-95%.
bollix47
Site Moderator
 
Posts: 2807
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: Con Kolivas patchset reborn, uses 100% CPU on our client

Postby Ant P » Fri Sep 04, 2009 6:21 pm

I noticed it too, here. As of now I have a 3 core CPU maxed out, and I'm finishing SMP WUs in 12 hours instead of 18(!)

It's not the 2.10 core; I had been running this kernel for days before I got that upgrade.
Ant P
 
Posts: 24
Joined: Sat Aug 29, 2009 1:04 pm

Re: Con Kolivas patchset reborn, uses 100% CPU on our client

Postby ThunderRd » Sat Sep 05, 2009 9:58 am

Yep, I agree with Ant P: I had the 100% usage before I received the new core as well. Frame times are down by roughly 7% on 2669. I hope others will take a look at this. Also, I emailed Con K to ask a question and was pleasantly surprised to get a response several hours later. This is a Good Thing.
ThunderRd
 
Posts: 123
Joined: Sun Dec 02, 2007 5:30 am
Location: Nong Khai, Thailand

Re: Con Kolivas patchset reborn, uses 100% CPU on our client

Postby JackOfAll » Sun Sep 06, 2009 10:42 am

Frame time reduced by 2 mins on my O/C'd 3.8GHz i7 920 (-bigadv -smp 8) with the bfs scheduler. If anyone else would like to try this with a Fedora 11 x86_64 install, you're welcome to use my pre-built kernel RPMs, based on kernel-2.6.30.5-43 from the Fedora updates-testing repo plus the bfs patch. (bfs patch V209. kernel-2.6.30.5-43.sched_bfs_209.fc11.x86_64.rpm.)

Code:
rpm -Uhv http://www.vacuumtube.org.uk/folding/fedora/11/x86_64/folding-release-1-2.noarch.rpm
yum --enablerepo=folding-testing update kernel\*


You need to run both commands as root (sudo, 'su -c', or just plain be root) and accept the cert when it asks you. You might also want to edit '/etc/grub.conf' before rebooting with the new kernel installed. I suggest you locate 'timeout=0' and change it to 'timeout=5'. That gives you a 5-second countdown on every boot, which you can interrupt by pressing a key to choose a specific kernel version. I.e., if you have a problem with the bfs kernel, reboot and choose the second kernel in the list, the one you were running before installing the bfs kernel.
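
If you'd rather script the grub.conf change than edit it by hand, something like the following should do. It is only a sketch: set_grub_timeout is a made-up helper name, and it assumes the file has a plain 'timeout=N' line in the first column, as the stock Fedora 11 one does.

```shell
#!/bin/sh
# set_grub_timeout FILE SECONDS: hypothetical helper for illustration.
# Rewrites the first-column "timeout=N" line in a GRUB legacy config,
# keeping a backup copy next to it as FILE.bak.
set_grub_timeout() {
    cp -p "$1" "$1.bak" &&
    sed "s/^timeout=[0-9][0-9]*\$/timeout=$2/" "$1.bak" > "$1"
}

# Usage (as root), then check the result before rebooting:
# set_grub_timeout /etc/grub.conf 5
# grep '^timeout=' /etc/grub.conf
```

Writing back through the original path instead of using 'sed -i' is deliberate: on Fedora '/etc/grub.conf' is normally a symlink into /boot/grub, and this way the symlink stays intact.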

I would also suggest installing schedtool:

Code:
yum install schedtool


Start the FaH client with:

Code:
schedtool -D -e ./fah6 ........


or

Code:
schedtool -B -n 19 -e ./fah6 .......


The '-D' argument to schedtool asks the kernel for IDLE priority, i.e. very low priority, lower than nice 19. '-B -n 19' tells the kernel scheduler to start the client at nice 19 with the BATCH policy, which slightly disadvantages it compared to other processes with the same dynamic nice priority. On a dedicated folding machine use '-B -n 19'; if you also use the Linux box for other tasks, such as running an interactive desktop, start with the '-D' argument instead. '-B -n 19' is worth about 200 extra PPD, but the desktop will feel slightly less responsive. Then again, it will be no less responsive than it would have seemed with the default cfs scheduler. YMMV.
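
To keep the two variants straight, a small wrapper along these lines might help. It is just a sketch: fah_launch_cmd and the mode names 'desktop' and 'dedicated' are invented here, and it only prints the command so you can look at it before running anything.

```shell
#!/bin/sh
# fah_launch_cmd MODE [CLIENT ARGS...]: hypothetical helper; prints the command
# line rather than running it, so it can be inspected first.
#   MODE=desktop   -> SCHED_IDLEPRIO (-D), for boxes that are also used interactively
#   MODE=dedicated -> SCHED_BATCH at nice 19 (-B -n 19), for folding-only boxes
fah_launch_cmd() {
    mode=$1
    shift
    case $mode in
        desktop)   echo "schedtool -D -e ./fah6 $*" ;;
        dedicated) echo "schedtool -B -n 19 -e ./fah6 $*" ;;
        *)         echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

# Usage: run the printed command from the client directory.
# eval "$(fah_launch_cmd dedicated -smp 8 -bigadv)"
```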
Folding on Linux - Fedora 11 x86_64 / nVidia 180.60 driver / CUDA 2.1
JackOfAll
 
Posts: 83
Joined: Tue Mar 17, 2009 3:40 pm

Re: Con Kolivas patchset reborn, uses 100% CPU on our client

Postby ThunderRd » Sun Sep 06, 2009 11:47 am

I'm stunned that this topic hasn't generated much interest.

It doesn't take much to compile a new kernel, boys. I know it's not for everyone, but for those willing to try it out, it seems to be well worth the effort. Maybe the biggest single production increase I have found in two years of SMP folding.

@JackOfAll
SCHED_IDLEPRIO didn't work for me as advertised in the FAQ; in fact, it used less CPU than running without schedtool. I wrote to Con about this, and he said it appears to be a bug that he will address later, after larger fish are fried. In the meantime, running the client without schedtool has been fine for me, 99-100% constantly, and the desktop doesn't seem to suffer at all, afaict.
ThunderRd
 
Posts: 123
Joined: Sun Dec 02, 2007 5:30 am
Location: Nong Khai, Thailand

Re: Con Kolivas patchset reborn, uses 100% CPU on our client

Postby JackOfAll » Sun Sep 06, 2009 12:11 pm

ThunderRd wrote:SCHED_IDLEPRIO didn't work for me as advertised in the FAQ, in fact it used less CPU than running without schedtool.


Actually, if I understand this correctly, running with 'schedtool -D' should use less CPU than running without schedtool (assuming other processes want CPU time), by design. Without setting the kernel scheduler static priority, the lowest nice'd dynamic priority is 19. IDLEPRIO is lower than that. (Or should be.) ;)
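
One way to see what a running client actually ended up with is to read the nice value and policy number straight out of /proc. This is a rough sketch; sched_info is a made-up name, and the note about BFS policy numbers is my assumption rather than something I have checked against the patch.

```shell
#!/bin/sh
# sched_info PID: hypothetical helper. Prints "nice=<n> policy=<p>" for a process,
# read from /proc/PID/stat (field 19 = nice, field 41 = scheduling policy number).
# Policy numbers in mainline: 0 = SCHED_NORMAL, 3 = SCHED_BATCH, 5 = SCHED_IDLE.
# Assumption: BFS reports SCHED_IDLEPRIO under the same number 5.
sched_info() {
    _stat=$(cat "/proc/$1/stat") || return 1
    # Drop "pid (comm) " first; comm may itself contain spaces or parentheses.
    set -- ${_stat##*) }
    # After the strip, $1 is field 3 (state), so field N is positional N-2.
    echo "nice=${17} policy=${39}"
}

# Usage: check every process whose name starts with FahCore.
# for p in $(pgrep FahCore); do sched_info "$p"; done
```

'schedtool PID' reports the same information in a friendlier form if schedtool is installed.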

ThunderRd wrote:I wrote to Con about this and he has said it appears to be a bug and will address it later after larger fish are fried. In the meantime, running the client without schedtool has been fine for me, 99-100% constantly, and the desktop doesn't seem to suffer at all, afaict.


I can't be 100% certain of this, but I recall seeing code after the 205 patch (might have been 207) that would have addressed the priority with IDLEPRIO; maybe this was what he was talking about. The only reason I suggested IDLEPRIO or BATCH with '-n 19' was that Fedora out of the box has a few stupid defaults, like not using RT for pulseaudio. I sure don't want to support people complaining, "my audio is stuttering when I run this kernel with the bfs patch"........ ;)
JackOfAll
 
Posts: 83
Joined: Tue Mar 17, 2009 3:40 pm

Re: Con Kolivas patchset reborn, uses 100% CPU on our client

Postby Galahad » Sun Sep 06, 2009 2:56 pm

ThunderRd wrote:I'm stunned that this topic hasn't generated much interest.


I'd be happy to know how to apply this patch. In steps please.
I know the pieces fit
Galahad
 
Posts: 104
Joined: Sun Dec 02, 2007 6:46 am
Location: Russia

Re: Con Kolivas patchset reborn, uses 100% CPU on our client

Postby ThunderRd » Sun Sep 06, 2009 4:12 pm

Galahad wrote:I'd be happy to know how to apply this patch. In steps please.


Have you ever compiled a custom kernel for your rig before? If not, there are many how-tos on the web, and there is probably one already written for your particular distro. Study it. There will be a step in the process where it tells you to apply any patches in the source directory. That's where you do it, with a command like this (as root):

Code:
patch < 2.6.30-sched-bfs-208.patch -p1


After that, you can import your current .config:

Code:
cp /boot/config-$(uname -r) .config && yes "" | make oldconfig


Then you can go ahead and run xconfig or menuconfig, make your selections, and compile. There are some suggestions for some of the .config options in the FAQ; it's available on the download site.
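
Put together, the whole sequence looks roughly like the sketch below. It assumes linux-2.6.30.tar.bz2 and 2.6.30-sched-bfs-208.patch are already sitting in the current directory; DRY_RUN, run and bfs_build_steps are names made up for this example. By default it only prints the steps. Installing the finished kernel is left out on purpose, because that part differs from distro to distro.

```shell
#!/bin/sh
# Sketch of the overall sequence. Assumes linux-2.6.30.tar.bz2 and
# 2.6.30-sched-bfs-208.patch are already downloaded into the current directory.
# DRY_RUN and the run/bfs_build_steps helpers are illustrative names.
DRY_RUN=${DRY_RUN:-1}

# Print each step; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    [ "$DRY_RUN" = 0 ] || return 0
    eval "$*"
}

bfs_build_steps() {
    run 'tar xjf linux-2.6.30.tar.bz2' &&
    run 'cd linux-2.6.30' &&
    run 'patch -p1 --dry-run < ../2.6.30-sched-bfs-208.patch' &&   # check it applies cleanly
    run 'patch -p1 < ../2.6.30-sched-bfs-208.patch' &&
    run 'cp /boot/config-$(uname -r) .config' &&
    run 'yes "" | make oldconfig' &&
    run 'make menuconfig' &&
    run 'make'
}

# Usage: review the plan first, then run it for real.
# bfs_build_steps
# DRY_RUN=0 bfs_build_steps
```

The extra 'patch --dry-run' pass is there so a patch that doesn't match the tree fails before it touches any files.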

If none of this means anything to you, it's better to study a detailed explanation of the process before doing it the first time.
ThunderRd
 
Posts: 123
Joined: Sun Dec 02, 2007 5:30 am
Location: Nong Khai, Thailand

Re: Con Kolivas patchset reborn, uses 100% CPU on our client

Postby Galahad » Sun Sep 06, 2009 5:44 pm

As far as I understand, the exact kernel to apply this patch to is linux-2.6.30.tar.bz2?
Galahad
 
Posts: 104
Joined: Sun Dec 02, 2007 6:46 am
Location: Russia

Re: Con Kolivas patchset reborn, uses 100% CPU on our client

Postby tear » Sun Sep 06, 2009 10:22 pm

Nope, it doesn't do the trick here --

2xE5530 (rev B0), A2 2.08, -smp 15, P2681 (R0, C14, G29):
stock 2.6.30.4, kernel mode preemption off, dynticks on, hz 1000, CPU & memory affinities set -- 26m55s*
bfs-209 2.6.30.4, kernel mode preemption off, dynticks off, hz 1000, no adjustments to CPU/mem affinities -- 31m30s
bfs-209 2.6.30.4, kernel mode preemption off, dynticks off, hz 1000, CPU & memory affinities set -- 27m00s

It's possible bfs improves behavior (vs. stock) when CPU/mem affinities are unset but that is really N/A to me.
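
To put those frame times in relative terms, here is a quick sketch of the arithmetic. The helper names to_seconds and pct_slower are made up; the inputs are the three timings listed above.

```shell
#!/bin/sh
# to_seconds "26m55s": convert a frame time in the NNmSSs form used above to seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ print $1 * 60 + $2 }'
}

# pct_slower BASE OTHER: how much longer OTHER takes than BASE, in percent.
pct_slower() {
    awk -v b="$(to_seconds "$1")" -v o="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", 100 * (o - b) / b }'
}

pct_slower 26m55s 31m30s   # bfs-209 without affinities vs. tuned stock
pct_slower 26m55s 27m00s   # bfs-209 with affinities vs. tuned stock
```

That works out to about 17.0% slower for bfs-209 with no affinity adjustments and about 0.3% slower with affinities set, which is effectively a tie with the tuned stock kernel.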


tear

P.S.
*) rev D0 E5530s do 25m40s
One man's ceiling is another man's floor.
tear
 
Posts: 924
Joined: Sun Dec 02, 2007 4:08 am
Location: Rocky Mountains

Re: Con Kolivas patchset reborn, uses 100% CPU on our client

Postby JackOfAll » Sun Sep 06, 2009 11:49 pm

tear wrote:It's possible bfs improves behavior (vs. stock) when CPU/mem affinities are unset but that is really N/A to me.


Would you mind elaborating on exactly what you're doing with regard to CPU/mem affinities? (Using numactl and then locking the 15 fah cores individually to CPU cores, or just launching fah6 with taskset -c ....... ????) Hope you don't mind. I'm curious as I'm currently building a new rig with a pair of W5580s that will run bigadv.
JackOfAll
 
Posts: 83
Joined: Tue Mar 17, 2009 3:40 pm

Re: Con Kolivas patchset reborn, uses 100% CPU on our client

Postby toTOW » Mon Sep 07, 2009 1:18 am

I'll try this patch tomorrow on an Ubuntu VM ... we'll see if that helps.
Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.

FAH-Addict : latest news, tests and reviews about Folding@Home project.

toTOW
Site Moderator
 
Posts: 7999
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: Con Kolivas patchset reborn, uses 100% CPU on our client

Postby ThunderRd » Mon Sep 07, 2009 1:52 am

You've gotta read the FAQ. He gives some suggestions there, here is a bit of it. Note the first line:

Configure your kernel with 1000Hz, preempt ON and disable dynamic ticks.

You shouldn't need to tune BFS virtually ever. The only tunable for the
scheduler itself is the rr_interval value (see documentation). Try 3ms if
latency is everything to you. When compiling software, do not use more jobs
than you have CPUs! So make -j2 on dual core, -j4 on quad core and so on.
Nice levels are strictly obeyed so if you nice your compiles they'll be
virtually unnoticeable. (nice -n 19 make -j2). Run your distributed computing
clients SCHED_IDLEPRIO (eg folding at home, mprime etc):
schedtool -D -e mprime
Run your audio and video apps SCHED_ISO:
schedtool -I -e amarok
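
The "no more jobs than CPUs" rule from the FAQ is easy to automate. A tiny sketch (build_jobs is a made-up name):

```shell
#!/bin/sh
# build_jobs: number of online CPUs, which the FAQ suggests as the upper bound for make -j.
build_jobs() {
    getconf _NPROCESSORS_ONLN
}

# Usage: a niced kernel build using one job per CPU.
# nice -n 19 make -j"$(build_jobs)"
```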


@Galahad re: 2.6.30
Yes.

That being said, using schedtool wasn't for me, the CPU usage went WAY down. Without it, however, it's higher than I have ever seen.
ThunderRd
 
Posts: 123
Joined: Sun Dec 02, 2007 5:30 am
Location: Nong Khai, Thailand

Re: Con Kolivas patchset reborn, uses 100% CPU on our client

Postby tear » Mon Sep 07, 2009 2:24 am

JackOfAll wrote:
tear wrote:It's possible bfs improves behavior (vs. stock) when CPU/mem affinities are unset but that is really N/A to me.


Would you mind elaborating on exactly what you're doing with regard to CPU/mem affinities? (Using numactl and then locking the 15 fah cores individually to CPU cores, or just launching fah6 with taskset -c ....... ????) Hope you don't mind. I'm curious as I'm currently building a new rig with a pair of W5580s that will run bigadv.


I don't mind at all, happy to see someone who's interested.

Anyway, it's something along the lines of the former...

First off, what does not work (by design):

1) numactl is useless as it doesn't allow you to tie FahCore worker threads to specific memory nodes -- with numactl you can run
  all worker threads on nodes 0,1,2,3 but you cannot (for instance) make workers 0-3 *only* use node 0, workers 4-7 *only* use
  node 1, etc.

2) launching the whole client with taskset -c is useless in pretty much the same way (assuming you want to run one client on a machine);
  it just can't do any good.

And yes, managing affinities is something that makes sense once all worker threads have started.
Going over every single one and firing taskset -pc target-cpu pid is the way to go; one must be cautious though --
FahCore instances are also multithreaded and PIDs off top's output correspond to threads that do not actually
perform simulation (press capital 'H' in top and compare PIDs -- you'll see what I mean).
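
As a rough illustration of the per-thread part, the sketch below lists the thread IDs of one process from /proc and prints a taskset command for each. list_tids and pin_cmds are made-up names, and pinning every thread of a process to the same CPU is a simplification: picking out which thread really does the simulation still has to be done by eye.

```shell
#!/bin/sh
# list_tids PID: print the thread IDs of a process, one per line, from /proc/PID/task.
list_tids() {
    ls "/proc/$1/task"
}

# pin_cmds PID CPU: print (not run) one taskset command per thread of PID.
# Hypothetical helper: which thread belongs on which CPU still has to be
# decided by hand, e.g. by watching per-thread CPU use with top -H.
pin_cmds() {
    for tid in $(list_tids "$1"); do
        echo "taskset -pc $2 $tid"
    done
}

# Usage: inspect the commands for one FahCore process (PID 12345 is made up),
# then pipe them to sh once they look right.
# pin_cmds 12345 0
# pin_cmds 12345 0 | sh
```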

Memory affinity is a bit more tricky -- there's a "page migration" feature in the kernel but the only interface (I've found)
that can be used to move a process' pages (once it's started) is the "cpuset" filesystem; I create a small hierarchy of cpusets;
the parent contains all processors and memory nodes for FAH to use; each child corresponds to a different memory node
and only contains the processors closest to its node. The client is started in the "parent" cpuset and once all FahCores are forked
they are moved to the appropriate "child" cpusets.

For instance, on the above-mentioned dual E5530 setup the structure looks like this:
Code:
[fah@octopus cpuset]$ pwd
/dev/cpuset
[fah@octopus cpuset]$ find all0  -maxdepth 1 -mindepth 1 -type d
all0/node-0
all0/node-1
[fah@octopus cpuset]$ cat all0/cpus
0-15
[fah@octopus cpuset]$ cat all0/mems
0-1
[fah@octopus cpuset]$ cat all0/node-0/cpus
0-3,8-11
[fah@octopus cpuset]$ cat all0/node-0/mems
0
[fah@octopus cpuset]$ cat all0/node-1/cpus
4-7,12-15
[fah@octopus cpuset]$ cat all0/node-1/mems
1
[fah@octopus cpuset]$
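
A hierarchy like the one listed above could be created along these lines. Treat it as a sketch under assumptions: CPUSET_ROOT, make_set and make_fah_cpusets are names invented for the example, the CPU and node numbers are simply copied from the listing, and the cpuset filesystem has to be mounted already.

```shell
#!/bin/sh
# Sketch only. CPUSET_ROOT and make_fah_cpusets are illustrative names; the CPU
# and node numbers are the ones from the dual E5530 listing above.
# On a real system the cpuset filesystem must be mounted first (as root), e.g.:
#   mkdir -p /dev/cpuset && mount -t cpuset none /dev/cpuset
CPUSET_ROOT=${CPUSET_ROOT:-/dev/cpuset}

# make_set NAME CPUS MEMS: create one cpuset and fill in its cpus and mems files.
make_set() {
    mkdir -p "$CPUSET_ROOT/$1" &&
    echo "$2" > "$CPUSET_ROOT/$1/cpus" &&
    echo "$3" > "$CPUSET_ROOT/$1/mems"
}

make_fah_cpusets() {
    make_set all0        0-15      0-1 &&   # parent: everything FAH may use
    make_set all0/node-0 0-3,8-11  0   &&   # child: CPUs closest to memory node 0
    make_set all0/node-1 4-7,12-15 1        # child: CPUs closest to memory node 1
}

# Usage (as root): create the sets, start the client inside the parent, and
# later move each FahCore task into its child set by writing its ID to tasks.
# make_fah_cpusets
# echo $$ > "$CPUSET_ROOT/all0/tasks"
# echo 12345 > "$CPUSET_ROOT/all0/node-0/tasks"    # 12345 is a made-up ID
```

If I read the kernel cpuset documentation correctly, whether already-allocated pages follow a task into its new set is controlled by a per-cpuset memory_migrate file, so that would need to be set to 1 in the child sets as well.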


Details on cpusets can be found here: http://git.kernel.org/?p=linux/kernel/g ... 54;hb=HEAD

I've released the toys I came up with -- you can find the URL here: viewtopic.php?f=55&t=11062&start=30#p109342
It's not the latest release (I've added one NUMA-related feature since then) but they should work as expected.

That's roughly it...


tear
tear
 
Posts: 924
Joined: Sun Dec 02, 2007 4:08 am
Location: Rocky Mountains
