Core A2 2.10 performance "fix" [for Dual-Core setups]


Postby tear » Fri Oct 30, 2009 6:13 pm

Updated: November 4th, 2009

Posting here for easy access now that we've polished the guidelines -- extra credit goes to weedacres and uncle fuzzy -- thanks!

People have been observing a performance degradation with the A2 2.10 core on dual-core setups.
The degradation ranges from several percent to thirty (30) percent compared to A2 2.08. Details are in a very, very long thread.

Setting the MPICH_NO_LOCAL=1 environment variable has been found to make the degradation go away for the majority of users.

How to apply?

1) Notfred's (VM or not), option A (easy application, extra action required at every VM/PC boot)

 Please do the following when booting:

 [Screenshot not preserved]

 NOTE:  MPICH_NO_LOCAL=1 has to be added at every VM/PC boot-up; if you'd like to avoid it please see
     option B below
 NOTE:  if you ever decide to start client manually you will need to export the variable prior to starting the client
     (see details)


2) Notfred's (VM appliance only but see notes below), option B (not-so-easy application, zero maintenance)

 Please do the following:

 [Screenshot not preserved]


 Then reboot the VM. After the reboot you may wish to confirm that everything was done correctly:

 [Screenshot not preserved]

 NOTE:  once you modify syslinux.cfg file, MPICH_NO_LOCAL=1 will be applied by the bootloader automagically
     at every boot-up
 NOTE:  if you ever decide to start client manually you will need to export the variable prior to starting the client
     (see details)
 NOTE:  make sure to create a backup copy of your whole VM before performing operations described here --
     a mistake may render your VM unable to boot
 NOTE:  if you use USB notfred's you'll need to locate syslinux.cfg file and add MPICH_NO_LOCAL=1 at the end
     of APPEND line(s)
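
One way to double-check option B after the reboot is to look for the parameter on the kernel command line. This is a minimal sketch, assuming a Linux guest where /proc/cmdline is readable; has_boot_param is a made-up helper name, and this is not necessarily the check shown in the original screenshot:

```shell
#!/bin/sh
# has_boot_param is a hypothetical helper, not part of notfred's.
# Usage: has_boot_param NAME=VALUE [FILE]
# FILE defaults to /proc/cmdline (the command line the kernel was booted with).
has_boot_param() {
    param=$1
    file=${2:-/proc/cmdline}
    [ -r "$file" ] || return 2
    # Put every space-separated word on its own line, then require an exact match.
    tr -s ' ' '\n' < "$file" | grep -qxF -- "$param"
}

if has_boot_param MPICH_NO_LOCAL=1; then
    echo "MPICH_NO_LOCAL=1 is on the kernel command line"
else
    echo "MPICH_NO_LOCAL=1 not found on the kernel command line"
fi
```

The exact-match test avoids false positives from similarly named parameters.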


3) Regular Linux installation

 Please do the following:
Code:
export MPICH_NO_LOCAL=1

 and start the client normally.
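
To confirm the variable actually reached a running process, its environment can be inspected through /proc. A minimal sketch, assuming Linux with /proc mounted and permission to read the process's environ file; proc_has_var is a hypothetical helper name, and the sleep process only stands in for the client:

```shell
#!/bin/sh
# proc_has_var is a hypothetical helper for illustration.
# Usage: proc_has_var PID NAME=VALUE
# /proc/PID/environ holds the environment the process was started with,
# as NUL-separated NAME=VALUE entries.
proc_has_var() {
    [ -r "/proc/$1/environ" ] || return 2
    tr '\0' '\n' < "/proc/$1/environ" | grep -qxF -- "$2"
}

# Demo: start a stand-in process with the variable set and inspect it.
MPICH_NO_LOCAL=1 sleep 3 &
demo_pid=$!
sleep 1   # give the child time to exec before reading its environment
if proc_has_var "$demo_pid" MPICH_NO_LOCAL=1; then
    echo "PID $demo_pid has MPICH_NO_LOCAL=1"
fi
wait "$demo_pid"
```

For the real client, pass the PID of the running fah6 or FahCore process (as reported by ps) instead of the demo PID.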


 If you followed Stanford's Linux Installation Instructions for the SMP client and created the "fah" script, the revised instructions are below:
Code:
# Single quotes keep "$@" literal, so it expands when fah runs rather than now
echo '#!/bin/sh' > fah
echo 'export MPICH_NO_LOCAL=1' >> fah
echo './fah6 -smp -verbosity 9 "$@" &' >> fah
chmod +x fah



4) Additional information

 With MPICH_NO_LOCAL=1 FahCore uses a non-localhost IP address for communication.
 This means you need to make sure that this address (check "netstat -tnp | grep FahCore"
 while folding) is not permanently taken down.
 On a typical Linux installation (with NetworkManager) that could happen if somebody
 disconnected the Ethernet cable for a long time (2h+) or the DHCP server assigned you
 a different IP address.
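
The netstat check can be wrapped in a small filter that lists the addresses in use. This is a rough sketch, assuming the usual net-tools column layout (local address in column 4, PID/program name in the last column) and a program name that contains "FahCore"; fahcore_local_addrs is a hypothetical name:

```shell
#!/bin/sh
# Reads `netstat -tnp` output on stdin and prints the distinct local IP
# addresses used by FahCore connections (assumed column layout, see above).
fahcore_local_addrs() {
    awk '$NF ~ /FahCore/ {
        addr = $4
        sub(/:[0-9]+$/, "", addr)   # strip the trailing :port
        print addr
    }' | sort -u
}

# Typical use while folding (run as the user owning the client, or as root,
# so that the PID/program column is filled in):
#   netstat -tnp | fahcore_local_addrs
```

The address it prints is the one that has to stay up while the client is folding.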

 What does the variable do? From the MPICH README:
Code:
Shared-memory optimizations are enabled by default to improve
performance for multi-processor/multi-core platforms. They can be   
disabled (at the cost of performance) either by setting the
environment variable MPICH_NO_LOCAL to 1, (...)

 <side note>Don't let "at the cost of performance" scare you, just try it out.</side note>

 On the system side it makes FahCore go back to TCP sockets (by default A2 2.10
 uses UNIX domain sockets) for inter-process communication.

 Now, why does it help? It's really hard to guess given the black-box nature of FahCore; it
 also goes beyond my present knowledge. Whatever it is, it messes with kernel
 scheduling (a kernel code path with more might_sleep()s or something) so processes
 that have actual work to do get CPU time more often than others.

 If the previous paragraph made you think Linux is broken -- please don't. It's not.
 By adding the variable discussed here we're just hiding the original problem, which
 actually lies in the FahCore.



Thanks everyone for testing and feedback!


tear
Last edited by tear on Thu Nov 05, 2009 4:04 am, edited 3 times in total.

Postby weedacres » Fri Oct 30, 2009 10:22 pm

Great Job tear!!

I'm still trying to decide if this improves anything on 4-core folding. My 4-core machines vary by ±300 PPD depending on where they are in the WU and the phase of the moon, so it's hard to tell if anything has gotten better (or worse). My sense is that with 2.10 we lost about 4% on the quads but I can't prove it. I'm using your fix on them as well just to be on the safe side.

Any thoughts on this?

I still think 2.10 was a net gain as I was seeing about a 30% work unit failure rate at one point. 2.10 fixed all of that.

Postby FlipBack » Sun Nov 01, 2009 12:21 pm

I run notfred's, but when I start the clients I never get that boot screen. For me it only shows up the first time I run the client. How can I do this?

Thanks for the fix tear.

Postby weedacres » Sun Nov 01, 2009 2:24 pm

FlipBack wrote: I run notfred's, but when I start the clients I never get that boot screen. For me it only shows up the first time I run the client. How can I do this?

Thanks for the fix tear.

I've noticed this on Player 3, though not always. There seems to be an extended period of black screen on Player 3, but you should end up at the point where the console says to enter "root" to log in.
Once there you should be able to apply the fix as described here and many other places:
http://foldingforum.org/viewtopic.php?f=44&t=11367&start=210#p117230

Postby harlam357 » Sun Nov 01, 2009 10:14 pm

How does one apply this technique to an Ubuntu VM?

Postby ChasR » Sun Nov 01, 2009 10:36 pm

As I understand things, it should work to type export MPICH_NO_LOCAL=1 in the terminal before starting FAH.

Postby harlam357 » Mon Nov 02, 2009 12:01 am

It would help if I were to read the entire post. ;)

Postby geokilla » Tue Nov 03, 2009 4:09 pm

tear wrote:*) If you're a proficient notfred's user you can add MPICH_NO_LOCAL=1 at the end
  of APPEND line in fold64 section (so you don't need to enter it manually at every
  boot) of syslinux.cfg file

**) Though if you decide to start client manually you will need to export the variable
   prior to starting the client (see details)

So where do I add MPICH_NO_LOCAL=1 so that it gets applied by itself every time the client starts automatically? I tried looking for the syslinux.cfg file but had no luck. I'm either looking in the wrong place or said file doesn't exist, which I doubt.

Postby tear » Tue Nov 03, 2009 4:33 pm

geokilla wrote:
tear wrote:*) If you're a proficient notfred's user you can add MPICH_NO_LOCAL=1 at the end
  of APPEND line in fold64 section (so you don't need to enter it manually at every
  boot) of syslinux.cfg file

**) Though if you decide to start client manually you will need to export the variable
   prior to starting the client (see details)

So where do I add MPICH_NO_LOCAL=1 so that it gets applied by itself every time the client starts automatically? I tried looking for the syslinux.cfg file but had no luck. I'm either looking in the wrong place or said file doesn't exist, which I doubt.


My VM appliance has it in "hard drive's" root directory:
Code:
mkdir /hda1
mount /dev/hda1 /hda1
vi /hda1/syslinux.cfg
# make changes and save the file
umount /hda1



tear

Postby tear » Tue Nov 03, 2009 5:01 pm

An alternative way -- without vi (yay! -- no manual editing):

Code:
mkdir /hda1
mount /dev/hda1 /hda1
sed -i -e 's+APPEND.*$+& MPICH_NO_LOCAL=1+' /hda1/syslinux.cfg
umount /hda1


Turn the VM off and make a copy of it before you do that, though (a typo can be disastrous).
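
Since the sed one-liner appends the parameter again every time it is run, a guarded variant may be safer to repeat. This is a sketch only, assuming the same uppercase APPEND lines as above and that awk is available on the system; add_boot_param is a hypothetical helper, and the .bak file it keeps is not a substitute for a full VM backup:

```shell
#!/bin/sh
# add_boot_param FILE PARAM  (hypothetical helper)
# Appends PARAM to every APPEND line that does not already contain it.
add_boot_param() {
    file=$1
    param=$2
    # Keep a copy of the current file next to it before rewriting.
    cp "$file" "$file.bak" || return 1
    awk -v p="$param" '
        /APPEND/ && index($0, p) == 0 { $0 = $0 " " p }
        { print }
    ' "$file.bak" > "$file"
}

# Example (after mounting the boot partition as in the steps above):
#   add_boot_param /hda1/syslinux.cfg MPICH_NO_LOCAL=1
```

Running it a second time leaves the APPEND lines unchanged, because the parameter is already present.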


tear

Postby geokilla » Wed Nov 04, 2009 2:32 am

I don't understand....

Btw, I posted on HWC your "fix". I gave you credit for it and nothing's been edited or anything like that. Hope you don't mind.

Postby ChasR » Wed Nov 04, 2009 3:26 am

tear,
Thank you from Team 32.

Postby SKeptical_Thinker » Wed Nov 04, 2009 3:58 am

I highly recommend this product and/or service.

It worked for me. Two dual-core SMP systems went from a peak of ~2700 PPD to a peak of ~3300 PPD.

The script I use is:

Code:
#!/bin/sh
export MPICH_NO_LOCAL=1
./fah6 -smp -verbosity 9  &

Postby Pick2 » Wed Nov 04, 2009 4:36 pm

I have a problem with the way I use the "export MPICH_NO_LOCAL=1" fix. I ssh into my headless Linux blades from a Mac Mini. This creates a "session". As long as I "export MPICH_NO_LOCAL=1" and start the FAH client in that session, all works well. When I ssh in again, it's a new session, and if I need to restart FAH, I first need to export again for it to take effect. If I just type "export" it shows a list, and I can see that I need to re-enter the fix.
HTH
Pick2 AKA Plutronics

Postby Oak37 » Wed Nov 04, 2009 5:59 pm

Pick2 wrote: I have a problem with the way I use the "export MPICH_NO_LOCAL=1" fix. I ssh into my headless Linux blades from a Mac Mini. This creates a "session". As long as I "export MPICH_NO_LOCAL=1" and start the FAH client in that session, all works well. When I ssh in again, it's a new session, and if I need to restart FAH, I first need to export again for it to take effect. If I just type "export" it shows a list, and I can see that I need to re-enter the fix.
HTH
Pick2 AKA Plutronics

It would probably be easier if you just included all the commands in a shell script like SKeptical_Thinker's one above your post.
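
For the ssh case, the variable can also be set for a single command without exporting it in the session at all. A small sketch of that idea; run_with_fix is a hypothetical wrapper name, and the fah6 flags in the example are simply the ones used earlier in this thread:

```shell
#!/bin/sh
# run_with_fix COMMAND [ARGS...]  (hypothetical wrapper)
# Runs COMMAND with MPICH_NO_LOCAL=1 in its environment only; the calling
# shell's own environment is left unchanged.
run_with_fix() {
    env MPICH_NO_LOCAL=1 "$@"
}

# Example, from a fresh ssh session:
#   run_with_fix ./fah6 -smp -verbosity 9 &
```

Putting the function in a script file, like the one above, has the same effect: each new session only needs to run that one file.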