6.29 has libc problems (symptom: Could not CosmHTTPOpen)

Moderators: Site Moderators, PandeGroup

Re: 6.29 has libc problems (symptom: Could not CosmHTTPOpen)

Postby hus » Sun May 30, 2010 6:45 am

I haven't seen the "A3-returns-immediately-with-code-0" issue at my end.
That's what I get with the system at 2.12 and a patched 2.11.x. Did you try that?

MfG, Ulrich
hus
 
Posts: 11
Joined: Mon May 10, 2010 2:28 pm

Re: 6.29 has libc problems (symptom: Could not CosmHTTPOpen)

Postby tear » Sun May 30, 2010 6:49 am

hus wrote:
I haven't seen the "A3-returns-immediately-with-code-0" issue at my end.
That's what I get with the system at 2.12 and a patched 2.11.x. Did you try that?

Per one of my posts above -- tried with patched 2.11.90 from Fedora 13 Beta (as you suggested) -- that worked fine. I did not try with 2.11 from Fedora 12.

Kris
One man's ceiling is another man's floor.
tear
 
Posts: 857
Joined: Sun Dec 02, 2007 4:08 am
Location: Rocky Mountains

Re: 6.29 has libc problems (symptom: Could not CosmHTTPOpen)

Postby hus » Sun May 30, 2010 11:11 am

And now for something completely different: nscd!

After you mentioned it, I fired it up - I had it disabled and wanted to check whether it really works with the "wrong" glibc. Well, it does. Not only that, it fixes the "Could not CosmHTTPOpen" problem. I just checked a second machine: now happily folding with glibc 2.12, without a patched library, and without problems. Oops ...

MfG, Ulrich
hus
 
Posts: 11
Joined: Mon May 10, 2010 2:28 pm

Re: 6.29 has libc problems (symptom: Could not CosmHTTPOpen)

Postby tear » Sun May 30, 2010 1:28 pm

Now that is _the_ post of the year, sir! Let me go back to my hole...

Never thought it would make a difference. It works here on F12*. I'll check F13 shortly (hell, maybe it will even affect the RH bug I filed -- not that there's no bug in general, but still).

*) it even works with the following entry in /etc/nsswitch.conf:
Code: Select all
hosts:      files dns
Could it be that nscd operates outside NSS? Or.. I don't know. Whatever :)

Congratulations!


Kris
tear
 
Posts: 857
Joined: Sun Dec 02, 2007 4:08 am
Location: Rocky Mountains

Re: 6.29 has libc problems (symptom: Could not CosmHTTPOpen)

Postby hus » Sun May 30, 2010 1:48 pm

tear wrote:Could it be that nscd operates outside NSS?
Not quite ... it's some kind of backend magic built into the lookup routines; after all, nscd is part of the same glibc as all the resolver stuff. So if it's available, gethostbyname contacts the daemon (via a Unix socket) instead of doing the resolution itself. In other words, fah6 opens a socket (which never was a problem) to the fully dynamic nscd executable, which in turn is able to use the library routines. Calling those from the more-or-less static fah6 is still broken.
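
A minimal sketch for checking whether that socket is there. It assumes the usual glibc default path /var/run/nscd/socket (some distributions may place it elsewhere), and the function name is made up for illustration:

```shell
#!/bin/sh
# Report whether an nscd socket exists at the given path.
# /var/run/nscd/socket is an assumed default; adjust it for your system.
nscd_socket_status() {
    sock=${1:-/var/run/nscd/socket}
    if [ -S "$sock" ]; then
        echo "present"
    else
        echo "absent"
    fi
}
```

Calling nscd_socket_status without an argument checks the assumed default path. A socket file that is present does not prove the daemon is answering, so treat the result as a hint only.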

MfG, Ulrich
hus
 
Posts: 11
Joined: Mon May 10, 2010 2:28 pm

Re: 6.29 has libc problems (symptom: Could not CosmHTTPOpen)

Postby tear » Sun May 30, 2010 1:53 pm

Ah, it's just generic caching. So nsswitch.conf is still taken into consideration but no libnss_*.so are ever loaded, correct?

Kris
tear
 
Posts: 857
Joined: Sun Dec 02, 2007 4:08 am
Location: Rocky Mountains

Re: 6.29 has libc problems (symptom: Could not CosmHTTPOpen)

Postby tear » Sun May 30, 2010 1:54 pm

That was meant to say "ever loaded by the client" ^^

BTW, would you care to do a small write-up (howto) on nscd workaround?

And of course, it was a pleasure working with you.


Thanks,
Kris

P.S.
Nice MP reference.
tear
 
Posts: 857
Joined: Sun Dec 02, 2007 4:08 am
Location: Rocky Mountains

Re: 6.29 has libc problems (symptom: Could not CosmHTTPOpen)

Postby hus » Sun May 30, 2010 3:18 pm

tear wrote:So nsswitch.conf is still taken into consideration but no libnss_*.so are ever loaded, correct?
That is my understanding; a quick strace seems to confirm it.
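
One way to repeat such a check, as a rough sketch: record a trace of the client, then filter it for connections to the nscd socket and for attempts to open NSS modules. The trace file name, the helper name and the socket path pattern are assumptions for illustration, not the exact commands used above:

```shell
#!/bin/sh
# Record the trace first (run in the client directory; the file name is arbitrary):
#   strace -f -o fah6.trace -e trace=file,network ./fah6
#
# Then filter the trace for the two things of interest:
#   - connections to the nscd socket (path assumed to contain "nscd/socket")
#   - attempts to open NSS modules (libnss_*.so*)
nss_activity() {
    grep -E 'nscd/socket|libnss_[a-z0-9_]+\.so' "$1"
}

# Example (commented out): nss_activity fah6.trace
```

With nscd running, one would expect to see the connect to the socket and no libnss lines coming from fah6 itself.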

would you care to do a small write-up (howto) on nscd workaround?
Very small - "Discard lib64-fah, enable the nscd service" ...

Ok, how about this:
The fah6 binary is statically linked. Even so, some parts of the C library are still loaded dynamically at runtime, which makes the binary somewhat incompatible with newer versions of glibc. In particular, domain name resolution does not work, which prevents the client from contacting the assignment server and getting any WUs; the symptom is the error message "Could not CosmHTTPOpen".
There is a workaround for this that builds a patched version of glibc, but it stopped working as of glibc 2.12. Symptoms include segfaults, floating point exceptions, or the core exiting with status 0.
But there is another workaround: glibc comes with a caching daemon called nscd, which communicates with applications via a socket and does the resolutions on their behalf. glibc, namely gethostbyname, automatically uses this daemon if it is running. So to get fah6 working, you just have to enable nscd. On Fedora (and similar systems) this is done with
Code: Select all
chkconfig nscd on
If you don't want to restart your system, also run
Code: Select all
service nscd start
Or use your system's GUI to do the same.
If you have used the patched library, delete it and remove LD_LIBRARY_PATH from the call to fah6.

This applies to distributions that include nscd, as Fedora does. Since nscd is part of glibc, it should at least be available, though you may have to install an extra package.
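
The clean-up step (dropping LD_LIBRARY_PATH from the call to fah6) can be sketched as a small wrapper that starts a command with that variable removed from its environment. The function name is made up, and how you actually launch fah6 (flags, init script) depends on your setup:

```shell
#!/bin/sh
# Run a command with LD_LIBRARY_PATH removed from its environment,
# so a leftover patched glibc directory is no longer picked up.
# The subshell keeps the caller's own environment unchanged.
run_without_patched_lib() {
    (
        unset LD_LIBRARY_PATH
        exec "$@"
    )
}

# Example (commented out): run_without_patched_lib ./fah6
```

If your start script sets LD_LIBRARY_PATH explicitly, simply deleting that assignment achieves the same thing; the wrapper is only handy while testing.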

it was a pleasure working with you.
Same here - thanks for all your work.

MfG, Ulrich
hus
 
Posts: 11
Joined: Mon May 10, 2010 2:28 pm

Re: 6.29 has libc problems (symptom: Could not CosmHTTPOpen)

Postby LinuxDonald » Sun May 30, 2010 7:16 pm

Hey cool. Thanks, now it works here under Fedora 13 :)
LinuxDonald
 
Posts: 6
Joined: Sun Jul 19, 2009 3:43 pm

Re: 6.29 has libc problems (symptom: Could not CosmHTTPOpen)

Postby gizmo » Sun May 30, 2010 10:24 pm

Confirmed here, works under Fedora 13.

Thanks for all the detective work guys!

As an aside, I've also got this working on my hardened Gentoo system just fine. It is running glibc 2.10.1.
gizmo
 
Posts: 34
Joined: Mon Sep 21, 2009 1:35 am

Re: 6.29 has libc problems (symptom: Could not CosmHTTPOpen)

Postby bruce » Sun May 30, 2010 11:27 pm

Just trying to understand what you folks have discovered. . . .

I may be completely off-base here but newbies who don't understand something have only themselves to blame if they never learn anything because they don't want to look like a newbie.
nscd caches DNS lookup requests, right?

So a short TTL timeout together with a slow DNS lookup (without nscd) would probably be equally slow the next time, resulting in a CosmHTTPOpen error after a certain number of retries, right? With nscd, eventually one of the DNS lookups succeeds and then the next retry finds it in cache.

Does that make any sense?
bruce
 
Posts: 22249
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: 6.29 has libc problems (symptom: Could not CosmHTTPOpen)

Postby gizmo » Mon May 31, 2010 12:43 am

bruce wrote:Just trying to understand what you folks have discovered. . . .

I may be completely off-base here but newbies who don't understand something have only themselves to blame if they never learn anything because they don't want to look like a newbie.
nscd caches DNS lookup requests, right?

So a short TTL timeout together with a slow DNS lookup (without nscd) would probably be equally slow the next time, resulting in a CosmHTTPOpen error after a certain number of retries, right? With nscd, eventually one of the DNS lookups succeeds and then the next retry finds it in cache.

Does that make any sense?


Actually, it's not nearly that complicated, if I understand things aright.

Basically, the statically linked version of the 6.29 client (the standard client) is linked against an earlier glibc. This was most likely done to avoid the Linux equivalent of 'dll hell', where an application depends on a particular version of a .dll (or .so in this case), but installing that required version causes other applications that depend on a different version to break. The usual way to avoid this problem is to simply statically link everything. This makes the final binary bigger because it includes code that would normally reside in an external library, but it ensures that all the code needed by the binary is included in the binary already.

Unfortunately, it appears (from what I gather) that glibc has some DYNAMIC dependencies, so even when you statically link the glibc stuff, the stuff glibc depends on is still dynamically loaded, so you still end up in 'dll hell'.

THAT is why CosmHTTPOpen fails: the internal libraries are dynamically loading external libraries that don't match up with the versions of the internal libraries that are being used, and everything pukes. However, the statically linked lookup code only goes down that path if there isn't an external service to query that can perform the name resolution (per hus's explanation, this is glibc's own behavior rather than something written into the client). On most Fedora systems there is no such service, as nscd is not running (nscd would normally only be used in situations where you expect to be doing LOTS of DNS lookups, e.g. you are doing reverse lookups on SMTP or HTTP connections or that sort of thing).

However, if nscd is running, CosmHTTPOpen appears to use that instead. Since that communicates using a socket, any issues involving mismatched library versions get neatly sidestepped because the libraries at issue are never called.

At least, that's my understanding of what's been discovered. I might be completely wack, though. :ewink:

What's important to me at the moment is that I'm folding again. WOOT! :D :lol: 8-) :biggrin: :e)
gizmo
 
Posts: 34
Joined: Mon Sep 21, 2009 1:35 am

Re: 6.29 has libc problems (symptom: Could not CosmHTTPOpen)

Postby tear » Mon May 31, 2010 2:02 am

You understand'em pretty much a'ight gizmo :wink:
tear
 
Posts: 857
Joined: Sun Dec 02, 2007 4:08 am
Location: Rocky Mountains

Re: 6.29 has libc problems (symptom: Could not CosmHTTPOpen)

Postby hus » Mon May 31, 2010 6:18 am

Yup. nscd can cache more than host lookups though. Fedora's default configuration includes passwd, group and services. "Name service" includes all kinds of stuff, not only "domain name service". If you're curious, have a look at /etc/nsswitch.conf.
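
As a rough illustration of how a line in that file relates to the libnss_*.so modules mentioned earlier in the thread, here is a sketch that lists the module file names glibc would look for when resolving a given database. The function name is made up, and the libnss_<service>.so.2 naming is the usual glibc convention on Linux, assumed here rather than checked against any particular system:

```shell
#!/bin/sh
# Print the NSS module file names implied by one database line in nsswitch.conf.
# Usage: nss_modules_for <database> [config file]
# Assumes "database:" is followed by whitespace, as in "hosts:      files dns".
nss_modules_for() {
    db=$1
    conf=${2:-/etc/nsswitch.conf}
    awk -v db="$db:" '
        $1 == db {
            inside = 0
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break                 # rest of the line is a comment
                if ($i ~ /^\[/) inside = 1           # start of an action such as [NOTFOUND=return]
                if (!inside) print "libnss_" $i ".so.2"
                if ($i ~ /\]$/) inside = 0           # end of the action
            }
        }
    ' "$conf"
}
```

For the entry "hosts:      files dns" quoted earlier, nss_modules_for hosts prints libnss_files.so.2 and libnss_dns.so.2. These are the modules a lookup has to load when it does the resolution itself instead of asking nscd.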

MfG, Ulrich
hus
 
Posts: 11
Joined: Mon May 10, 2010 2:28 pm

Re: 6.29 has libc problems (symptom: Could not CosmHTTPOpen)

Postby bruce » Mon May 31, 2010 7:14 am

I understand static/dynamic linking and ".dll hell" avoidance as well as caching. I was just looking for the specific relationship between caching and the specific symptom: Could not CosmHTTPOpen
bruce
 
Posts: 22249
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.
