Core hangs: jbd2_log_wait_commit

Re: Core hangs: jbd2_log_wait_commit

Postby tear » Sun Nov 29, 2009 5:29 pm

I think they would not. Some core maintainers work pretty closely with the community. Others hardly ever do. The SMP core maintainer belongs to the latter category.
The client is supposedly maintained by someone else, but I haven't heard from him either.

Also, let me rephrase -- we'd need to know what the programmers intended to do. It's very likely things
could be improved not by bumping the write() size (little improvement) but by revamping the whole approach (big improvement).

For instance, to clear the file (so it's not accidentally reused) one could do the following:
1) Rename the file (rename is an atomic operation within a single filesystem)
2) Write 4k (or just enough to make contents unparseable) of zeroes
3) Call fsync()
4) Unlink (delete)
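
The four steps above can be sketched roughly as follows. This is only an illustration in Python, not what the core actually does; the function names and the ".retired" suffix are made up for the example, and 4096 bytes is simply the "4k" from step 2.

```python
import os


def wipe_head(path, wipe_bytes=4096):
    """Overwrite the first wipe_bytes of path with zeroes and fsync it."""
    fd = os.open(path, os.O_WRONLY)      # open without truncating
    try:
        buf = b"\0" * wipe_bytes
        while buf:                       # os.write() may write less than asked
            buf = buf[os.write(fd, buf):]
        os.fsync(fd)                     # 3) flush the zeroes to stable storage
    finally:
        os.close(fd)


def clear_and_remove(path, wipe_bytes=4096):
    """Retire a file so a stale copy cannot be picked up by accident."""
    retired = path + ".retired"          # hypothetical name for the renamed file
    os.rename(path, retired)             # 1) atomic within a single filesystem
    wipe_head(retired, wipe_bytes)       # 2) + 3) zero the head, then fsync
    os.unlink(retired)                   # 4) delete
```

The point of the ordering is that only a small write has to be flushed before the unlink, instead of rewriting the whole file.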


tear

Re: Core hangs: jbd2_log_wait_commit

Postby tear » Sun Nov 29, 2009 6:47 pm

"Cross-posting" from viewtopic.php?f=55&t=12248&p=120089#p120078

Can anyone with ext4 (or affected ext3):
a) check the status of barrier usage: run "mount" (as root) and see if the filesystem
 of interest displays barrier status (barrier=0|1)
b) try turning barriers off and see if it makes any difference?

ad b)
It can be done at run time and requires a remount of the filesystem, e.g.
for ext4 something along the following lines (as root) should work:
Code:
mount / -o barrier=0,remount

If necessary, replace / with the filesystem that stores the client (e.g. /home).

This change is *not* permanent, so a reboot will void it. Also, running
without barriers may bite you pretty hard in case of a hard reset or power outage
-- you've been warned.
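
For step a), here is a small sketch of pulling the barrier setting out of mount-table text in the /proc/mounts layout (device, mount point, type, options). The function name is made up, and it assumes the option shows up as barrier=N, barrier or nobarrier. If nothing is listed it returns None, which most likely just means the filesystem default is in effect.

```python
def barrier_status(mounts_text, mount_point):
    """Return the barrier-related mount option for mount_point, or None."""
    status = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected layout: device, mount point, fs type, options, ...
        if len(fields) < 4 or fields[1] != mount_point:
            continue
        status = None                    # a later mount on the same point wins
        for opt in fields[3].split(","):
            if opt in ("barrier", "nobarrier") or opt.startswith("barrier="):
                status = opt
    return status
```

On a live system one would feed it the contents of /proc/mounts, e.g. barrier_status(open("/proc/mounts").read(), "/home").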

Disabling barriers does the trick here (with XFS) -- WU write/clear times are back to ext3's values.

ext3 does not degrade because it has I/O barriers disabled by default (also see http://en.wikipedia.org/wiki/Ext3#No_ch ... in_journal).
XFS and ext4 have had barriers enabled for some time now.

Just so we're clear -- I do not recommend disabling barriers.


tear

Re: Core hangs: jbd2_log_wait_commit

Postby toTOW » Mon Nov 30, 2009 3:17 pm

Don't worry, kasson is aware of this problem. The last time I talked to him about it and asked for a fix (a few weeks ago), he said "not in this release" ... but he knows about it, and he might fix it after some more important work that's currently in the pipeline.

Re: Core hangs: jbd2_log_wait_commit

Postby tear » Mon Nov 30, 2009 4:55 pm

Dear toTOW,


I'm not worried. He wanted the chances and chances he got.

I know you mean well (in contrast to some other folks here) but one
sentence heard through the grapevine isn't good enough.

Every person I encounter gets initial credit. Peter has lost
that credit rather rapidly. I work with professionals on a daily
basis and, by these standards, Peter is very far from being
a professional (and I'm not talking only about this forum).

That's why there's langouste, there's running the client off tmpfs,
there's a workaround for dual-cores and a couple of other minor bits.


tear

Re: Core hangs: jbd2_log_wait_commit

Postby toTOW » Mon Nov 30, 2009 11:38 pm

Well, I'm sorry to disappoint you, but he's not a professional programmer ... read his introduction again : http://folding.typepad.com/news/2007/09 ... eam-1.html

I guess if you read his publications, you won't say again that he's not a professional ... :roll:

Re: Core hangs: jbd2_log_wait_commit

Postby tear » Tue Dec 01, 2009 12:19 am

That's not really my expectation (as we all have limitations). Perhaps my paragraph suggested so...

What I'm talking about is professional attitude, not skill set.

Geez, we've done it so many times...

Really, it doesn't take a lot to say (and do):
-- "I don't know but I'll look into that and I'll get back to you" or "I don't know but I'll have somebody look at it and get back to you"
-- "Tell me more about this issue as we haven't observed it in our environment, how can I reproduce it?"
-- "Yes, I can see that too; what is it you suggest we do?"
-- "To check whether your approach is correct, you'd need to do so and so"
-- "We're currently working on so and so but we'll resolve this in the next release scheduled for (...)"

Being open minded. Communicative. Respecting other people's time. That's professionalism. Not publications.


I can see a contractor has been hired to do some of the development. That's a step in a very good direction, I really appreciate it.
Let's see how it works out. I hope his enthusiasm doesn't get put down.


tear

Re: Core hangs: jbd2_log_wait_commit

Postby 7im » Tue Dec 01, 2009 5:20 am

No offense, tear, but if Pande Group responded to every question, they'd never get any science done.

And while your programming contributions and ideas are a level above most, it also seems like you do some of it in spite of Pande Group. Collaborative attitudes are a two-way street.

Re: Core hangs: jbd2_log_wait_commit

Postby tear » Tue Dec 01, 2009 8:53 am

See, I never said (or implied) PG should do all of the support; the community's been doing a great job there.
I'm talking about diagnosed and/or well-understood issues, issues whose resolution goes beyond the (constantly improving) community's capabilities.

Don't you think there's something not entirely right when the community gets to find, triage and resolve issues (with various kludges) all by itself?

I don't quite understand what you're implying in the last paragraph. Are you suggesting my contributions are unfriendly?


tear

Re: Core hangs: jbd2_log_wait_commit

Postby toTOW » Tue Dec 01, 2009 11:14 am

Do you know that it's not forbidden to write him a PM with your suggestions and/or contributions?

Re: Core hangs: jbd2_log_wait_commit

Postby 7im » Tue Dec 01, 2009 8:33 pm

tear wrote: See, I never said (or implied) PG should do all of the support; the community's been doing a great job there.
I'm talking about diagnosed and/or well-understood issues, issues whose resolution goes beyond the (constantly improving) community's capabilities.

Don't you think there's something not entirely right when the community gets to find, triage and resolve issues (with various kludges) all by itself?

I don't quite understand what you're implying in the last paragraph. Are you suggesting my contributions are unfriendly?


tear


Pande Group's priorities are somewhat different from the community's. And what you consider a big issue may not be as high on the priority list for them. So when the community offers to find and offer fixes for various things where the software might fall short, the community, the science, and Stanford all benefit. For example, we all benefit from programs like the ones Dick Howell made (QD, QFix, etc.) and like the ones Uncle_Fungus is making (FahMon, and various GUI client improvements). Uncle_Fungus started out much as you did, first as a project participant, then software developer, then Forum Admin, and now a Pande Group member/developer. So no, I don't see a problem where we find and report issues, or even fix them. That is one of the main purposes for this support forum... reporting feedback. We even have a beta team in this forum that does just that sort of thing.

Yes, there is a lot of room for improvement, in the clients, in communications, documentation, etc. I'm not saying that nothing is wrong, but I am saying that what you appear to be calling a serious problem is not such a big problem, and is also an opportunity.

And no, unfriendly would not be what I would call it. I'm trying to tread lightly here, and describe it without offending or insulting you personally. But some of the work you do comes very close (sometimes too close, IMO) to the line where your tools could potentially impact the science in a negative and unknown way, in spite of cautions or recommendations otherwise. For example, how well do you know if adding additional SMP settings or commands might impact the client, other than increasing performance? And how do you know that for sure?

Re: Core hangs: jbd2_log_wait_commit

Postby tear » Wed Dec 02, 2009 12:59 am

toTOW wrote: Do you know that it's not forbidden to write him a PM with your suggestions and/or contributions?

You've got mail, my friend :-)

Re: Core hangs: jbd2_log_wait_commit

Postby tear » Wed Dec 02, 2009 1:09 am

7im wrote:
Pande Group's priorities are somewhat different from the community's. And what you consider a big issue may not be as high on the priority list for them. So when the community offers to find and offer fixes for various things where the software might fall short, the community, the science, and Stanford all benefit. For example, we all benefit from programs like the ones Dick Howell made (QD, QFix, etc.) and like the ones Uncle_Fungus is making (FahMon, and various GUI client improvements). Uncle_Fungus started out much as you did, first as a project participant, then software developer, then Forum Admin, and now a Pande Group member/developer. So no, I don't see a problem where we find and report issues, or even fix them. That is one of the main purposes for this support forum... reporting feedback. We even have a beta team in this forum that does just that sort of thing.

Yes, there is a lot of room for improvement, in the clients, in communications, documentation, etc. I'm not saying that nothing is wrong, but I am saying that what you appear to be calling a serious problem is not such a big problem, and is also an opportunity.

Too tired to have this discussion with you now (I don't think it's worth my time anyway). Let's stick to technical things.

7im wrote: And no, unfriendly would not be what I would call it. I'm trying to tread lightly here, and describe it without offending or insulting you personally. But some of the work you do comes very close (sometimes too close, IMO) to the line where your tools could potentially impact the science in a negative and unknown way (...)

And you're drawing that conclusion based on your understanding of those issues and resolutions? :D "If I don't know what he's doing there's no way in the world he does." ?

7im wrote: For example, how well do you know if adding additional SMP settings or commands might impact the client, other than increasing performance? And how do you know that for sure?

With the exception of the PME/PP thingie, I am capable of proving every single improvement I came up with. Are you capable of disproving them with something other than your usual FUD?


tear

Re: Core hangs: jbd2_log_wait_commit

Postby 7im » Wed Dec 02, 2009 1:19 am

Yep, based on my very limited understanding of the project and software, and my conclusions. So that should make it easier for you then... ;)

One example proof will suffice, when convenient for you. For simplicity, the one I referenced, if you don't mind. Thanks.

Re: Core hangs: jbd2_log_wait_commit

Postby tear » Wed Dec 02, 2009 1:24 am

Which one specifically is it? Print between the lines was too fine for me, sorry.

tear

Re: Core hangs: jbd2_log_wait_commit

Postby 7im » Wed Dec 02, 2009 7:44 am

The easy one... MPICH_NO_LOCAL=1 environment variable
