This is the mail archive of the
libc-help@sourceware.org
mailing list for the glibc project.
RE: Help, any one ever meet hanging on _IO_lock_lock(list_all_lock) issue ?
- From: Wuqixuan <wuqixuan at huawei dot com>
- To: "Carlos O'Donell" <carlos at systemhalted dot org>
- Cc: "libc-help at sourceware dot org" <libc-help at sourceware dot org>, "schwab at redhat dot com" <schwab at redhat dot com>
- Date: Wed, 13 Nov 2013 06:29:29 +0000
- Subject: RE: Help, any one ever meet hanging on _IO_lock_lock(list_all_lock) issue ?
- Authentication-results: sourceware.org; auth=none
- References: <BB7C62C2B0732E4DA93834A501E846456C8D8003 at szxema505-mbx dot china dot huawei dot com>,<CAE2sS1ishHhT+LEqHkcadXyP4wBeWJFGRMroLmVQGrMEBMD9tg at mail dot gmail dot com>,<BB7C62C2B0732E4DA93834A501E846456C8D8023 at szxema505-mbx dot china dot huawei dot com>
> That is odd. However it could be the result of an unbalanced set of
> locks and unlocks. That could result in the problem you're seeing.
> The IO lock can be taken recursively incrementing cnt, and
> decrementing cnt on unlock.
> Once it decrements to 0 the lock is unlocked.
> If something corrupted the cnt value then it will not unlock.
> e.g.
> #define _IO_lock_unlock(_name) \
>   do { \
>     if (--(_name).cnt == 0) \
>       { \
>         (_name).owner = NULL; \
>         lll_unlock ((_name).lock, LLL_PRIVATE); \
>       } \
>   } while (0)
> See the `cnt == 0' won't be true and it won't unlock or clear the
> owner, and this thread will continue to do something else.
> The lock will be leaked at that point.
> Is it alive? Dead? Backtrace?
The issue happened on my side only once and could not be reproduced, and the environment is no longer available.
Yes, if the cnt value is corrupted, nobody can use this lock anymore. But do you know how the cnt value could have been corrupted in our case, and how to reproduce it? I guess some other bug in glibc 2.4 causes the unbalanced set of locks and unlocks. Do you know what it is?
We found http://sourceware.org/git/?p=glibc.git;a=commit;h=7583a88d1c7170caad26966bcea8bfc2c92093ba, a fix by schwab.
The patch seems to say that flush_cleanup has a bug that can corrupt cnt. Do you know what the exact issue was that the patch fixed?
Thanks a lot, and regards.
Wuqixuan.