This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.



Re: bug in spinlock.c?


On Fri, 21 Feb 2003, Andreas Jaeger wrote:

> Date: Fri, 21 Feb 2003 13:54:13 +0100
> From: Andreas Jaeger <aj at suse dot de>
> To: libc-alpha at sources dot redhat dot com
> Cc: Karsten Keil <kkeil at suse dot de>
> Subject: [libc-alpha] bug in spinlock.c?
> 
> 
> Looking at the ex18 hang (sometimes ex18 even segfaulted) on x86-64,
> Karsten noticed that we allocate a struct wait_node in
> __pthread_alt_lock on the stack - and put it somehow also on the list
> of waiting nodes.
> 
> In __pthread_alt_unlock we go through the waiting nodes and dequeue it.
> 
> This looks broken, since we allocate something on the stack of a
> function and leave the function with this data hanging around.
> 
> Can somebody confirm this?  Or do you have other ideas that would
> explain the segfaults we noticed?  gdb pointed to this code,

Let me respond to this because I designed this little mousetrap. 

I introduced wait nodes specifically to handle timeouts. The problem
with timed-out locks is that a thread which wakes up spontaneously on
timeout has no easy way to remove its node from the middle of the list,
and so it must just abandon the node there to be ``garbage collected''
later. But you can't do that with the thread descriptor itself! Solution:
use a dynamically allocated node which points to the thread.
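
To make the abandon-and-collect idea concrete, here is a minimal sketch.
It is not the code in spinlock.c: the structure layout and the names
enqueue_timed, abandon and dequeue_waiter are made up for illustration,
an integer stands in for the pointer to the thread descriptor, and the
sketch is single-threaded, so it leaves out the atomic operations that
real code needs when a timed-out waiter races with the unlocker.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified wait node; not the real glibc definition. */
struct wait_node {
    struct wait_node *next;
    int thread_id;   /* stands in for a pointer to the thread descriptor */
    int abandoned;   /* set by a waiter whose timeout expired */
};

/* Push a heap-allocated node for a timed wait.
   Returns the node, or NULL if allocation fails. */
static struct wait_node *enqueue_timed(struct wait_node **head, int thread_id)
{
    struct wait_node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->thread_id = thread_id;
    n->abandoned = 0;
    n->next = *head;
    *head = n;
    return n;
}

/* Timed-out waiter: it cannot unlink itself from the middle of the
   list, so it only marks the node and walks away. */
static void abandon(struct wait_node *n)
{
    assert(n != NULL);
    n->abandoned = 1;
}

/* Unlocker: pop nodes from the head, discarding abandoned ones, until a
   live waiter is found. Returns its thread_id, or -1 if none is left.
   In this single-threaded model the unlocker frees every node it pops. */
static int dequeue_waiter(struct wait_node **head)
{
    while (*head != NULL) {
        struct wait_node *n = *head;
        int id = n->thread_id;
        int dead = n->abandoned;

        *head = n->next;
        free(n);
        if (!dead)
            return id;
    }
    return -1;
}
```

The point of the sketch is only that an abandoned node stays valid
memory until the unlocker reaches it, which would not be true if the
queue linked thread descriptors directly.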

For waits that wake up normally, these nodes can be stack allocated,
so there isn't a memory allocation penalty for code that doesn't call
the timed-out operation.

This is okay, because while the thread suspends, its stack remains
stable. And since it's a non-timeout wait, the thread does not 
wake up spontaneously. The lock owner chooses it, removes it from the
queue, and then wakes it up. So the stack-allocated node is no longer
in the queue by the time the function returns.
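
The ordering can be sketched as follows. This is an illustration under
assumptions, not the spinlock.c code: the names wait_untimed, wake_one
and run_demo are hypothetical, and a POSIX mutex plus condition variable
stand in for the real suspend and wakeup mechanism. What it tries to
show is that the waker unlinks the node before marking it woken, so the
waiter's stack frame is never on the queue once the function returns.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical simplified node; the real structure differs. */
struct wait_node {
    struct wait_node *next;
    int woken;
};

static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t queue_cond = PTHREAD_COND_INITIALIZER;
static struct wait_node *queue_head = NULL;

/* Waiter: the node lives in this stack frame for the whole wait. */
static void wait_untimed(void)
{
    struct wait_node node = { NULL, 0 };

    pthread_mutex_lock(&queue_lock);
    node.next = queue_head;
    queue_head = &node;
    pthread_cond_broadcast(&queue_cond);   /* a node is now queued */

    /* No timeout: leave only when a waker has chosen this node. */
    while (!node.woken)
        pthread_cond_wait(&queue_cond, &queue_lock);

    /* Partial sanity check: the waker unlinked the node already. */
    assert(queue_head != &node);
    pthread_mutex_unlock(&queue_lock);
}

/* Waker: wait for a queued node, unlink it, and only then wake it. */
static void wake_one(void)
{
    struct wait_node *n;

    pthread_mutex_lock(&queue_lock);
    while (queue_head == NULL)
        pthread_cond_wait(&queue_cond, &queue_lock);
    n = queue_head;
    queue_head = n->next;                  /* remove from the queue */
    n->next = NULL;
    n->woken = 1;                          /* then release the waiter */
    pthread_cond_broadcast(&queue_cond);
    pthread_mutex_unlock(&queue_lock);
}

static void *waiter_thread(void *arg)
{
    (void)arg;
    wait_untimed();
    return NULL;
}

/* Runs one waiter and one wake-up. Returns 1 if the queue is empty
   after the waiter has returned, -1 if the thread cannot be created. */
static int run_demo(void)
{
    pthread_t t;

    if (pthread_create(&t, NULL, waiter_thread, NULL) != 0)
        return -1;
    wake_one();
    pthread_join(t, NULL);
    return queue_head == NULL;
}
```

Because the woken flag is only set after the unlink, and both happen
under the same lock, there is no window in which the waiter can return
while its node is still reachable from the queue.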

If this code does cause a segfault, it's an implementation problem,
not a design problem.


