This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFC] Lock elision implementation guidelines


On Fri, Feb 15, 2013 at 12:46:31PM +0100, Torvald Riegel wrote:
> On Tue, 2013-02-12 at 19:12 -0500, Rich Felker wrote:
> > On Sun, Feb 10, 2013 at 09:20:52PM +0100, Torvald Riegel wrote:
> > > --- Compability of lock()/unlock() with pthreads mutex types
> > > 
> > > This scheme allows a thread to acquire a lock that it has already
> > > acquired, and there is no error checking when releasing a lock.  Thus,
> > > for which pthreads mutex types is it a suitable implementation?:
> > > 
> > > PTHREAD_MUTEX_DEFAULT: Yes.  Recursive locking or releasing an unlocked
> > > mutex result in undefined behavior, so the implementation is free to do
> > > anything in such cases.
> > > 
> > > PTHREAD_MUTEX_NORMAL: Strictly speaking, no.  Allows undefined behavior
> > > on releasing an unlocked mutex, but requires deadlock when trying to
> > > lock an acquired lock.  Deadlock is not the same as undefined behavior:
> > > even though there's no progress on deadlock, nothing bad will happen
> > > either.
> > > ??? PTHREAD_MUTEX_NORMAL is the same type as PTHREAD_MUTEX_DEFAULT in
> > > current glibc.  This would mean that we'd have to assume the program
> > > wanted NORMAL, so we couldn't use elision for the default mutex type.
> > > However, it's probably rare for correct programs to rely on the deadlock
> > > guarantee in a meaningful way -- should we ignore the deadlock
> > > requirement?
> > 
> > No. You cannot ignore any requirements. That's why they're called
> > requirements and not recommendations.
> 
> Usually, I wouldn't reconsider any requirements.  But this case is
> special enough to warrant thinking about this.

Everybody's pet feature is "special enough" to warrant breaking the
requirements -- to them. The problem is that this doesn't scale. You
end up breaking all the requirements because everybody wants to break
a different one.

> First, the real-world use cases that are enabled by guaranteeing
> deadlock instead of undefined behavior are limited:
> - The mutexes aren't required to be cancellation points.  Only rwlock
> might contain cancellation points.  Thus, AFAICT, there is no way to
> recover from a deadlock using POSIX facilities.
> - One can't just kill threads in other ways and still expect the program
> to continue to work.
> - To find out whether there was a deadlock, the program would have to
> explicitly track which mutexes were acquired and are about to be
> acquired.  POSIX doesn't let you query whether a mutex is part of a
> deadlock.  If the program is tracking this anyway, why would it not use
> lock acquisitions with timeouts to prevent any deadlocks right away?
> - If we can't recover from a deadlock, then we can't make progress after
> a deadlock.  Making no progress is a fault in the program.  Thus, the
> deadlock requirement doesn't enable us to fix a faulty program.

Actually it's not. There's no reason to consider a deadlock to be a
fault in a program. I'll admit it's usually a bug for a thread not to
exit, but it's a bug with well-defined behavior; the only thing
harmful it does is waste resources. If the number of threads that
deadlock in a program is bounded, it's not even a resource "leak".
This is a very different class of "bug" from what would happen if the
thread wrongfully continued execution when it should have deadlocked!
The latter would almost certainly lead to undefined behavior, with
likely results including data corruption or security compromise.

With that said, I can think of at least several scenarios where the
deadlock behavior could be used constructively:

1. For a permanent thread that does nothing but handle signals using
async-signal-safe functions, pthread_mutex_lock(&m);
pthread_mutex_lock(&m); is just as valid as for(;;)pause();

2. Certain operations that initially construct data structures on a
temporary thread's stack and then move them to allocated memory may
need to preserve the original structures on the thread's stack in the
even of memory exhaustion while the rest of the program shuts down
safely. Self-deadlock would be one obvious way to make a thread
preserve the lifetime of all its presantly extant automatic objects
until program exit.

Regardless of whether you consider these good or bad programming
practice, they are legal practice in a conforming POSIX program and
thus glibc has no business breaking such requirements.

> > I would track with a single TLS pointer to the mutex locked. If more
> > than one mutex is being held at the same time, this is almost surely
> > an indication that elision should not be used.
> 
> No, I disagree, this isn't such an indication.  Why would it be?

I'm going on the principle that elision makes sense for "light"
locking situations, but I agree with your analysis that some
situations (like hand over hand) that involve multiple locks would
benefit from elision.

> > > 2) trylock() on non-recursive mutexes aborts transactions/elision
> > > - Con: bad for custom spinlocks in programs built on top of trylock()
> > 
> > This is an idiotic idiom and not worth making design decisions based
> > on it.
> 
> Sorry, but this statement is unclear.  Do you dislike trylock()-based
> spinlocks or spin-then-block locks?

Basically, yes.

> Those can be useful, especially
> given that our current adaptive mutexes aren't really performing that
> well.

Then this is an argument that the adaptive spinning should be fixed
rather than having applications roll their own spin hacks for
performance.

More later..

Rich


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]