This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [RFC] Lock elision implementation guidelines


On Fri, 2013-02-15 at 18:39 -0500, Rich Felker wrote:
> On Fri, Feb 15, 2013 at 12:46:31PM +0100, Torvald Riegel wrote:
> > On Tue, 2013-02-12 at 19:12 -0500, Rich Felker wrote:
> > > On Sun, Feb 10, 2013 at 09:20:52PM +0100, Torvald Riegel wrote:
> > > > --- Compatibility of lock()/unlock() with pthreads mutex types
> > > > 
> > > > This scheme allows a thread to acquire a lock that it has already
> > > > acquired, and there is no error checking when releasing a lock.  Thus,
> > > > for which pthreads mutex types is it a suitable implementation?:
> > > > 
> > > > PTHREAD_MUTEX_DEFAULT: Yes.  Recursive locking or releasing an unlocked
> > > > mutex results in undefined behavior, so the implementation is free to do
> > > > anything in such cases.
> > > > 
> > > > PTHREAD_MUTEX_NORMAL: Strictly speaking, no.  Allows undefined behavior
> > > > on releasing an unlocked mutex, but requires deadlock when trying to
> > > > lock an acquired lock.  Deadlock is not the same as undefined behavior:
> > > > even though there's no progress on deadlock, nothing bad will happen
> > > > either.
> > > > ??? PTHREAD_MUTEX_NORMAL is the same type as PTHREAD_MUTEX_DEFAULT in
> > > > current glibc.  This would mean that we'd have to assume the program
> > > > wanted NORMAL, so we couldn't use elision for the default mutex type.
> > > > However, it's probably rare for correct programs to rely on the deadlock
> > > > guarantee in a meaningful way -- should we ignore the deadlock
> > > > requirement?
> > > 
> > > No. You cannot ignore any requirements. That's why they're called
> > > requirements and not recommendations.
> > 
> > Usually, I wouldn't reconsider any requirements.  But this case is
> > special enough to warrant thinking about this.
> 
> Everybody's pet feature is "special enough" to warrant breaking the
> requirements -- to them. The problem is that this doesn't scale. You
> end up breaking all the requirements because everybody wants to break
> a different one.

Just a quick comment because this seems to be off-topic: the
requirements themselves don't come out of thin air either; some
features are considered more important than others when the
requirements are decided.  So there is a way to scale such decisions,
and lock elision has enough potential to be quite a bit more than a pet
feature.

> > First, the real-world use cases that are enabled by guaranteeing
> > deadlock instead of undefined behavior are limited:
> > - The mutexes aren't required to be cancellation points.  Only rwlock
> > might contain cancellation points.  Thus, AFAICT, there is no way to
> > recover from a deadlock using POSIX facilities.
> > - One can't just kill threads in other ways and still expect the program
> > to continue to work.
> > - To find out whether there was a deadlock, the program would have to
> > explicitly track which mutexes were acquired and are about to be
> > acquired.  POSIX doesn't let you query whether a mutex is part of a
> > deadlock.  If the program is tracking this anyway, why would it not use
> > lock acquisitions with timeouts to prevent any deadlocks right away?
> > - If we can't recover from a deadlock, then we can't make progress after
> > a deadlock.  Making no progress is a fault in the program.  Thus, the
> > deadlock requirement doesn't enable us to fix a faulty program.
> 
> Actually it's not. There's no reason to consider a deadlock to be a
> fault in a program. I'll admit it's usually a bug for a thread not to
> exit, but it's a bug with well-defined behavior; the only thing
> harmful it does is waste resources.

I'm aware -- and agree -- that a lack of liveness is not the same as
violating a safety guarantee.  That's why I said that the benefits are
limited.

> If the number of threads that
> deadlock in a program is bounded, it's not even a resource "leak".
> This is a very different class of "bug" from what would happen if the
> thread wrongfully continued execution when it should have deadlocked!
> The latter would almost certainly lead to undefined behavior, with
> likely results including data corruption or security compromise.
> 
> With that said, I can think of at least several scenarios where the
> deadlock behavior could be used constructively:
> 
> 1. For a permanent thread that does nothing but handle signals using
> async-signal-safe functions, pthread_mutex_lock(&m);
> pthread_mutex_lock(&m); is just as valid as for(;;)pause();

That's a good counter example -- I hadn't thought about signal handling.

However, is a deadlocked thread actually *required* by POSIX to make
progress handling signals?

The same question also applies to program exit: Is a program with a
deadlock required to exit successfully, or can it deadlock?  I'm
wondering about this not because I would think that implementations
should try to deadlock in this case; instead, I'm thinking about whether
POSIX actually gives sufficient forward progress guarantees in such
cases to allow the program to depend on them in the scenarios you
mentioned.

> 2. Certain operations that initially construct data structures on a
> temporary thread's stack and then move them to allocated memory may
> need to preserve the original structures on the thread's stack in the
> event of memory exhaustion while the rest of the program shuts down
> safely. Self-deadlock would be one obvious way to make a thread
> preserve the lifetime of all its presently extant automatic objects
> until program exit.
> 
> Regardless of whether you consider these good or bad programming
> practice, they are legal practice in a conforming POSIX program and
> thus glibc has no business breaking such requirements.

As I said previously, for the PNT use case I agree, and your examples
are a further reason to preserve the deadlock requirement also in the PT
case. 

Nonetheless, for the E use case I think it could still be useful (i.e.,
allowing users to override the deadlock requirement to get more
elision), as it would allow experimentation with programs that can't be
recompiled.  But I'm not sure whether providing this is really
necessary.

Andi: do you see any reasons why this would be really helpful for
experimentation (i.e., helpful enough to justify carrying this "hack"
around in glibc)?

Rich: Thanks for the counter examples.  This is useful material for the
wiki page (or whatever we use in the end) that will eventually describe
the lock elision implementation in glibc.

> > > I would track with a single TLS pointer to the mutex locked. If more
> > > than one mutex is being held at the same time, this is almost surely
> > > an indication that elision should not be used.
> > 
> > No, I disagree, this isn't such an indication.  Why would it be?
> 
> I'm going on the principle that elision makes sense for "light"
> locking situations, but I agree with your analysis that some
> situations (like hand over hand) that involve multiple locks would
> benefit from elision.
> 
> > > > 2) trylock() on non-recursive mutexes aborts transactions/elision
> > > > - Con: bad for custom spinlocks in programs built on top of trylock()
> > > 
> > > This is an idiotic idiom and not worth making design decisions based
> > > on it.
> > 
> > Sorry, but this statement is unclear.  Do you dislike trylock()-based
> > spinlocks or spin-then-block locks?
> 
> Basically, yes.

Well, I still can't know what you mean: I ask about A or B, and you
answer "yes" :)  Do you mean both, or the former, or the latter?

> > Those can be useful, especially
> > given that our current adaptive mutexes aren't really performing that
> > well.
> 
> Then this is an argument that the adaptive spinning should be fixed
> rather than having applications roll their own spin hacks for
> performance.

Yes, this motivates improving adaptive spinning, but it does not follow
that doing so removes the need for custom spinning.  For example, in
the adaptive mutexes we have to find a middle ground: something that
works well in most cases and doesn't lead to performance pathologies.
We can't reliably detect what the program will do in its critical
sections, so adaptive mutexes will always have to make a guess.

The program does know what its critical sections will do, so it can be
better at deciding between spinning or blocking.  For example, consider
one static critical section being used for both short and long
operations, controlled by some flag or whatever: The program knows about
the flag, but the adaptive mutex doesn't and thus will have to guess
whether the concurrent critical sections will likely be short or not.

So, while I agree that we want to avoid making people write their own
lock implementations, I don't see anything wrong with them actually
doing that in cases where it's beneficial for them and they want to do
better than the one-size-fits-all automatic tuning.


Torvald


