This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFC] Propose fix for race conditions in pthread cancellation (bz#12683)


On Fri, 2014-09-12 at 11:32 -0400, Rich Felker wrote:
> On Fri, Sep 12, 2014 at 04:44:26PM +0200, Torvald Riegel wrote:
> > On Wed, 2014-09-10 at 18:47 -0300, Adhemerval Zanella wrote:
> > "8. Adjust 'lowlevellock.h' arch-specific implementations to provide
> > cancelable futex calls (used in libpthread code)."
> > 
> > Only FUTEX_WAIT should be cancellable, I suppose.
> > 
> > I've been thinking about whether it would be easier for the
> > implementation of cancellable pthreads functions to just call a
> > noncancellable FUTEX_WAIT and handle EINTR properly (by not retrying the
> > wait but calling pthread_testcancel instead so that any pending
> > cancellation is acted upon).  However, this seems to save less
> > complexity than just doing for FUTEX_WAIT what we do for all the other
> > cancellable syscalls, so it's probably not worth it.
> 
> This is not viable; it has race conditions, of the exact type which
> the futex syscall and the new cancellation system are designed to
> avoid.

The new scheme would be used, but would always set the cancellation as
pending.  So what I thought about would primarily change the way we act
on cancellation, not when it's considered to have happened.  There's no
real side effect in waiting, so futex_wait is a special case.

> AFAIK it also only works if the cancellation signal is an interrupting
> signal (no SA_RESTART) which is a big problem -- it would introduce
> illegal EINTR results into programs not expecting to receive them and
> not prepared to handle them.

That sounds right.

> > I'd add another thing to the list of steps, though: For future-proofing,
> > it would be good pthread_setcancelstate and pthread_setcanceltype would
> > have an explicit compiler barrier in them.  If at some point we have LTO
> > including those functions, we use C11 atomics, and compilers start
> > optimizing those more aggressively, then I believe we're at risk that
> > the compiler reorders code across the atomics in pthread_setcancelstate;
> > this could lead to cancellation handlers seeing unexpected state when
> > they are actually run.  C11 has tighter rules for what's allowed in
> > signal handlers (IIRC, these are still under discussion, at least in the
> > C++ committee) than what POSIX requires, yet the cancellation handler is
> > similar to a signal handler.  A compiler barrier should hopefully
> > prevent any issues due to that mismatch.  Thoughts?
> 
> Since you're only concerned about signal handlers, not access from
> other threads, volatile is probably sufficient here.

Yes, we don't need HW synchronization between the interrupted code and
the cancellation handler.  Requiring programmers to use volatile
whenever they need to communicate from interrupted code to cancellation
handlers (on both sides!) could work in practice, but only if the
pthread_setcancelstate etc. look to the compiler like they could call
code that uses volatiles -- which isn't the case for these functions
(assuming the compiler understands and optimizes atomics).


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]