This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFC] Propose fix for race conditions in pthread cancellation (bz#12683)


On Fri, Sep 12, 2014 at 06:11:30PM +0200, Torvald Riegel wrote:
> On Fri, 2014-09-12 at 11:32 -0400, Rich Felker wrote:
> > On Fri, Sep 12, 2014 at 04:44:26PM +0200, Torvald Riegel wrote:
> > > On Wed, 2014-09-10 at 18:47 -0300, Adhemerval Zanella wrote:
> > > "8. Adjust 'lowlevellock.h' arch-specific implementations to provide
> > > cancelable futex calls (used in libpthread code)."
> > > 
> > > Only FUTEX_WAIT should be cancellable, I suppose.
> > > 
> > > I've been thinking about whether it would be easier for the
> > > implementation of cancellable pthreads functions to just call a
> > > noncancellable FUTEX_WAIT and handle EINTR properly (by not retrying the
> > > wait but calling pthread_testcancel instead so that any pending
> > > cancellation is acted upon).  However, this seems to save less
> > > complexity than just doing for FUTEX_WAIT what we do for all the other
> > > cancellable syscalls, so it's probably not worth it.
> > 
> > This is not viable; it has race conditions, of the exact type which
> > the futex syscall and the new cancellation system are designed to
> > avoid.
> 
> The new scheme would be used, but would always set the cancellation as
> pending.  So what I thought about would primarily change the way we act
> on cancellation, not when it's considered to have happened.  There's no
> real side effect in waiting, so futex_wait is a special case.

The problem is that you end up waiting unboundedly long, possibly
forever. Consider cancellation of a pthread_cond_wait that will never
be signaled, where the cancellation request arrives in the race
window.

> > > I'd add another thing to the list of steps, though: For future-proofing,
> > > it would be good pthread_setcancelstate and pthread_setcanceltype would
> > > have an explicit compiler barrier in them.  If at some point we have LTO
> > > including those functions, we use C11 atomics, and compilers start
> > > optimizing those more aggressively, then I believe we're at risk that
> > > the compiler reorders code across the atomics in pthread_setcancelstate;
> > > this could lead to cancellation handlers seeing unexpected state when
> > > they are actually run.  C11 has tighter rules for what's allowed in
> > > signal handlers (IIRC, these are still under discussion, at least in the
> > > C++ committee) than what POSIX requires, yet the cancellation handler is
> > > similar to a signal handler.  A compiler barrier should hopefully
> > > prevent any issues due to that mismatch.  Thoughts?
> > 
> > Since you're only concerned about signal handlers, not access from
> > other threads, volatile is probably sufficient here.
> 
> Yes, we don't need HW synchronization between the interrupted code and
> the cancellation handler.  Requiring programmers to use volatile
> whenever they need to communicate from interrupted code to cancellation
> handlers (on both sides!) could work in practice, but only if the
> pthread_setcancelstate etc. look to the compiler like they could call
> code that uses volatiles -- which isn't the case for these functions
> (assuming the compiler understands and optimizes atomics).

Oh, I see -- you're concerned about reordering with respect to code
in the calling application. In that case I think you're right -- you
need a full compiler barrier of some sort.

Rich


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]