This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PR18457] Don't require rtld lock to compute DTV addr for static TLS


On Fri, 2015-06-05 at 01:39 -0300, Alexandre Oliva wrote:
> On Jun  4, 2015, Torvald Riegel <triegel@redhat.com> wrote:
> 
> > Applied to a double-checked locking pattern, this means that all data
> > accessed outside the critical section, and is also checked and modified
> > inside the critical section, must use atomic accesses.
> 
> Is the l_tls_offset field the data you're talking about?

Yes, in this case.

(There may be other data that is initialized once inside the critical
section and then accessed outside.  Accesses to those data can use
nonatomic accesses if data like l_tls_offset makes them data
data-race-free.)

> We've already
> determined that there is a happens-before for everything else, and my
> understanding is that, for l_tls_offset alone, being the double-checked
> lock key value, and the only one that matters for the uses that isn't
> necessarily covered by the happens-before relationship we've already
> established, we have no need for atomics there.

No, that's not true.  If you use a flag like l_tls_offset to indicate
whether some intitialization or other job has been performed already,
and try to apply double-checked locking, this has to be like this
(flag==true means init done, flag initially false):

if (atomic_load_acquire(&flag) != true)
  {
     lock();
     init(&data);
     atomic_store_release(&flag, true);
     unlock();
  }
// data ready to use here
use(&data);

This is double-checked locking.  For an alternative version that
combines the lock with the flag, look at how pthread_once does it.  The
atomic accesses for flag are necessary.  If you wouldn't actually have
concurrent accesses to flag, you're not doing double-checked locking and
can avoid the lock altogether unless you need it for some other
orthogonal reason.

If you're code deviates from this, you can't simply say it's
double-checked locking, but need to comment why the deviation is
correct.

> > OK.  So we're dealing with inter-thread concurrency here.
> 
> Yes.
> 
> >> Not really.  It is a preexisting issue, yes, but an acquire load would
> >> make sure the (re)initialization of the memory into a link map,
> >> performed while holding the lock (and with an atomic write, no less),
> >> would necessarily be observed by the atomic acquire load.  A relaxed
> >> load might still observe the un(re)initialized value.  Right?
> 
> > I can't follow you here.
> 
> > One thing to note is that acquire loads synchronize with release stores
> > (or stronger) on the *same* memory location.  An acquire load does not
> > synchronize with operations on a lock, unless the acquire load peeks
> > into the lock and does an acquire load on the lock's state or such.
> 
> > Therefore, when you think about which effect an acquire load has,
> > consider which release store you are actually thinking about.  An
> > acquire operation does not have an effect on it's own, only in
> > combination with other effects in the program.  This is also why we want
> > to document which release store an acquire load is supposed to
> > synchronize with.
> 
> > Thus, which release store are you thinking about in this case?
> 
> Nothing but l_tls_offset.

I haven't see a release-MO atomic store to l_tls_offset yet.  Nor a
release fence.  Which one do you mean?

> 
> >> Now, in order for any such access to take place, some relocation applied
> >> by A must be seen by the observing thread, and if there isn't some
> >> sequencing event that ensures the dlopen (or initial load) enclosing A
> >> happens-before the use of the relocation, the whole thing is undefined;
> >> otherwise, this sequencing event ought to be enough of a memory barrier
> >> to guarantee the whole thing works.  It's just that the sequencing event
> >> is not provided by the TLS machinery itself, but rather by the user, in
> >> sequencing events after the dlopen, by the init code, in sequencing the
> >> initial loading and relocation before any application code execution, or
> >> by the thread library, sequencing any thread started by module
> >> initializers after their relocation.
> 
> > If that's the situation in the static case
> 
> The paragraph quoted above applies to both cases.

Huh?  Why would you then need the double-checked locking at all?  Unless
I misunderstand you, what you seem to be saying isn't consistent.

> >> You're missing the other cases elsewhere that set this same field.
> 
> > What do you mean?  How is it any better if you don't fix it properly in
> > the functions you have looked at and modified, just because there are
> > more problems elsewhere?
> 
> If you say atomics are only correct if all loads and stores, including
> those guarded by locks, are atomic, then adding atomics to only some of
> them makes things wrong.

Yes, that's the case, strictly speaking.  But you didn't want to tackle
a full conversion of every access to l_tls_offset anywhere --
understandably.  Therefore, I didn't request that.
But that doesn't mean you can ignore what's in the immediate
neighborhood of your change.

> >> Is a double-check lock regarded as a race?  I didn't think so.
> 
> > *Correct* double-checked locking isn't.
> 
> > "set l_tls_offset;" in the storing thread can be concurrent with the
> > first access of "l_tls_offset" (or the ones in the branches after the
> > critical section).
> 
> It appears to follow from your statement and example above that
> *correct* double-checked locking can only be attained using atomics.  Is
> that so?

See above.  Note that init(data) and use(data) can use nonatomics to
access data, but just because because of the correct double-checked
locking synchronization and atomic store-release / load-acquire accesses
on the flag.

> If not, can you give an example of correct double-checked
> locking that doesn't use them, and explain why that's different from
> what's in my revised patch?
> 
> > If you disagree with the rule,
> 
> I don't.  Maybe one of us misunderstands it, or we're otherwise failing
> to communicate, but I'm honestly trying to avoid data races.  I just
> don't know that the unguarded reads in double-checked locks qualify as
> data races.

If the double-checked locking example above nor the comments in
pthread_once don't make it clear, please get in touch and I'll try to
explain.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]