This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PR18457] Don't require rtld lock to compute DTV addr for static TLS
- From: Alexandre Oliva <aoliva at redhat dot com>
- To: Torvald Riegel <triegel at redhat dot com>
- Cc: Andreas Schwab <schwab at linux-m68k dot org>, libc-alpha at sourceware dot org
- Date: Fri, 05 Jun 2015 01:39:15 -0300
- References: <orvbf5ffyt dot fsf at livre dot home> <1433326788 dot 21461 dot 81 dot camel at triegel dot csb> <ora8whexn9 dot fsf at livre dot home> <1433344426 dot 21461 dot 202 dot camel at triegel dot csb> <orwpzkee1h dot fsf at livre dot home> <1433419889 dot 21461 dot 294 dot camel at triegel dot csb>
On Jun 4, 2015, Torvald Riegel <triegel@redhat.com> wrote:
> Applied to a double-checked locking pattern, this means that all data
> that is accessed outside the critical section, and is also checked and
> modified inside the critical section, must use atomic accesses.
Is the l_tls_offset field the data you're talking about? We've already
determined that there is a happens-before for everything else. My
understanding is that l_tls_offset alone, being the key value of the
double-checked lock and the only datum relevant to these uses that
isn't necessarily covered by the happens-before relationship we've
already established, has no need for atomics.
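
For reference, the pattern under discussion, written the way the quoted
rule would require (every access to the key is atomic, inside and outside
the critical section), might look like the sketch below. The names
`struct module`, `offset`, `OFFSET_UNSET`, `lock` and `get_offset` are
hypothetical stand-ins, not glibc's definitions, and the choice of
acquire/release rather than relaxed ordering for the unguarded accesses is
an assumption; whether relaxed would do depends on whether other data is
published through this field.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not glibc's definitions.  */
#define OFFSET_UNSET ((ptrdiff_t) 0)

struct module
{
  _Atomic ptrdiff_t offset;	/* plays the role of l_tls_offset */
};

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Return the module's offset, assigning CANDIDATE if none is set yet.  */
ptrdiff_t
get_offset (struct module *m, ptrdiff_t candidate)
{
  assert (candidate != OFFSET_UNSET);

  /* First check, outside the critical section.  It is atomic, so it
     cannot form a data race with the store below.  */
  ptrdiff_t off = atomic_load_explicit (&m->offset, memory_order_acquire);
  if (off != OFFSET_UNSET)
    return off;

  pthread_mutex_lock (&lock);
  /* Second check, under the lock.  All writers hold the lock, so a
     relaxed load is enough to see the latest value here.  */
  off = atomic_load_explicit (&m->offset, memory_order_relaxed);
  if (off == OFFSET_UNSET)
    {
      off = candidate;
      /* Still atomic even though the lock is held: unguarded readers
	 may be loading the field concurrently.  */
      atomic_store_explicit (&m->offset, off, memory_order_release);
    }
  pthread_mutex_unlock (&lock);
  return off;
}
```

The first caller to reach the critical section decides the value; later
callers, with or without taking the lock, observe that same value.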
> OK. So we're dealing with inter-thread concurrency here.
Yes.
>> Not really. It is a preexisting issue, yes, but an acquire load would
>> make sure the (re)initialization of the memory into a link map,
>> performed while holding the lock (and with an atomic write, no less),
>> would necessarily be observed by the atomic acquire load. A relaxed
>> load might still observe the un(re)initialized value. Right?
> I can't follow you here.
> One thing to note is that acquire loads synchronize with release stores
> (or stronger) on the *same* memory location. An acquire load does not
> synchronize with operations on a lock, unless the acquire load peeks
> into the lock and does an acquire load on the lock's state or such.
> Therefore, when you think about which effect an acquire load has,
> consider which release store you are actually thinking about. An
> acquire operation does not have an effect on its own, only in
> combination with other effects in the program. This is also why we want
> to document which release store an acquire load is supposed to
> synchronize with.
> Thus, which release store are you thinking about in this case?
Nothing but l_tls_offset.
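
The point quoted above, that an acquire load synchronizes only with a
release store (or stronger) on the same memory location, can be
illustrated with a small stand-alone sketch. The names `payload`, `ready`,
`producer`, `consumer` and `run_demo` are made up for illustration and have
nothing to do with the glibc sources; `ready` is merely analogous to the
role l_tls_offset would play if it were used to publish other data.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static int payload;		/* plain data published through `ready' */
static atomic_int ready;	/* the one location both sides synchronize on */

static void *
producer (void *arg)
{
  (void) arg;
  payload = 42;
  /* Release store: everything written before it becomes visible to a
     thread whose acquire load on `ready' reads this value.  */
  atomic_store_explicit (&ready, 1, memory_order_release);
  return NULL;
}

static void *
consumer (void *arg)
{
  (void) arg;
  /* Acquire load on the same location as the release store above.  */
  while (atomic_load_explicit (&ready, memory_order_acquire) == 0)
    ;
  /* The acquire load read the release store, so the write to
     `payload' happens-before this read.  */
  assert (payload == 42);
  return NULL;
}

/* Run both threads once; return the payload, or -1 on thread-creation
   failure.  */
int
run_demo (void)
{
  pthread_t p, c;

  if (pthread_create (&p, NULL, producer, NULL) != 0)
    return -1;
  if (pthread_create (&c, NULL, consumer, NULL) != 0)
    {
      pthread_join (p, NULL);
      return -1;
    }
  pthread_join (p, NULL);
  pthread_join (c, NULL);
  return payload;
}
```

If the load in `consumer' were an acquire load on some other variable, or
the store in `producer' were relaxed, nothing would order the plain access
to `payload', which is why it matters to name the release store an acquire
load is meant to pair with.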
>> Now, in order for any such access to take place, some relocation applied
>> by A must be seen by the observing thread, and if there isn't some
>> sequencing event that ensures the dlopen (or initial load) enclosing A
>> happens-before the use of the relocation, the whole thing is undefined;
>> otherwise, this sequencing event ought to be enough of a memory barrier
>> to guarantee the whole thing works. It's just that the sequencing event
>> is not provided by the TLS machinery itself, but rather by the user, in
>> sequencing events after the dlopen, by the init code, in sequencing the
>> initial loading and relocation before any application code execution, or
>> by the thread library, sequencing any thread started by module
>> initializers after their relocation.
> If that's the situation in the static case
The paragraph quoted above applies to both cases.
>> You're missing the other cases elsewhere that set this same field.
> What do you mean? How is it any better if you don't fix it properly in
> the functions you have looked at and modified, just because there are
> more problems elsewhere?
If you say atomics are only correct if all loads and stores, including
those guarded by locks, are atomic, then adding atomics to only some of
them makes things wrong.
>> Is a double-check lock regarded as a race? I didn't think so.
> *Correct* double-checked locking isn't.
> "set l_tls_offset;" in the storing thread can be concurrent with the
> first access of "l_tls_offset" (or the ones in the branches after the
> critical section).
It appears to follow from your statement and example above that
*correct* double-checked locking can only be attained using atomics. Is
that so? If not, can you give an example of correct double-checked
locking that doesn't use them, and explain why that's different from
what's in my revised patch?
> If you disagree with the rule,
I don't. Maybe one of us misunderstands it, or we're otherwise failing
to communicate, but I'm honestly trying to avoid data races. I just
don't know that the unguarded reads in double-checked locks qualify as
data races.
--
Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/ FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer