This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Add STATIC_GETENV macro.


On Thu, Nov 14, 2013 at 11:13:58AM +0100, Torvald Riegel wrote:
> On Wed, 2013-11-13 at 22:33 +0100, OndÅej BÃlka wrote:
> > On Wed, Nov 13, 2013 at 04:34:18PM +0100, Torvald Riegel wrote:
> > > I don't think a macro is wise here.  You seem to want to make the macro
> > > safe for use in multi-threaded programs, so this is similar to a
> > > pthread_once except that there can be more than one call to get_env (ie,
> > > the equivalent to the initialization function), but exactly on
> > > assignment.
> > > Maybe a pthread_once_store(pointer, producer) would be better (*if* we
> > > actually want to add such a feature): pointer is like p in your code,
> > > producer is a function that is called when we actually need to
> > > initialize but can be safely called more than once; we exactly-once set
> > > pointer to the value returned by one of the producer calls.
> > > 
> > Yes but this does not need forcing users to link with pthread.
> > 
> > 
> > > Implementing this properly requires more than just a simple CAS (see
> > > pthread_once and my comments below); thus, you'll need to use proper
> > > atomics or whatever the compiler has available, which requires you to
> > > take care of all of that in the context of the client's compilation
> > > environment.
> > > 
> > > On Sat, 2013-11-09 at 11:38 +0100, OndÅej BÃlka wrote:
> > > > +#define STATIC_GETENV(c) \
> > > > +({									      \
> > > > +  static char *__p = (char *) &__p;					      \
> > > > +  char *__new = __p;							      \
> > > > +  if (__p == (char *) &__p)						      \
> > >
> > There should be
> > if (__new == (char *) &__p)
> 
> Even with that change, the load of p needs to be atomic with
> acquire/consume memory order.
>
We should support older gcc that do not support C11 atomics. 

Also when you are concerned about machines that do issue nonatomic reads
then relaxed model is appropriate, acquire/consume could yield unacceptable
performance overhead.
 
> > > This is a data race (at least in C11, which has a concept of threads).
> > > You need to use an atomic access, and it must have acquire memory order.
> > > Consume memory order could also be sufficient, I believe.
> 
> Like it or not, this is a data race according to C11 and C++11.  You
> offer this using a macro, so it becomes part of the client's code.  Data
> race detectors will find this race condition.
> 
This is valid comment, do race detect report also race for relaxed
loads?

When you use a race detector it would also uncover a race in fnmatch and
other libc races.


> > You do not need atomic access for correctness, worst that can happen is
> > performance degradation by repeately recalculating value.
> 
> No.  Please understand the C11/C++11 memory model.  The compiler could
> choose to read p nonatomically, or read it a couple of times and assume
> that the value won't change, speculating that it is always like in the
> second of two accesses.
>

A default pthread_once implementation is following

int
__pthread_once (once_control, init_routine)
     pthread_once_t *once_control;
     void (*init_routine) (void);
{
  /* XXX Depending on whether the LOCK_IN_ONCE_T is defined use a
     global lock variable or one which is part of the pthread_once_t
     object.  */
  if (*once_control == PTHREAD_ONCE_INIT)
    {
      lll_lock (once_lock, LLL_PRIVATE);

      /* XXX This implementation is not complete.  It doesn't take
         cancelation and fork into account.  */
      if (*once_control == PTHREAD_ONCE_INIT)
        {
          init_routine ();

          *once_control = !PTHREAD_ONCE_INIT;
        }

      lll_unlock (once_lock, LLL_PRIVATE);
    }

  return 0;

A type of pthread_once_t is int so compiler could read once_control
nonatomicaly. According to your argument this code is broken so file a
bug report and send a patch to fix this serious issue.
 
> Furthermore, read about weak memory models.  We need acquire/consume
> memory order to ensure that accesses to *p happen after p* has been
> initialized (and likewise for release on the assignment to p).
> 
I know them and 

> If you can show that in this particular case, initialization of *p will
> happen before every thread can use STATIC_GETENV, then you can do
> without acquire/release -- but then you need to add a comment why this
> is the case and still use relaxed memory order atomic accesses (also on
> the CAS to avoid the barrier overhead, the __sync CAS has mostly seq-cst
> memory order).
> 
Using a static initializer is a common pattern used in nptl and we do not 
need comment for obvious cases.


> > Unless compiler decides to inline functions so that it accesses variable
> > in loop which when called first time would cause repeated
> > recomputations. A solution would be mark variable volatile.
> 
> Volatile is not a sufficient replacement for atomic accesses because it
> tells the compiler that for such variables, accesses should happen
> exactly as if the abstract machine would run the program -- but this
> doesn't necessarily mean they should be atomic wrt. accesses in other
> threads.  And you also cannot establish inter-thread happens-before
> relations with just volatiles (for which you need the memory-orders
> and/or the total orders the memory model defines for seq-cst actions).
> 
This relationship is not necessary.

> > > 
> > > > +    {									      \
> > > > +      __new = getenv (c);						      \
> > > > +      char *__val = __sync_val_compare_and_swap (&__p, (char *) &__p, __new); \
> > > 
> > > This needs acquire-release memory order.  Seq-cst would be too much,
> > > anything less than acquire-release won't work on archs with weak memory
> > > models.
> > > 
> > No need to complicate matters with custom order as this path will likely be 
> > visited at most once per program lifetime.
> 
> First, whether something is "unlikely" to go wrong isn't a good
> criterion for glibc code.

This is not about going wrong but about separating a hot code and cold.
Hot code should be optimized for speed but cold code for readability.

>From correctness perspecive this always returns correct value so only
performance is a concern. You evaluate this by risk analysis, in a worst
case we always recompute a getenv from scratch as unoptimized code
always computed getenv there is only additional overhead of atomic
instructions,

To estimate how many times we recompute there is risk that a thread will
be interupted when computing a function by other thread that computes a
function which we can bound by number of threads.
For synchronizing loads a number of different values can be bounded by
number of cores.

Compared to pthread_once it faces same issues but does locking instead.
If a function in question is reentrant and faster that context switch
overhead then using this primitive instead pthread_once is faster.
Also using pthread_once makes function in question not async-signal-safe
while we preserve async-signal safety.


> Second, synchronization/concurrency can indeed be complex and possibly a
> complicated matter depending on your level of experience with it.  But
> that's just more reasons to carefully pay attention to the details, and
> document synchronizing code properly.
> 
You need attention to details on what is performance impact of
concurency changes, so far your suggestion would only decrease
performance.


> > > You really should also document synchronization code thoroughly.  That
> > > we need a CAS for exactly-once might be obvious to many; but I guess
> > > it's not so obvious which memory orders we need (otherwise, I guess you
> > > wouldn't have missed them :).  Therefore, do document which accesses are
> > > supposed to synchronize with each other, and which acquire/release pairs
> > > you need to establish the happens-before relations required for
> > > correctness.
> > > 
> > > > +      __new = (__val ==  (char *) &__p) ? __new : __val;		      \
> > > > +    }									      \
> > > > +  __new;								      \
> > > > +})
> > > > +
> > > 
> > 
> 
> 


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]