This is the mail archive of the
libc-alpha@sources.redhat.com
mailing list for the glibc project.
Re: [PATCH] PPC atomic.h add compare_exchange_val forms
- From: "Steve Munroe" <sjmunroe at us dot ibm dot com>
- To: "Carlos O'Donell" <carlos at baldric dot uwo dot ca>
- Cc: "Kevin B. Hendricks" <kevin dot hendricks at sympatico dot ca>, libc-alpha <libc-alpha at sources dot redhat dot com>
- Date: Thu, 17 Apr 2003 11:26:44 -0500
- Subject: Re: [PATCH] PPC atomic.h add compare_exchange_val forms
Carlos O'Donell writes:
>> As stated in my previous response there is a danger of "Live Lock" when
>> multiple locks share a "reservation unit" (for example a cache line).
>
>Do the threads make _any_ progress?
>
>Is it only in the SMP case where the CPU's can try to steal ownership
>of the cacheline from the cache-controller and live-lock?
This requires two or more processors (SMP). This is a rare but annoying
case where the reservation for processor_A is stolen (the reservation
register is reset, usually via a cache snoop operation) by processor_B
before processor_A can get to its "store conditional".
The processors remain live locked until an external event (kill -9 or
interval timer) causes one of the tasks/threads to be rescheduled. We have
no hard data on how often this happens.
>I've struggled with this idea for HPPA since the architecture reference
>explicitly states that only a single lock word is allowed on the
>cacheline (stride is 64-128 bytes wide depending on the processor).
>
>Padding to cacheline size was attempted, but static locks would have to
>pad to maximum cacheline size. This seemed to be wasteful and problematic
>for backwards binary compatibility, it also gave the linker some
>headaches.
>
>Do you just live with the fact that two locks _could_ reside on the same
>cacheline?
So far we (PPC64) are not doing anything in glibc to avoid live lock. This
is mostly a performance issue as we scale up (8-32+ way). So far we have
depended on luck, but now is a good time to think about this.
The PPC64 kernel implementation does try to isolate spinlocks (one per
cache line). They use:
__attribute__((__aligned__(SMP_CACHE_BYTES)))
Where SMP_CACHE_BYTES is the constant 128 (so far all PPC64 processor
implementations use 128 byte cache lines, but this could change).
We could try something like this for glibc, but without more hard data I
am reluctant to start changing things.