This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.
Re: Suggestion: Dynamic __compare_and_swap() behaviour in SMP/non-SMP machines
- From: george anzinger <george at mvista dot com>
- To: Tal Davidson <tal at givenimaging dot com>
- Cc: libc-alpha at sources dot redhat dot com
- Date: Thu, 15 Aug 2002 10:03:11 -0700
- Subject: Re: Suggestion: Dynamic __compare_and_swap() behaviour in SMP/non-SMP machines
- Organization: Monta Vista Software
- References: <006101c2442e$f8ad3340$5eb68fd4@tal>
Tal Davidson wrote:
>
> Hi all,
>
> I hope I am finally in the correct mailing list for this...
>
> As of linuxthreads version 2.2.5, and as of the current CVS, I
> understand that "__compare_and_swap" for Intel x86 processors
> (in linuxthreads/sysdeps/i386/pt-machine.h), used for locking various
> synchronization objects, contains the following assembly, in which a
> "lock;" prefix is always hard-coded regardless of whether the
> computer is SMP or not.
>   __asm__ __volatile__ ("lock; cmpxchgl %3, %1; sete %0"
>                         : "=q" (ret), "=m" (*p), "=a" (readval)
>                         : "r" (newval), "m" (*p), "a" (oldval)
>                         : "memory");
>
> The "lock;" prefix is unneeded on single-processor (non-SMP)
> workstations, and SEVERELY slows down various pthread
> synchronization objects.
>
> On my single-processor Pentium IV machine running Red Hat Linux 7.3, I
> compared the acquire and release speeds of various pthreads
> synchronization objects with and without the "lock" prefix. Running the
> benchmark code from Dr. Ed Bradford's article against my modified version
> of pt-machine.h, I found that pthread mutex locking/unlocking and
> semaphore waiting/posting, performed continuously on a single thread,
> run around 3 TIMES FASTER without the "lock;" prefix in the above code
> than with the original "__compare_and_swap" implementation. Using the
> same benchmark on Windows 2000, I found that recursive pthread mutexes
> now run as fast as Win32 critical sections once the "lock" prefix is
> removed from the above code.
>
> Two proper methods for removing the "lock" are of course static removal
> (using the preprocessor), and dynamic removal (testing at run time
> whether the running system is indeed SMP or not).
>
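For illustration, the static (preprocessor) variant could look roughly like
the sketch below. The UP_BUILD macro and the cas_static name are invented for
this example and do not come from linuxthreads. The sketch uses int rather
than long int so that the 32-bit cmpxchgl stays valid on x86-64, and it falls
back to a GCC builtin on non-x86 targets.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical build-time switch: define UP_BUILD for a uniprocessor-only
   build.  The macro name is an assumption, not a linuxthreads option.  */
#ifdef UP_BUILD
# define LOCK_PREFIX ""
#else
# define LOCK_PREFIX "lock; "
#endif

/* Returns nonzero if *p equalled oldval and was replaced by newval.  */
static inline int
cas_static (int *p, int oldval, int newval)
{
  assert (p != NULL);
#if defined (__i386__) || defined (__x86_64__)
  char ret;
  /* The prefix is fixed at compile time, so there is no run-time branch.  */
  __asm__ __volatile__ (LOCK_PREFIX "cmpxchgl %3, %1; sete %0"
                        : "=q" (ret), "+m" (*p), "+a" (oldval)
                        : "r" (newval)
                        : "memory", "cc");
  return ret;
#else
  /* Other architectures: let the compiler builtin do the work.  */
  return __sync_bool_compare_and_swap (p, oldval, newval);
#endif
}
```

A binary built with UP_BUILD defined would be unsafe on an SMP machine, which
is the trade-off the dynamic approach tries to avoid.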
> Since linuxthreads checks dynamically for an SMP kernel using
> "is_smp_system()" in the pthread initialization code (in the pthread_init
> method), and since this information is then stored in the
> "__pthread_smp_kernel" global variable,
> what are your thoughts on the following patch
> for "linuxthreads/sysdeps/i386/pt-machine.h", in which a dynamic decision
> is made whether to use "lock; cmpxchg" on SMP systems and only "cmpxchg"
> on non-SMP systems?
As an interested but unknowledgeable (on lib issues) party,
I wonder if the test could be used to choose the shared
library to use. One could then have an SMP and a UP version
of the shared library, doing the test once instead of on each
call. What I don't know is whether it is possible and feasible
to set this up, as it requires dynamic linking to the correct
library.
-g
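
A rough sketch of the "test once, pick a library" idea, purely as an
illustration. The library file names, the enum, and the function names are
all hypothetical; glibc ships no such UP/SMP pair, and whether the dynamic
linker could be made to do this selection itself is exactly the open
question above.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical variants of the thread library.  */
enum lib_variant { LIB_UP, LIB_SMP };

/* Decide from a processor count.  Anything other than exactly one
   processor (including an error value of -1) selects the SMP build,
   because the locked variant is correct everywhere, only slower.  */
static enum lib_variant
choose_variant (long nprocs)
{
  return nprocs == 1 ? LIB_UP : LIB_SMP;
}

/* Map a variant to an invented file name.  */
static const char *
variant_lib_name (enum lib_variant v)
{
  assert (v == LIB_UP || v == LIB_SMP);
  return v == LIB_UP ? "libpthread-up.so" : "libpthread-smp.so";
}

/* Run the test once, at startup.  _SC_NPROCESSORS_CONF counts configured
   rather than currently online processors, so a CPU brought online later
   cannot invalidate the choice.  */
static const char *
detect_thread_lib (void)
{
  return variant_lib_name (choose_variant (sysconf (_SC_NPROCESSORS_CONF)));
}
```

The name returned by detect_thread_lib() could then be handed to dlopen(3)
or used by whatever mechanism ends up loading the library.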
>
> extern int __pthread_smp_kernel;
>
> PT_EI int
> __compare_and_swap (long int *p, long int oldval, long int newval)
> {
>   char ret;
>   long int readval;
>
>   if (__pthread_smp_kernel)
>     {
>       __asm__ __volatile__ ("lock; cmpxchgl %3, %1; sete %0"
>                             : "=q" (ret), "=m" (*p), "=a" (readval)
>                             : "r" (newval), "m" (*p), "a" (oldval)
>                             : "memory");
>     }
>   else
>     {
>       __asm__ __volatile__ ("cmpxchgl %3, %1; sete %0"
>                             : "=q" (ret), "=m" (*p), "=a" (readval)
>                             : "r" (newval), "m" (*p), "a" (oldval)
>                             : "memory");
>     }
>
>   return ret;
> }
>
> Using the same benchmarks, I have found this code with dynamic testing
> of SMP to be only 4% slower than a static test and, as discussed above,
> around 3 times faster than the original when acquiring and releasing
> pthreads mutexes and semaphores.
>
> What do you think?
>
> Thanks,
> Tal Davidson
>
--
George Anzinger george@mvista.com
High-res-timers:
http://sourceforge.net/projects/high-res-timers/
Preemption patch:
http://www.kernel.org/pub/linux/kernel/people/rml