This is the mail archive of the libc-hacker@sources.redhat.com mailing list for the glibc project.
Note that libc-hacker is a closed list. You may look at the archives of this list, but subscription and posting are not open.
On Sat, Feb 17, 2001 at 11:06:32PM +0100, Jakub Jelinek wrote:
> Hi!
>
> The following patch seems to cure ex5, ex9 and ex10 on ia64/SMP. Basically, if
> lock->__status had lowest bit set on spin_count 0, it would always spin
> until max_count, since lock->__status was cached in a register and never
> reloaded. __compare_and_swap clobbers it, but the codepath with
> (__status & 1) == 1 skips that, so there is nothing which requires gcc not
> to reload register caching lock->__status only inside of the conditionally
> executed code.
> The patch is attached in two variants, both seem to fix ex5, ex9 and ex10
> (the tests which were previously failing on smp ia64), but the first results
> in better code while the second one is perhaps more readable.
> The assembly difference is in fact only:
>	ld8 r15=[r32];;
> in first patch changed to
>	ld8.acq r15=[r32];;
> in the second. I don't have ia64 manuals here at home so I cannot check, but
> ld8.acq smells like it would do cache-line ping-pong, which is exactly what
> the code is trying to avoid (by only doing CAS if a normal load tells it
> could be successful).
>

This may be related to the change we made for __compare_and_swap and
__compare_and_swap_with_release_semantics. The instruction may need
the acquire semantics. We may need to examine all places around
__compare_and_swap.

H.J.
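For readers following the thread, here is a minimal sketch of the spin pattern under discussion, written with C11 atomics rather than the actual linuxthreads code. The names `struct spin_lock`, `spin_acquire` and `spin_release` are hypothetical and do not come from the patch. The point it illustrates is that the status word is re-read on every iteration, including the iterations that skip the compare-and-swap, and that the re-read uses a relaxed load so that acquire ordering is requested only on the CAS itself.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the lock in the thread; bit 0 means "held". */
struct spin_lock {
    atomic_long status;
};

/*
 * Try to take the lock, spinning at most max_count times.
 * Returns the number of failed spins before success, or -1 on giving up.
 */
static int spin_acquire(struct spin_lock *lock, int max_count)
{
    for (int spin = 0; spin < max_count; spin++) {
        /*
         * The atomic load cannot be hoisted out of the loop, so the value
         * is fetched again each time around. Relaxed ordering is used
         * because this read only decides whether a CAS is worth trying.
         */
        long seen = atomic_load_explicit(&lock->status, memory_order_relaxed);

        if ((seen & 1) == 0) {
            long expected = seen;
            /* Acquire ordering is requested only on the successful CAS. */
            if (atomic_compare_exchange_strong_explicit(
                    &lock->status, &expected, seen | 1,
                    memory_order_acquire, memory_order_relaxed))
                return spin;
        }
        /* Lock looked busy or the CAS lost a race: go around and re-read. */
    }
    return -1;
}

/* Clear the "held" bit with release ordering. */
static void spin_release(struct spin_lock *lock)
{
    atomic_fetch_and_explicit(&lock->status, ~1L, memory_order_release);
}
```

With a plain non-atomic, non-volatile read in place of the load above, a compiler would be allowed to keep the value in a register across iterations, which matches the symptom described in the quoted message. Whether the relaxed load or an acquire load is the better choice on ia64 is exactly the open question in this thread, and the sketch does not settle it.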