This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 6/7] stdlib: Optimization qsort{_r} swap implementation



On 22/01/2018 15:15, Paul Eggert wrote:
> On 01/22/2018 02:55 AM, Adhemerval Zanella wrote:
>> On 22/01/2018 06:27, Paul Eggert wrote:
>>> Adhemerval Zanella wrote:
>>>> +static inline bool
>>>> +check_alignment (const void *base, size_t align)
>>>> +{
>>>> +  return _STRING_ARCH_unaligned || ((uintptr_t)base % (align - 1)) == 0;
>>>> +}
>>> Surely the '(align - 1)' was supposed to be 'align'. Has this been tested on an architecture that does not allow unaligned access?
>> Yes, I checked on sparc64 machine.  This test is similar to the Linux kernel one
>> at lib/sort.c.
> 
> The Linux kernel lib/sort.c test is (((unsigned long)base & (align - 1)) == 0), which is correct. The test above uses '% (align - 1)' instead, which is clearly wrong; the "%" should be "&". As the tests evidently did not catch the error, they need to be improved to catch it.

Indeed, Alexander Monakov pointed out this issue is his message and I have fixed it
locally.

> 
>> It might be that your approach is faster for other architectures which do not have
>> ifunc mempcpy, however I do not want to over-engineer this code since most real
>> word correspond to key sizes of 4 and 8.
> 
> Thanks, that all makes sense.
> 
> One other question. Would it improve performance to partially evaluate qsort for the case where the key size is that of a pointer, to allow the swap to be done inline with four insns? I would imagine that this is the most common case.
> 

I noted that, at least for x86_64, calling a function pointer it slight faster
than embedded the test in a switch (as current msort).  One option I have not
tested, and which will trade code side for performance; would parametrize
the qsort creation (as for the 7/7 patch in this set) to have qsort_uint32_t,
qsort_uint64_t, and qsort_generic for instance (which calls the swap inline).

So we will have something as:

void qsort (void *pbase, size_t total_elems, size_t size)
{
  if (size == sizeof (uint32_t)
    && check_alignment (base, sizeof (uint32_t)))
    return qsort_uint32_t (pbase, total_elems, size);
  else if (size == sizeof (uint64_t)
    && check_alignment (base, sizeof (uint64_t)))
    return qsort_uint64_t (pbase, total_elems, size);
  return qsort_generic (pbase, total_elems, size);
}


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]