This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

RE: [RFC][PATCH] PowerPC - e6500 optimized memcmp function


Thanks Zanella, I have added my comments below.

> 
> Hi
> 
> Patch looks good in general, however you need to rebase against master
> (although the differences are minimal). Some comments below:
> 
> On 19-08-2015 09:51, Dharmakan Rohit Arul Raj wrote:
> > Thanks Zanella.
> >
> > Please find below, patch for optimized implementation of 'memcmp' for
> PowerPC e6500 (32-bit & 64-bit) target using Altivec instructions.
> >
> >       * sysdeps/powerpc/bits/hwcap.h: Add macro to identify e6500 target.
> >       * sysdeps/powerpc/powerpc32/e6500/memcmp.S: New File
> >       * sysdeps/powerpc/powerpc32/e6500/multiarch/Implies: New File.
> >       * sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c:
> > Add
> >       memcmp multiarch implementation.
> 
> You need to add the function you are referring to (__libc_ifunc_impl_list).
Ok.

> 
> >       * sysdeps/powerpc/powerpc32/power4/multiarch/Makefile: Likewise.
> >       * sysdeps/powerpc/powerpc32/power4/multiarch/memcmp.c:
> Likewise.
> >       * sysdeps/powerpc/powerpc32/power4/multiarch/memcmp-e6500.S:
> New File.
> >       * sysdeps/powerpc/powerpc64/e6500/memcmp.S: New File.
> >       * sysdeps/powerpc/powerpc64/e6500/multiarch/Implies: New File
> >       * sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c:  Add
> >       memcmp multiarch implementation.
> 
> Same as before.
Ok.
 
> >       * sysdeps/powerpc/powerpc64/multiarch/Makefile: Likewise.
> >       * sysdeps/powerpc/powerpc64/multiarch/memcmp.c: Likewise.
> >       * sysdeps/powerpc/powerpc64/multiarch/memcmp-e6500.S: New File.
> >
> > diff -Naur glibc-2.20/sysdeps/powerpc/bits/hwcap.h
> > glibc-2.20-e6500-mcmp/sysdeps/powerpc/bits/hwcap.h
> > --- glibc-2.20/sysdeps/powerpc/bits/hwcap.h 2014-09-07
> > 03:09:09.000000000 -0500
> > +++ glibc-2.20-e6500-mcmp/sysdeps/powerpc/bits/hwcap.h     2015-08-19
> > +++ 05:48:43.688000596 -0500
> > @@ -64,3 +64,7 @@
> >  #define PPC_FEATURE2_HAS_TAR       0x04000000 /* Target Address
> >Register */
> >  #define PPC_FEATURE2_HAS_VEC_CRYPTO  0x02000000  /* Target
> supports
> >vector
> >                                                 instruction.  */
> > +/* Identify Freescale Processors.  */ #define PPC_FEATURE_E6500
> > +(PPC_FEATURE_64 | PPC_FEATURE_BOOKE | \
> > +                                PPC_FEATURE_HAS_ALTIVEC | \
> > +                                PPC_FEATURE_HAS_FPU |
> > +PPC_FEATURE_HAS_MMU)

<snip>

> > diff -Naur
> > glibc-2.20/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.
> > c
> > glibc-2.20-e6500-
> mcmp/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc
> > -impl-list.c
> > ---
> > glibc-2.20/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.
> > c        2014-09-07 03:09:09.000000000 -0500
> > +++ glibc-2.20-e6500-
> mcmp/sysdeps/powerpc/powerpc32/power4/multiarch/i
> > +++ func-impl-list.c  2015-08-19 05:51:57.389000503 -0500
> > @@ -34,6 +34,7 @@
> >    size_t i = 0;
> >
> >    unsigned long int hwcap = GLRO(dl_hwcap);
> > +  unsigned long int hwcap2 = GLRO(dl_hwcap2);
> >    /* hwcap contains only the latest supported ISA, the code checks
> >which is
> >       and fills the previous supported ones.  */
> >    if (hwcap & PPC_FEATURE_ARCH_2_06)
> > @@ -107,6 +108,10 @@
> >    IFUNC_IMPL (i, name, memcmp,
> >             IFUNC_IMPL_ADD (array, i, memcmp, hwcap &
> >PPC_FEATURE_HAS_VSX,
> >                            __memcmp_power7)
> > +           IFUNC_IMPL_ADD (array, i, memcmp,
> > +                          (((hwcap & PPC_FEATURE_E6500) ==
> > +PPC_FEATURE_E6500)
> > +                          && (hwcap2 & PPC_FEATURE2_HAS_ISEL)),
> > +                          __memcmp_e6500)
> 
> Do you really to check for PPC_FEATURE2_HAS_ISEL? There is no 'isel' usage
> in implementation.

I added that check to differentiate between e6500 and e600 processors, although the check  for ' PPC_FEATURE_64' should be enough to differentiate them.
If the macro 'PPC_FEATURE_E6500' doesn't overlap with any of the other PowerPC processors, I will remove them and re-submit the patch. Please let me know.

<snip>
> 
> I have tested this implementation for powerpc64le (with some modification
> to fix the endian issues) and at least for power7/power8 I only see some
> gains in the 'small' and 'medium' path, less then 64 bytes, which I think for
> power7 memcmp it can be fixed.
> 
> Sometime ago I try to use AVX/VSX instructions on memcmp to get some
> more throughput, but I see to no gain in aligned case neither in unaligned
> one (which was worse due the fact it required a lot of instruction to isolate
> which byte was the different one, like what you do in L(Vwords_Differ)).
> 
> I do not have much knowledge of e6500 core and its contrainst, but did you
> try to power7 memcmp version on it to check if it is worth to use? It is
> basically the power4 version without the branch prediction instructions.

I have rebased the benchtests numbers with glibc v2.20 master and attached the results.
Overall, e6500 numbers does seem to be better.

Please let me know your comments.

Regards,
Rohit

Attachment: benchtest-64bit-e6500-p7-memcmp.txt
Description: benchtest-64bit-e6500-p7-memcmp.txt

Attachment: benchtest-32bit-e6500-p7-memcmp.txt
Description: benchtest-32bit-e6500-p7-memcmp.txt


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]