This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [Patch, AArch64] Optimized strcpy


> Following the various discussions about the above, I've done some
> further tweaking of the code, and indeed there are some further
> performance improvements, particularly for short strings.
>
> I think this is likely to be the final version (at least, for 2.21).
>
> Changes this time around:
>
> - Add the ability to build the code as stpcpy().
>
> - Small change to the page crossing check, which uses the same number
> of instructions, but could be faster on some micro-architectures.
>
> - For the slow (page crossing) check, once a page cross is known to
> occur, jump to the normal entry point.
>
> - For big-endian only, on the first check we pre-reverse the bytes so
> that we don't have to recalculate the syndrome in the (likely) case that
> the string is short.
>
> - For the initial unaligned fetch, detect zeros in the first and second
> DWords independently and jump to the relevant epilogue sequence
> directly.  This eliminates another level of branching later on for the
> special cases where we have to use sub-dword-sized stores.
>
> - Other changes are mostly re-ordering of the hunks of code and
> micro-optimizations that fall out of the above changes.
>
> OK?
>
>         * sysdeps/aarch64/strcpy.S: New file.
>         * sysdeps/aarch64/stpcpy.S: New file.


OK, can you also add a NEWS entry? /Thanks

