This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH v2 05/15] RISC-V: Generic <string.h> Routines
- From: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- To: libc-alpha at sourceware dot org
- Date: Wed, 3 Jan 2018 13:46:45 -0200
- Subject: Re: [PATCH v2 05/15] RISC-V: Generic <string.h> Routines
- Authentication-results: sourceware.org; auth=none
- References: <mhng-fcc68f89-fea8-473f-b36f-38c98f74c978@palmer-si-x1c4> <alpine.DEB.2.20.1801010050290.28505@digraph.polyomino.org.uk>
On 31/12/2017 22:52, Joseph Myers wrote:
> On Sat, 23 Dec 2017, Palmer Dabbelt wrote:
>
>>> I would again suggest deferring adding such functions, especially the C
>>> ones, and instead helping with updating / reviewing RTH's optimized
>>> generic string functions posted a while back, and only adding
>>> RISC-V-specific ones if there is some clear reason RISC-V needs something
>>> non-generic for optimal performance.
>>
>> Yes, that makes sense -- I think I forgot last time because they hadn't gone
>> in yet. We'll just use the generic ones for now, I'll add it to my TODO list
>> to make sure they're generating good code for RISC-V.
>
> To be clear, RTH's optimized functions aren't in glibc yet. So, if you
> find they'd be better for RISC-V than the current generic functions, you
> should join in the process of getting them in glibc - for 2.28 not 2.27
> now, of course (but only if generic functions can't be made good for
> RISC-V would RISC-V-specific ones be desirable).
>
I have been working sporadically on RTH's optimized generic string functions
and I have pushed my branch to azanella/generic-strings. Starting from RTH's
initial proposal, I made the following changes:
- Fixed an issue with strcmp.
- Added an unaligned implementation for strcpy (which should be faster on
  architectures that define _STRING_ARCH_unaligned).
- Changed how strchr/strlen/memchr check the initial bytes: they now read an
  aligned word and mask out the undesired bits (instead of reading byte
  by byte). This follows the strategy already used by arch-specific
  implementations (alpha, powerpc).
- Added SH has_{zero,eq,zero_eq} using arch-specific instructions.
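
To illustrate the aligned-read-and-mask idea from the list above, here is a
minimal sketch of a strlen written that way. It is not the code from the
branch: the names sketch_has_zero and sketch_strlen are hypothetical, the
zero-byte test is the usual bit trick rather than anything arch-specific, and
the mask is built through a byte array so the sketch does not depend on
endianness. Note that the first load starts before the string, which strictly
speaking is outside ISO C and only works in practice because an aligned word
never crosses a page boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef unsigned long word_t;

/* Nonzero iff at least one byte of x is zero (classic bit trick). */
static inline word_t sketch_has_zero(word_t x)
{
    const word_t ones = (word_t)-1 / 0xff;   /* 0x0101...01 */
    const word_t highs = ones << 7;          /* 0x8080...80 */
    return (x - ones) & ~x & highs;
}

/* Hypothetical strlen: first load is aligned, leading bytes are masked. */
size_t sketch_strlen(const char *s)
{
    assert(s != NULL);

    const unsigned char *start = (const unsigned char *)s;
    size_t off = (uintptr_t)s % sizeof(word_t);
    const unsigned char *p = start - off;    /* aligned address at or before s */
    unsigned char fill[sizeof(word_t)] = {0};
    word_t w, mask;

    /* Force the 'off' bytes that precede the string to be nonzero, so a
       stray zero byte before s cannot be mistaken for the terminator. */
    memset(fill, 0xff, off);
    memcpy(&mask, fill, sizeof mask);
    memcpy(&w, p, sizeof w);
    w |= mask;

    /* Word-at-a-time scan; every load from here on is aligned. */
    while (!sketch_has_zero(w)) {
        p += sizeof(word_t);
        memcpy(&w, p, sizeof w);
    }

    /* The terminator is in the word at p; locate it bytewise, starting
       no earlier than the beginning of the string. */
    const unsigned char *q = (p < start) ? start : p;
    while (*q != 0)
        q++;
    return (size_t)(q - start);
}
```

The final bytewise scan is where an index_first-style helper would normally
be used; it is left as a plain loop here to keep the sketch short.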
Another thing I would like to check before submitting for 2.28 is a way to
add generic index_last_/index_first_ helpers without using __builtin_{clzl,ctzl}.
On some architectures these builtins are implemented as a libgcc call, and a
function call pretty much defeats the optimizations done (I added a generic
one for SH).
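
As a rough idea of what a builtin-free helper could look like, the sketch
below finds the index of the first marked byte with a short binary search
instead of __builtin_ctzl. This is only an assumption about one possible
approach, not what the branch does: the names sketch_index_first,
sketch_has_zero and sketch_first_zero_byte are hypothetical, the marker
format (bit 7 set in each matching byte) comes from the usual zero-byte bit
trick, and "first" means least significant byte, which matches memory order
only on little-endian targets.

```c
#include <assert.h>
#include <limits.h>

typedef unsigned long word_t;

/* Bit 7 of a byte is set in the result when that byte of x is zero.
   Bits above the first zero byte may be false positives, but the lowest
   set bit is always exact. */
static inline word_t sketch_has_zero(word_t x)
{
    const word_t ones = (word_t)-1 / 0xff;   /* 0x0101...01 */
    const word_t highs = ones << 7;          /* 0x8080...80 */
    return (x - ones) & ~x & highs;
}

/* Index of the least significant nonzero byte of mask, found by halving
   the search window; no libgcc call and no ctz instruction needed. */
static inline unsigned sketch_index_first(word_t mask)
{
    assert(mask != 0);

    unsigned idx = 0;
    for (unsigned bits = sizeof(mask) * CHAR_BIT / 2; bits >= CHAR_BIT; bits /= 2) {
        /* If the low half of the window is empty, the marker is above it. */
        if ((mask & (((word_t)1 << bits) - 1)) == 0) {
            mask >>= bits;
            idx += bits / CHAR_BIT;
        }
    }
    return idx;
}

/* Index of the first zero byte of x (little-endian byte order assumed).
   x must contain at least one zero byte. */
static inline unsigned sketch_first_zero_byte(word_t x)
{
    return sketch_index_first(sketch_has_zero(x));
}
```

On a 64-bit target this is three compare-and-shift steps, so whether it beats
a libgcc call would have to be measured per architecture.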
In the patch I haven't noticed any arch-specific instructions meant for
string operations (such as cmpb on powerpc), so I think you
could use the generic implementation for riscv.