This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [RFC] String optimization workflow for architectures.


I also forgot to mention the different implementations of builtins. These
also need to be selected by benchmarking, so we are in the same situation
as with a tunable that has multiple values.

Here an architecture maintainer could supply a custom builtin, but it may
be available only on some architectures, there could be several
alternatives, or the instruction may be too slow. There are also several
alternative ways to implement the generic builtins.

So there should be some system to test these.

I would like to keep the system that I use: for each builtin we would make
a directory sysdeps/generic/builtin where each file contains one
implementation. An arch maintainer would make a builtin directory in his
sysdeps.

Then we would first run a benchmark that enumerates the files in
sysdeps/generic/builtin and sysdeps/arch/builtin and creates a symlink to
the builtin that should be used.
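
To make the selection step concrete, here is a rough sketch in C of picking
the fastest of several candidate implementations by timing them. It is only
an illustration under assumptions: the names mul.h and shift.h, the struct
candidate type and pick_fastest are hypothetical, the candidates are compiled
into one program rather than enumerated from a directory, and the final
symlink step is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

typedef uint64_t (*broadcast_fn)(uint8_t);

/* One entry per implementation file; the names are hypothetical. */
struct candidate {
    const char *name;
    broadcast_fn fn;
};

static uint64_t broadcast_mul(uint8_t c)
{
    return (uint64_t)c * UINT64_C(0x0101010101010101);
}

static uint64_t broadcast_shift(uint8_t c)
{
    uint64_t x = c;
    x |= x << 8;
    x |= x << 16;
    x |= x << 32;
    return x;
}

static const struct candidate candidates[] = {
    { "mul.h",   broadcast_mul },
    { "shift.h", broadcast_shift },
};

/* Return the index of the candidate with the smallest measured CPU time.
   A real benchmark would need warm-up, repetition and realistic inputs. */
size_t pick_fastest(const struct candidate *cand, size_t n, unsigned long iters)
{
    size_t best = 0;
    clock_t best_time = 0;

    for (size_t i = 0; i < n; i++) {
        volatile uint64_t sink = 0;   /* keeps the calls from being optimized away */
        clock_t start = clock();
        for (unsigned long k = 0; k < iters; k++)
            sink ^= cand[i].fn((uint8_t)k);
        clock_t elapsed = clock() - start;

        if (i == 0 || elapsed < best_time) {
            best = i;
            best_time = elapsed;
        }
    }
    return best;
}
```

A driver would call pick_fastest(candidates, 2, iters) and then link
candidates[best].name into place as the builtin to use.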



As an example question for primitives: right now I don't know whether
broadcasting a byte is faster done by:

x * 0x0101010101010101

or

x |= x << 8
x |= x << 16
x |= x << 32
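
For anyone who wants to measure this, the two variants could be written as
C functions roughly like this. The function names are made up for
illustration, and a 64-bit word is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Broadcast the byte c into all eight bytes with a single multiply. */
static inline uint64_t broadcast_mul(uint8_t c)
{
    return (uint64_t)c * UINT64_C(0x0101010101010101);
}

/* Same result with three shift-or steps. */
static inline uint64_t broadcast_shift(uint8_t c)
{
    uint64_t x = c;
    x |= x << 8;
    x |= x << 16;
    x |= x << 32;
    return x;
}

/* Sanity check: both variants must agree for every byte value. */
void check_broadcast(void)
{
    for (unsigned c = 0; c < 256; c++)
        assert(broadcast_mul((uint8_t)c) == broadcast_shift((uint8_t)c));
}
```

Both return the same value, so which one wins depends only on multiply
latency versus the dependent shift-or chain on the given cpu.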

Also, for first_nonzero byte there are questions such as how fast clz is
and how to exploit the fact that only the highest bit of each byte is used,
and the answers could vary by cpu.
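
To illustrate the kind of alternatives meant here, below is a sketch of two
ways to find the first marked byte in a mask where only the highest bit
(0x80) of a byte may be set. It rests on several assumptions: "first" means
the least significant byte (little-endian load), so the bit-scan variant
uses ctz where a big-endian variant would use clz; the mask is nonzero;
__builtin_ctzll is the GCC/Clang builtin; and the multiply variant is just
one possible way to exploit the high-bits-only property, not necessarily the
one intended above. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Variant 1: bit scan. Undefined for mask == 0. */
static inline unsigned first_byte_ctz(uint64_t mask)
{
    return (unsigned)__builtin_ctzll(mask) >> 3;
}

/* Variant 2: no bit-scan instruction. Isolate the lowest set bit, move it
   to bit 0 of its byte, then use a multiply as a small lookup table:
   byte k of the constant holds 7 - k, so shifting the constant left by
   8 * i bits leaves the value i in the top byte. Requires mask != 0 and
   only 0x80 bits set. */
static inline unsigned first_byte_mul(uint64_t mask)
{
    uint64_t low = (mask & (~mask + 1)) >> 7;   /* equals 1 << (8 * i) */
    return (unsigned)((low * UINT64_C(0x0001020304050607)) >> 56);
}

/* Sanity check: both variants agree when each byte position is the first. */
void check_first_byte(void)
{
    for (unsigned i = 0; i < 8; i++) {
        uint64_t mask = UINT64_C(0x8080808080808080) << (8 * i);
        assert(first_byte_ctz(mask) == i);
        assert(first_byte_mul(mask) == i);
    }
}
```

Which of these is faster depends on the cost of the bit-scan instruction
relative to a multiply, which is exactly what the benchmark would decide.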


Comments?

