This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH] tunables: Add IFUNC selection and cache sizes
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- Cc: GNU C Library <libc-alpha at sourceware dot org>
- Date: Tue, 20 Jun 2017 06:28:34 -0700
- Subject: Re: [PATCH] tunables: Add IFUNC selection and cache sizes
- References: <20170615131042.GA28885@gmail.com> <63707191-601d-9374-8cad-74f15d51f917@linaro.org>
On Tue, Jun 20, 2017 at 6:23 AM, Adhemerval Zanella
<adhemerval.zanella@linaro.org> wrote:
>
>
> On 15/06/2017 10:10, H.J. Lu wrote:
>> The current IFUNC selection is based on microbenchmarks in glibc. It
>> should give the best performance for most workloads. But other choices
>> may have better performance for a particular workload or on the hardware
>> which wasn't available when the selection was made. The environment
>> variable, GLIBC_TUNABLES=glibc.tune.ifunc=-xxx,yyy,-zzz...., can be used
>> to enable CPU/ARCH feature yyy and disable CPU/ARCH features xxx and zzz,
>> where the feature name is case-sensitive and has to match the ones in
>> cpu-features.h. It can be used by glibc developers to override the
>> IFUNC selection to tune for a new processor or improve performance for
>> a particular workload. It isn't intended for normal end users.
>>
>> NOTE: the IFUNC selection may change over time. Please check all
>> multiarch implementations when experimenting.
>>
>> Also, GLIBC_TUNABLES=glibc.tune.non_temporal_threshold=NUMBER is
>> provided to set the threshold for using non-temporal stores to NUMBER,
>> GLIBC_TUNABLES=glibc.tune.data_cache_size=NUMBER to set the data cache
>> size, and GLIBC_TUNABLES=glibc.tune.shared_cache_size=NUMBER to set the
>> shared cache size.
>>
>> Any comments?
>>
>> H.J.
>> ---
>> 2017-06-15 H.J. Lu <hongjiu.lu@intel.com>
>> Erich Elsen <eriche@google.com>
>>
>> * elf/dl-tunables.list (tune): Add ifunc, non_temporal_threshold,
>> data_cache_size and shared_cache_size.
>> * manual/tunables.texi: Document glibc.tune.ifunc,
>> glibc.tune.data_cache_size, glibc.tune.shared_cache_size and
>> glibc.tune.non_temporal_threshold.
>> * sysdeps/unix/sysv/linux/x86/dl-sysdep.c: New file.
>> * sysdeps/x86/cpu-tunables.c: Likewise.
>> * sysdeps/x86/cacheinfo.c
>> (init_cacheinfo): Check and get data cache size, shared cache
>> size and non temporal threshold from cpu_features.
>> * sysdeps/x86/cpu-features.c [HAVE_TUNABLES] (TUNABLE_NAMESPACE):
>> New.
>> [HAVE_TUNABLES] Include <unistd.h>.
>> [HAVE_TUNABLES] Include <elf/dl-tunables.h>.
>> [HAVE_TUNABLES] (TUNABLE_CALLBACK (set_ifunc)): Likewise.
>> [HAVE_TUNABLES] (init_cpu_features): Use TUNABLE_GET to set
>> IFUNC selection, data cache size, shared cache size and non
>> temporal threshold.
>> * sysdeps/x86/cpu-features.h (cpu_features): Add data_cache_size,
>> shared_cache_size and non_temporal_threshold.
>> ---
>> elf/dl-tunables.list | 16 ++
>> manual/tunables.texi | 36 ++++
>> sysdeps/unix/sysv/linux/x86/dl-sysdep.c | 21 ++
>> sysdeps/x86/cacheinfo.c | 10 +-
>> sysdeps/x86/cpu-features.c | 19 ++
>> sysdeps/x86/cpu-features.h | 8 +
>> sysdeps/x86/cpu-tunables.c | 330 ++++++++++++++++++++++++++++++++
>> 7 files changed, 439 insertions(+), 1 deletion(-)
>> create mode 100644 sysdeps/unix/sysv/linux/x86/dl-sysdep.c
>> create mode 100644 sysdeps/x86/cpu-tunables.c
>>
>> diff --git a/elf/dl-tunables.list b/elf/dl-tunables.list
>> index 41ce9af..78354fb 100644
>> --- a/elf/dl-tunables.list
>> +++ b/elf/dl-tunables.list
>> @@ -82,6 +82,22 @@ glibc {
>> type: UINT_64
>> env_alias: LD_HWCAP_MASK
>> default: HWCAP_IMPORTANT
>> + }
>> + ifunc {
>> + type: STRING
>> + security_level: SXID_IGNORE
>> + }
>> + non_temporal_threshold {
>> + type: SIZE_T
>> + security_level: SXID_IGNORE
>> + }
>> + data_cache_size {
>> + type: SIZE_T
>> + security_level: SXID_IGNORE
>> + }
>> + shared_cache_size {
>> + type: SIZE_T
>> + security_level: SXID_IGNORE
>> }
>
> Is it possible with the current tunables approach to make these
> arch-specific? The 'ifunc' switch seems a generic one, but
> 'non_temporal_threshold', 'data_cache_size', and 'shared_cache_size' are
> very x86 specific, and I find it confusing to expose them to non-x86
> architectures.
Yes, it can be made x86 specific. I will update my patch shortly.
Thanks.
H.J.
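
For readers following the thread, the "-xxx,yyy,-zzz" syntax described in the patch summary can be illustrated with a small parser. This is only a sketch under stated assumptions, not the code in sysdeps/x86/cpu-tunables.c: the names parse_feature_list, struct feature_request, MAX_FEATURES and MAX_NAME_LEN are hypothetical, and it only splits the string into names and enable/disable flags. It does not check the names against cpu-features.h or touch any CPU feature bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical limits chosen for this sketch only.  */
#define MAX_FEATURES 16
#define MAX_NAME_LEN 32

/* One parsed entry: a feature name and whether it should be enabled.  */
struct feature_request
{
  char name[MAX_NAME_LEN];
  bool enable;
};

/* Parse SPEC, a comma-separated list such as "-xxx,yyy,-zzz", into OUT,
   which must have room for MAX_FEATURES entries.  A leading '-' marks a
   feature to disable; any other name is a feature to enable.  Empty
   names and names that do not fit in MAX_NAME_LEN are skipped.
   Returns the number of entries stored.  */
static size_t
parse_feature_list (const char *spec, struct feature_request *out)
{
  assert (spec != NULL && out != NULL);

  size_t count = 0;
  const char *p = spec;

  while (*p != '\0' && count < MAX_FEATURES)
    {
      /* Length of the current token, up to the next ',' or the end.  */
      size_t len = strcspn (p, ",");
      const char *name = p;
      size_t name_len = len;
      bool enable = true;

      if (name_len > 0 && name[0] == '-')
        {
          enable = false;
          name++;
          name_len--;
        }

      if (name_len > 0 && name_len < MAX_NAME_LEN)
        {
          memcpy (out[count].name, name, name_len);
          out[count].name[name_len] = '\0';
          out[count].enable = enable;
          count++;
        }

      /* Advance past the token and the separator, if any.  */
      p += len;
      if (*p == ',')
        p++;
    }

  return count;
}
```

Given the string "-xxx,yyy,-zzz", this sketch yields three entries: xxx (disable), yyy (enable) and zzz (disable). Names are compared nowhere in the sketch, so the case-sensitive matching mentioned in the patch description would have to happen in a later step.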