This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] powerpc: New feature - HWCAP/HWCAP2 bits in the TCB
- From: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- To: libc-alpha at sourceware dot org
- Date: Tue, 09 Jun 2015 11:22:48 -0300
- Subject: Re: [PATCH] powerpc: New feature - HWCAP/HWCAP2 bits in the TCB
- Authentication-results: sourceware.org; auth=none
- References: <55760314 dot 6070601 at linux dot vnet dot ibm dot com>
Hi
On 08-06-2015 18:03, Carlos Eduardo Seo wrote:
>
> The proposed patch adds a new feature for powerpc. In order to get faster access to the HWCAP/HWCAP2 bits, we now store them in the TCB. This enables users to write versioned code based on the HWCAP bits without going through the overhead of reading them from the auxiliary vector.
>
> A new API is published in ppc.h for get/set the bits in the aforementioned memory area (mainly for gcc to use to create builtins).
>
> Testcases for the API functions were also created.
>
> Tested on ppc32, ppc64 and ppc64le.
>
> Okay to commit?
>
> Thanks,
>
Besides the documentation missing pointed by Joseph, some comments below.
> @@ -203,6 +214,32 @@ register void *__thread_register __asm__
> # define THREAD_SET_TM_CAPABLE(value) \
> (THREAD_GET_TM_CAPABLE () = (value))
>
> +/* hwcap & hwcap2 fields in TCB head. */
> +# define THREAD_GET_HWCAP() \
> + (((tcbhead_t *) ((char *) __thread_register \
> + - TLS_TCB_OFFSET))[-1].hwcap)
> +# define THREAD_SET_HWCAP(value) \
> + if (value & PPC_FEATURE_ARCH_2_06) \
> + value |= PPC_FEATURE_ARCH_2_05 | \
> + PPC_FEATURE_POWER5_PLUS | \
> + PPC_FEATURE_POWER5 | \
> + PPC_FEATURE_POWER4; \
> + else if (value & PPC_FEATURE_ARCH_2_05) \
> + value |= PPC_FEATURE_POWER5_PLUS | \
> + PPC_FEATURE_POWER5 | \
> + PPC_FEATURE_POWER4; \
> + else if (value & PPC_FEATURE_POWER5_PLUS) \
> + value |= PPC_FEATURE_POWER5 | \
> + PPC_FEATURE_POWER4; \
> + else if (value & PPC_FEATURE_POWER5) \
> + value |= PPC_FEATURE_POWER4; \
This very logic is already presented at other powerpc32 sysdep file [1].
Instead of duplicate the logic, I think it is better to move it in a common
file.
[1] sysdeps/powerpc/powerpc32/power4/multiarch/init-arch.h
> Index: glibc-working/sysdeps/powerpc/sys/platform/ppc.h
> ===================================================================
> --- glibc-working.orig/sysdeps/powerpc/sys/platform/ppc.h
> +++ glibc-working/sysdeps/powerpc/sys/platform/ppc.h
> @@ -23,6 +23,86 @@
> #include <stdint.h>
> #include <bits/ppc.h>
>
> +
> +/* Get the hwcap/hwcap2 information from the TCB. Offsets taken
> + from tcb-offsets.h. */
> +static inline uint32_t
> +__ppc_get_hwcap (void)
> +{
> +
> + uint32_t __tcb_hwcap;
> +
> Index: glibc-working/sysdeps/powerpc/sys/platform/ppc.h
> ===================================================================
> --- glibc-working.orig/sysdeps/powerpc/sys/platform/ppc.h
> +++ glibc-working/sysdeps/powerpc/sys/platform/ppc.h
> @@ -23,6 +23,86 @@
> #include <stdint.h>
> #include <bits/ppc.h>
>
> +
> +/* Get the hwcap/hwcap2 information from the TCB. Offsets taken
> + from tcb-offsets.h. */
> +static inline uint32_t
> +__ppc_get_hwcap (void)
> +{
> +
> + uint32_t __tcb_hwcap;
> +
> +#ifdef __powerpc64__
> + register unsigned long __tp __asm__ ("r13");
> + __asm__ volatile ("lwz %0,-28772(%1)\n"
> + : "=r" (__tcb_hwcap)
> + : "r" (__tp));
> +#else
> + register unsigned long __tp __asm__ ("r2");
> + __asm__ volatile ("lwz %0,-28724(%1)\n"
> + : "=r" (__tcb_hwcap)
> + : "r" (__tp));
> +#endif
> +
> + return __tcb_hwcap;
> +}
There is no need to use underline names inside inline functions. I would also
change to something more simple like:
#ifdef __powerpc64__
# define __TPREG "r13"
# define __HWCAP1OFF -28772
#else
# define __TPREG "r2"
# define __HWCAP1OFF -28724
#else
static inline uint32_t
__ppc_get_hwcap (void)
{
uint32_t tcb_hwcap;
register unsigned long tp __asm__ (__TPREG);
__asm__ ("lwz %0, %1(%2)\n"
: "=r" (tcb_hwcap)
: "i" (__HWCAPOFF), "r" (tp));
return tcp_hwcap;
}
I also think the volatile in asm is not required (there is no need to refrain
compiler to possible optimize this load inside the inline function itself).
> Index: glibc-working/sysdeps/powerpc/test-get_hwcap.c
> ===================================================================
> --- /dev/null
> +++ glibc-working/sysdeps/powerpc/test-get_hwcap.c
The test are not wrong, but you could make only on test for this functionality,
instead of splitting the set and get in different ones.