This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



[RFC] dl-procinfo and HWCAP_IMPORTANT support for powerpc


The recently committed dl-procinfo support for powerpc provides names for
the AT_HWCAP bits as of kernel 2.6.15. This update also defines
HWCAP_IMPORTANT, which allows for HWCAP-based extensions to the library
search path. The intent is to allow the loader (ld.so) to select the most
appropriate version of a library for the hardware we are running on. This
is equivalent to the support for i686-optimized libraries on the i386
platform.

The 2.6.15 kernel defines the following AT_HWCAP bits:

#define PPC_FEATURE_32              0x80000000 /* 32-bit mode. */
#define PPC_FEATURE_64              0x40000000 /* 64-bit mode. */
#define PPC_FEATURE_601_INSTR       0x20000000 /* 601 chip, Old POWER ISA.  */
#define PPC_FEATURE_HAS_ALTIVEC     0x10000000 /* SIMD/Vector Unit.  */
#define PPC_FEATURE_HAS_FPU         0x08000000 /* Floating Point Unit.  */
#define PPC_FEATURE_HAS_MMU         0x04000000 /* Memory Management Unit.  */
#define PPC_FEATURE_HAS_4xxMAC      0x02000000 /* 4xx Multiply Accumulator.  */
#define PPC_FEATURE_UNIFIED_CACHE   0x01000000 /* Unified I/D cache.  */
#define PPC_FEATURE_HAS_SPE         0x00800000
#define PPC_FEATURE_HAS_EFP_SINGLE  0x00400000
#define PPC_FEATURE_HAS_EFP_DOUBLE  0x00200000
#define PPC_FEATURE_NO_TB           0x00100000 /* 601/403gx have no timebase */
#define PPC_FEATURE_POWER4          0x00080000 /* POWER4 microarch level */
#define PPC_FEATURE_POWER5          0x00040000 /* POWER5 microarch level */
#define PPC_FEATURE_POWER5_PLUS     0x00020000 /* POWER5+ microarch level */
#define PPC_FEATURE_CELL            0x00010000 /* CELL PU microarch level */

These bits have been given the following procinfo names, in the same order:
      "ppc32",
      "ppc64",
      "ppc601",
      "altivec",
      "fpu",
      "mmu",
      "4xxmac",
      "ucache",
      "spe",
      "efpsingle",
      "efpdouble",
      "notb",
      "power4",
      "power5",
      "power5+",
      "cell"
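
As a rough illustration of how the bits and names line up, here is a
minimal sketch of a bit-to-name lookup. The array and the helper
ppc_cap_name are hypothetical names made up for this example and are not
the actual dl-procinfo code; the only thing taken from the lists above is
that the names follow the bits in descending order, from 0x80000000 down
to 0x00010000.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Names in descending bit order: index 0 is 0x80000000 ("ppc32"),
   index 15 is 0x00010000 ("cell"), matching the lists above.  */
static const char *const ppc_cap_names[16] = {
  "ppc32", "ppc64", "ppc601", "altivec", "fpu", "mmu", "4xxmac", "ucache",
  "spe", "efpsingle", "efpdouble", "notb", "power4", "power5", "power5+",
  "cell"
};

/* Hypothetical helper: return the name for a single AT_HWCAP bit,
   or NULL if that bit has no name in the list above.  */
static const char *
ppc_cap_name (uint32_t bit)
{
  /* The caller must pass exactly one set bit.  */
  assert (bit != 0 && (bit & (bit - 1)) == 0);

  for (int i = 0; i < 16; i++)
    if (bit == (UINT32_C (0x80000000) >> i))
      return ppc_cap_names[i];
  return NULL;
}
```

With this layout, ppc_cap_name (0x10000000) yields "altivec" and
ppc_cap_name (0x00080000) yields "power4".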

The last 4 names represent architecture feature levels with a corresponding
ISA (see Rationale: below for details). The proposed HWCAP_IMPORTANT mask
is:

+#define HWCAP_IMPORTANT      (PPC_FEATURE_HAS_ALTIVEC                  \
+                                   | PPC_FEATURE_POWER4                \
+                                   | PPC_FEATURE_POWER5                \
+                                   | PPC_FEATURE_POWER5_PLUS           \
+                                   | PPC_FEATURE_CELL)

This is the minimum set for defining unique micro-architectural or ISA
features. The dl-procinfo implementation uses this information to augment
the library search list. The proposed correspondence between processors and
runtime library search directories (assuming nptl and 32-bit) is:

processor     library search
==========    ============
power4        /lib/tls/power4, /lib/tls, /lib
power5        /lib/tls/power5, /lib/tls, /lib
power5+       /lib/tls/power5+, /lib/tls, /lib
970           /lib/tls/altivec/power4, /lib/tls/altivec, /lib/tls, /lib
cell          /lib/tls/cell, /lib/tls, /lib

Similarly for 64-bit and /lib64. Since linuxthreads is deprecated, the
directory structure may be simplified (eliminating the tls level of the
directory). If linuxthreads is still supported, it is possible to support
only one implementation of linuxthreads and provide optimized libraries
only for nptl. In this case LD_ASSUME_KERNEL and the ABI note can be used
to simplify the directory structure (as in i386/i686).
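
The processor-to-directory table above can be reproduced with a small
sketch like the following. This is only an illustration under stated
assumptions: search_dirs and append are hypothetical helpers, the "/lib/tls"
prefix is hard-coded for the 32-bit nptl case, and only the nested prefixes
shown in the table are generated. The real loader may consider further
combinations of the important bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct cap { uint32_t bit; const char *name; };

/* The HWCAP_IMPORTANT bits and their procinfo names, highest bit first.  */
static const struct cap important[] = {
  { 0x10000000u, "altivec" },
  { 0x00080000u, "power4" },
  { 0x00040000u, "power5" },
  { 0x00020000u, "power5+" },
  { 0x00010000u, "cell" },
};
#define N_IMPORTANT (sizeof important / sizeof important[0])

/* Append S to OUT at offset *USED; return -1 if it does not fit.  */
static int
append (char *out, size_t outlen, size_t *used, const char *s)
{
  size_t len = strlen (s);
  if (*used + len + 1 > outlen)
    return -1;
  memcpy (out + *used, s, len + 1);
  *used += len;
  return 0;
}

/* Hypothetical helper: write the comma-separated search list for HWCAP
   into OUT.  Returns the number of directories, or -1 on overflow.  */
static int
search_dirs (uint32_t hwcap, char *out, size_t outlen)
{
  const char *names[N_IMPORTANT];
  size_t n = 0, used = 0;
  int count = 0;

  assert (out != NULL && outlen > 0);
  out[0] = '\0';

  for (size_t i = 0; i < N_IMPORTANT; i++)
    if (hwcap & important[i].bit)
      names[n++] = important[i].name;

  /* Most specific directory first: n names, then n-1, ..., then none.  */
  for (size_t k = n + 1; k-- > 0; )
    {
      if (append (out, outlen, &used, count ? ", /lib/tls" : "/lib/tls") < 0)
        return -1;
      for (size_t j = 0; j < k; j++)
        if (append (out, outlen, &used, "/") < 0
            || append (out, outlen, &used, names[j]) < 0)
          return -1;
      count++;
    }

  /* The unqualified default directory always comes last.  */
  if (append (out, outlen, &used, ", /lib") < 0)
    return -1;
  return count + 1;
}
```

For a hwcap with the altivec and power4 bits set (the 970 row), this
produces "/lib/tls/altivec/power4, /lib/tls/altivec, /lib/tls, /lib".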

Note: The additional "altivec" level for the 970 is an artifact of
encoding the 970 with 2 bits in the AT_HWCAP. The "altivec" directory can
be used to store Altivec optimized libraries, including 32-bit G4
implementations.
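
To make the two-bit encoding concrete, the sketch below masks an AT_HWCAP
value with the proposed HWCAP_IMPORTANT and checks for the 970 pattern. The
helpers important_hwcap, looks_like_970 and self_check are hypothetical,
and the sample AT_HWCAP values are illustrative assumptions built from the
bit definitions above, not values captured from real hardware.

```c
#include <assert.h>
#include <stdint.h>

#define PPC_FEATURE_32           0x80000000u
#define PPC_FEATURE_64           0x40000000u
#define PPC_FEATURE_HAS_ALTIVEC  0x10000000u
#define PPC_FEATURE_HAS_FPU      0x08000000u
#define PPC_FEATURE_HAS_MMU      0x04000000u
#define PPC_FEATURE_POWER4       0x00080000u
#define PPC_FEATURE_POWER5       0x00040000u
#define PPC_FEATURE_POWER5_PLUS  0x00020000u
#define PPC_FEATURE_CELL         0x00010000u

#define HWCAP_IMPORTANT (PPC_FEATURE_HAS_ALTIVEC | PPC_FEATURE_POWER4 \
                         | PPC_FEATURE_POWER5 | PPC_FEATURE_POWER5_PLUS \
                         | PPC_FEATURE_CELL)

/* Keep only the bits that influence the library search path.  */
static uint32_t
important_hwcap (uint32_t hwcap)
{
  return hwcap & HWCAP_IMPORTANT;
}

/* Hypothetical predicate: the 970 is "power4" plus "altivec".  */
static int
looks_like_970 (uint32_t hwcap)
{
  const uint32_t both = PPC_FEATURE_HAS_ALTIVEC | PPC_FEATURE_POWER4;
  return (important_hwcap (hwcap) & both) == both;
}

/* Usage example with assumed, illustrative AT_HWCAP values.  */
static void
self_check (void)
{
  uint32_t common = PPC_FEATURE_32 | PPC_FEATURE_HAS_FPU | PPC_FEATURE_HAS_MMU;
  uint32_t ppc970 = common | PPC_FEATURE_64 | PPC_FEATURE_HAS_ALTIVEC
                    | PPC_FEATURE_POWER4;
  uint32_t g4 = common | PPC_FEATURE_HAS_ALTIVEC;

  assert (looks_like_970 (ppc970));
  /* A G4 has Altivec but no power4 bit, so only "altivec" remains.  */
  assert (!looks_like_970 (g4));
  assert (important_hwcap (g4) == PPC_FEATURE_HAS_ALTIVEC);
}
```

On a current glibc the input value could come from getauxval (AT_HWCAP) in
<sys/auxv.h>; that interface is newer than this proposal and is mentioned
only as a convenient way to try the sketch.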

Rationale:

The power4 and 970 implement the full 64-bit PowerPC Version 2.0 ISA
including the "optional" "General Purpose" and "Graphics" groups (for
example, fsqrt and fsqrts). The power4 processors are more aggressively
pipelined, with out-of-order issue to 8 pipelines, and implement a weakly
consistent storage model. More importantly, the Fixed Point, Floating
Point and Load/Store units are paired and symmetrical. This leads to
optimizations that execute more instructions (in parallel) to get shorter
execution (fewer total cycles). These optimizations might actually run
slower on older processors (which can't support the same level of parallel
execution) but are necessary to get full performance out of the power4.

Previous (power3, G4) PowerPC implementations implemented the older Version
1.x ISA, had fewer pipelines, may not implement the optional instructions,
and/or implemented a strongly consistent storage model. So knowing that we
are running on a power4 (or newer) processor is very useful information.

The power5 processor implements the full 64-bit PowerPC Version 2.02 ISA
(adds the popcntb, fre, frsqrtes instructions). The power5 also has a
deeper storage queue (stores are more out-of-order than on power4). The
power5+ processor implements the 64K page support and 4 additional FP
instructions.

The 970 chip (Apple G5, IBM JS20 blade) is represented as "power4" with the
"altivec" modifier. This is appropriate because the current 970
microarchitecture design was derived from the power4+ design with the
addition of the VMX unit. So fixed point and floating point (non-VMX)
optimizations for power4 are applicable to the 970.

We have made a deliberate decision not to identify older processor
generations (POWER3, G4). They are no longer in production and/or are
adequately covered by the current glibc implementation. They will continue
to be supported by the current base/default implementation (as defined by
gcc options -mcpu=powerpc for 32-bit and -mcpu=powerpc64 for 64-bit).



Steven J. Munroe
Linux on Power Toolchain Architect
IBM Corporation, Linux Technology Center

