
Re: RFC: Program Properties



On Wed, Oct 26, 2016 at 11:15 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Tue, Oct 25, 2016 at 4:46 AM, Maciej W. Rozycki <macro@imgtec.com> wrote:
>> On Mon, 17 Oct 2016, H.J. Lu wrote:
>>
>>> > 4. A reject flag: if such annotated the ABI flag requires explicit support
>>> >    (special handling beyond the three variants above) and linking fails if
>>> >    it is set in any input object and the linker does know this ABI flag.
>>>
>>> "reject" isn't very clear.  Is "mandatory" better?
>>
>>  Such property seen by the component addressed (be it the static linker,
>> dynamic loader or OS kernel) would cause the binary to be rejected unless
>> already explicitly recognised by the component.  Or IOW unknown such
>> properties would be rejected and known ones handled as required.  Hence
>> the name proposed.
>>
>>  That written, having thought about it some more, I think we don't
>> actually need such an explicit flag as I think we can reasonably set this
>> semantics as the default.  That is any unknown property *not* annotated
>> with one of the known flags would be rejected, making an explicit "reject"
>> flag redundant.
>>
>>> > Such annotation would of course have to be consistent across input files.
>>> >
>>> >  Such ABI flag flags would allow ABIs to define new ABI flags processed
>>> > automatically in static linking without the need to upgrade the linker
>>> > each time a flag is added.
>>> >
>>> >  Thoughts?
>>>
>>> Property values can be divided into ranges of different rules, including
>>> rules which differ from above.
>>
>>  I'm not sure defining fixed ranges has an advantage over using property
>> annotation.  I think it's hard to assess beforehand how many values we may
>> need in each range and if we make a range allocated too narrow, then we
>> risk running out of entries within, whereas if we make one too broad, then
>
> We can add another property note if we run out of property types.
>
>> we risk running out of the allocation space.  On the other hand by using
>> explicit property annotation we will only have consumed as much of the
>> allocation space as has actually been defined at any point in time.
>>
>>  Have I missed anything?
>>
>
> The question is where annotation is stored.  It is either stored in
> property type or property data.  If it is encoded in type, it will limit
> number of usable types.  It it hard to tell how many types will
> be needed in the future.  I can't image that we need hundreds
> of property types in a file and we can always add a new note if needed.
> If it is encoded in data, it should be stored in the first few bytes,
> which increases data size or make run-time processing less efficient
> because of little endian vs big endian.
>
> We can first identify how many different annotations we need and
> figure out what the best way to encode them for both extensibility
> as well as run-time efficiency.
>

Here is the updated proposal with annotations in program
property types.  If we run out of types, we can add another
property note.


-- 
H.J.
---
Program Properties

There are cases where linker and run-time loader need more information
about ELF objects beyond what the current gABI provides:

1. Minimum ISAs.  Executables and shared objects, which are optimized
specifically to run on a particular processor, will not run on processors
which don't support the same set of ISAs.  Since x86 only has EM_IAMCU,
EM_386 and EM_X86_64 ELF machine codes, run-time loader needs additional
information to tell if an executable or a shared object is compatible
with available ISAs.
2. Stack size.  Compilers may generate binaries which require larger stack
size than normal.  If run-time loader can query the stack size required
by executable or shared object, it can increase stack size as needed.
3. Copy relocation and protected visibility are fundamentally incompatible.
On one hand, copy relocation is part of the psABI and is used to
access global data defined in a shared object from the executable.  It
moves the definition of global data, which is defined in a shared object,
to the executable at run-time.  On the other hand, protected visibility
indicates that a symbol is defined locally in the shared object at
run-time.  Both can't be true at the same time.  The current solution
is to make protected symbols behave more or less like normal symbols,
which prevents optimizing local access to protected symbols within the
shared object.

GNU attributes

GNU binutils supports build attribute and run-time platform compatibility
data in relocatable object files.  Issues with GNU attributes:

1. Many architectures, including x86, don't support GNU attributes.
2. On x86, linking a relocatable object full of AVX instructions doesn't
always make the resulting executable or shared library require AVX
to run, since AVX functions may be called only via GNU_IFUNC at run-time.
Linker can't set minimum ISAs just from ISAs used by input relocatable
objects.
3. There is no program segment for GNU attributes in executables and
shared objects.
4. Most attributes aren't applicable to run-time loader.
5. The format of GNU attributes isn't optimal for run-time loader.  A
separate string table is used to store string attributes.

gABI support for program properties

To the "Special Sections" section, add:

     Name              Type                 Attributes
.note.gnu.property    SHT_NOTE              SHF_ALLOC

A .note.gnu.property section contains at least one property note
descriptor, starting with a property note descriptor header and
followed by an array of properties.  The property note descriptor
header has the following structure:

typedef struct {
  Elf_Word namesz;
  Elf_Word descsz;
  Elf_Word type;
  unsigned char name[4];
} Elf_GNU_Notehdr;

1. namesz is 4.
2. descsz contains the size of the property array.
3. type specifies the property type:

#define NT_GNU_PROPERTY_TYPE_0   5

4. name is a null-terminated character string. It should be "GNU".
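
As a rough illustration of the four rules above, a consumer could
recognize a property note header as sketched below.  This is only a
sketch: Elf_Word is assumed to be a 32-bit unsigned integer, the fields
are assumed to be already in host byte order, and is_property_note is a
hypothetical helper name, not part of the proposal.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NT_GNU_PROPERTY_TYPE_0   5

/* Stand-in for the header above; Elf_Word is assumed to be uint32_t.  */
typedef struct {
  uint32_t namesz;
  uint32_t descsz;
  uint32_t type;
  unsigned char name[4];
} Elf_GNU_Notehdr;

/* Hypothetical helper: return 1 if H looks like a property note
   header (namesz is 4, type is NT_GNU_PROPERTY_TYPE_0 and name is
   the null-terminated string "GNU"), 0 otherwise.  */
static int
is_property_note (const Elf_GNU_Notehdr *h)
{
  assert (h != NULL);
  return (h->namesz == 4
          && h->type == NT_GNU_PROPERTY_TYPE_0
          && memcmp (h->name, "GNU", 4) == 0);
}
```

Notes that fail this check would simply be skipped by a property
consumer rather than treated as errors.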

Each array element represents one property with type, data size and data.
In 64-bit objects, each element is an array of 8-byte words, whose first
element is 4-byte type and data size, in the format of the target processor.
In 32-bit objects, each element is an array of 4-byte words, whose first 2
elements are 4-byte type and data size, in the format of the target
processor.  An array element has the following structure:

typedef struct {
  Elf_Word pr_type;
  Elf_Word pr_datasz;
  unsigned char pr_data[PR_DATASZ];
  unsigned char pr_padding[PR_PADDING];
} Elf_Prop;

where PR_DATASZ is the data size and PR_PADDING, if necessary, aligns
array element to 8 or 4-byte alignment (depending on whether the file
is a 64-bit or 32-bit object).  The array elements are sorted by the
property type.  The interpretation of property array depends on both
ph_kind and pr_type.
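
To make the layout concrete, here is a minimal sketch of how a consumer
might walk the property array.  It assumes the descriptor bytes are
already in host byte order and that the caller passes 8 or 4 as the
alignment for 64-bit or 32-bit objects.  The names walk_properties and
prop_visitor are hypothetical, and the sketch does not check that the
elements are sorted by property type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback: return non-zero to abort the walk.  */
typedef int (*prop_visitor) (uint32_t pr_type, const unsigned char *pr_data,
                             uint32_t pr_datasz, void *ctx);

/* Walk SIZE bytes of property array (SIZE is the note's descsz).
   ALIGN is 8 for 64-bit objects and 4 for 32-bit objects.
   Return the number of properties seen, or -1 if the array is
   malformed or the visitor aborted.  */
static int
walk_properties (const unsigned char *desc, size_t size, size_t align,
                 prop_visitor visit, void *ctx)
{
  size_t off = 0;
  int count = 0;

  assert (align == 4 || align == 8);
  while (off < size)
    {
      uint32_t pr_type, pr_datasz;
      size_t padding;

      /* pr_type and pr_datasz take 8 bytes together.  */
      if (size - off < 8)
        return -1;
      memcpy (&pr_type, desc + off, 4);
      memcpy (&pr_datasz, desc + off + 4, 4);
      off += 8;

      if (pr_datasz > size - off)
        return -1;
      if (visit != NULL && visit (pr_type, desc + off, pr_datasz, ctx) != 0)
        return -1;
      off += pr_datasz;

      /* Skip pr_padding so the next element starts aligned.  */
      padding = (align - pr_datasz % align) % align;
      if (padding > size - off)
        return -1;
      off += padding;
      count++;
    }
  return count;
}
```

Passing a null visitor just validates the layout and counts the
elements.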

Types of program properties

The low-order 3 bits of a program property type indicate how it should
be processed.

#define GNU_PROPERTY_TYPE_SHIFT    3
#define GNU_PROPERTY_TYPE_MASK     (-(1 << GNU_PROPERTY_TYPE_SHIFT))
#define GNU_PROPERTY_EVAL_MASK     ((1 << GNU_PROPERTY_TYPE_SHIFT) - 1)

#define GNU_PROPERTY_EVAL_REQ      0

Linker should refuse to generate output if input property type is
unknown.

#define GNU_PROPERTY_EVAL_EQ       1

Linker should refuse to generate output if input property data aren't
identical.

#define GNU_PROPERTY_EVAL_OR       2

Output property data is logical OR of input property data.

#define GNU_PROPERTY_EVAL_AND      3

Output property data is logical AND of input property data.

Linker should refuse to generate output for other evaluation values in
input property type.
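
The evaluation rules above could be applied by a linker roughly as
follows.  This is only a sketch under some assumptions of mine: "logical
OR/AND" is taken to mean a bitwise operation over the data bytes, both
inputs are assumed to have the same pr_datasz, and merge_property and
its "known" argument are hypothetical.  How a property missing from one
input is handled is not covered here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GNU_PROPERTY_TYPE_SHIFT    3
#define GNU_PROPERTY_EVAL_MASK     ((1u << GNU_PROPERTY_TYPE_SHIFT) - 1)
#define GNU_PROPERTY_EVAL_REQ      0
#define GNU_PROPERTY_EVAL_EQ       1
#define GNU_PROPERTY_EVAL_OR       2
#define GNU_PROPERTY_EVAL_AND      3

/* Hypothetical helper: fold input data IN into accumulated output data
   OUT (both DATASZ bytes).  KNOWN is non-zero if the linker recognizes
   PR_TYPE.  Return 0 on success, -1 if the linker should refuse to
   generate output.  */
static int
merge_property (uint32_t pr_type, int known, unsigned char *out,
                const unsigned char *in, size_t datasz)
{
  size_t i;

  assert (datasz == 0 || (out != NULL && in != NULL));
  switch (pr_type & GNU_PROPERTY_EVAL_MASK)
    {
    case GNU_PROPERTY_EVAL_REQ:
      /* Known types need type-specific handling done elsewhere.  */
      return known ? 0 : -1;
    case GNU_PROPERTY_EVAL_EQ:
      return memcmp (out, in, datasz) == 0 ? 0 : -1;
    case GNU_PROPERTY_EVAL_OR:
      for (i = 0; i < datasz; i++)
        out[i] |= in[i];
      return 0;
    case GNU_PROPERTY_EVAL_AND:
      for (i = 0; i < datasz; i++)
        out[i] &= in[i];
      return 0;
    default:
      /* Other evaluation values are rejected.  */
      return -1;
    }
}
```

With this shape, only GNU_PROPERTY_EVAL_REQ properties need explicit
linker support; the other three can be merged without the linker
knowing what the property means.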

#define GNU_PROPERTY_LOPROC        0xb0000000
#define GNU_PROPERTY_HIPROC        (0xdfffffff&GNU_PROPERTY_TYPE_MASK)
#define GNU_PROPERTY_LOUSER        0xe0000000
#define GNU_PROPERTY_HIUSER        (0xffffffff&GNU_PROPERTY_TYPE_MASK)
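
For illustration, a property type can be composed from a range base, an
index and an evaluation value, and taken apart again, as sketched below.
The helper names are hypothetical, and the masks are written with
unsigned constants to keep the arithmetic in 32-bit unsigned integers;
otherwise the values match the definitions above.

```c
#include <assert.h>
#include <stdint.h>

#define GNU_PROPERTY_TYPE_SHIFT    3
#define GNU_PROPERTY_TYPE_MASK     (-(1u << GNU_PROPERTY_TYPE_SHIFT))
#define GNU_PROPERTY_EVAL_MASK     ((1u << GNU_PROPERTY_TYPE_SHIFT) - 1)

#define GNU_PROPERTY_LOPROC        0xb0000000u
#define GNU_PROPERTY_HIPROC        (0xdfffffffu & GNU_PROPERTY_TYPE_MASK)
#define GNU_PROPERTY_LOUSER        0xe0000000u
#define GNU_PROPERTY_HIUSER        (0xffffffffu & GNU_PROPERTY_TYPE_MASK)

/* Hypothetical helper: build a property type from a range base
   (0, GNU_PROPERTY_LOPROC or GNU_PROPERTY_LOUSER), an index within
   that range and an evaluation value.  */
static uint32_t
prop_type_make (uint32_t base, uint32_t index, uint32_t eval)
{
  assert (eval <= GNU_PROPERTY_EVAL_MASK);
  return base | (index << GNU_PROPERTY_TYPE_SHIFT) | eval;
}

/* Evaluation value stored in the low-order bits.  */
static uint32_t
prop_type_eval (uint32_t pr_type)
{
  return pr_type & GNU_PROPERTY_EVAL_MASK;
}

/* Non-zero if PR_TYPE is in the processor-specific range.  */
static int
prop_type_is_proc (uint32_t pr_type)
{
  uint32_t t = pr_type & GNU_PROPERTY_TYPE_MASK;
  return t >= GNU_PROPERTY_LOPROC && t <= GNU_PROPERTY_HIPROC;
}

/* Non-zero if PR_TYPE is in the user range.  */
static int
prop_type_is_user (uint32_t pr_type)
{
  uint32_t t = pr_type & GNU_PROPERTY_TYPE_MASK;
  return t >= GNU_PROPERTY_LOUSER && t <= GNU_PROPERTY_HIUSER;
}
```

The range checks compare the type with its evaluation bits masked off,
which is why the HI bounds above are themselves masked.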

Proposed properties

For NT_GNU_PROPERTY_TYPE_0:

#define GNU_PROPERTY_STACK_SIZE \
 ((1 << GNU_PROPERTY_TYPE_SHIFT)|GNU_PROPERTY_EVAL_REQ)

Integer value for the minimum stack size, which is 8 bytes in 64-bit
objects and 4 bytes in 32-bit objects.

#define GNU_PROPERTY_NO_COPY_ON_PROTECTED \
 ((2 << GNU_PROPERTY_TYPE_SHIFT)|GNU_PROPERTY_EVAL_REQ)

Its pr_datasz is 0.  This indicates that there should be no copy
relocations against protected data symbols.  If a relocatable object
contains this property, linker should treat protected data symbols as
defined locally at run-time and copy this property to the output shared
object.  Run-time loader should disallow copy relocations against
protected data symbols defined in shared objects with
GNU_PROPERTY_NO_COPY_ON_PROTECTED property.

#define GNU_PROPERTY_X86_ISA_1_USED \
  ((0 << GNU_PROPERTY_TYPE_SHIFT)|GNU_PROPERTY_LOPROC|GNU_PROPERTY_EVAL_OR)

The x86 instruction sets indicated by the corresponding bits are used
in the program, but their support in the hardware is optional.

#define GNU_PROPERTY_X86_ISA_1_NEEDED \
  ((1 << GNU_PROPERTY_TYPE_SHIFT)|GNU_PROPERTY_LOPROC|GNU_PROPERTY_EVAL_OR)

The x86 instruction sets indicated by the corresponding bits are used
in program and they must be supported by the hardware.  A bit set in
GNU_PROPERTY_X86_ISA_1_NEEDED must also be set in
GNU_PROPERTY_X86_ISA_1_USED.

4-byte integer value for the x86 instruction set support.

#define GNU_PROPERTY_X86_ISA_1_486           (1U << 0)
#define GNU_PROPERTY_X86_ISA_1_586           (1U << 1)
#define GNU_PROPERTY_X86_ISA_1_686           (1U << 2)
#define GNU_PROPERTY_X86_ISA_1_SSE           (1U << 3)
#define GNU_PROPERTY_X86_ISA_1_SSE2          (1U << 4)
#define GNU_PROPERTY_X86_ISA_1_SSE3          (1U << 5)
#define GNU_PROPERTY_X86_ISA_1_SSSE3         (1U << 6)
#define GNU_PROPERTY_X86_ISA_1_SSE4_1        (1U << 7)
#define GNU_PROPERTY_X86_ISA_1_SSE4_2        (1U << 8)
#define GNU_PROPERTY_X86_ISA_1_AVX           (1U << 9)
#define GNU_PROPERTY_X86_ISA_1_AVX2          (1U << 10)
#define GNU_PROPERTY_X86_ISA_1_AVX512F       (1U << 11)
#define GNU_PROPERTY_X86_ISA_1_AVX512CD      (1U << 12)
#define GNU_PROPERTY_X86_ISA_1_AVX512ER      (1U << 13)
#define GNU_PROPERTY_X86_ISA_1_AVX512PF      (1U << 14)
#define GNU_PROPERTY_X86_ISA_1_AVX512VL      (1U << 15)
#define GNU_PROPERTY_X86_ISA_1_AVX512DQ      (1U << 16)
#define GNU_PROPERTY_X86_ISA_1_AVX512BW      (1U << 17)
#define GNU_PROPERTY_X86_ISA_1_ENDBR         (1U << 18)
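
As a sketch of how a run-time loader might use
GNU_PROPERTY_X86_ISA_1_NEEDED, the check below compares the needed bits
against a mask of ISAs the processor supports.  The function name
isa_can_run and the cpu_isa argument are hypothetical; how the loader
obtains cpu_isa (for example from CPUID) is outside the proposal, and
pr_data is assumed to be in host byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define GNU_PROPERTY_X86_ISA_1_SSE2          (1U << 4)
#define GNU_PROPERTY_X86_ISA_1_AVX           (1U << 9)
#define GNU_PROPERTY_X86_ISA_1_AVX2          (1U << 10)

/* Hypothetical helper: PR_DATA/PR_DATASZ are the data of a
   GNU_PROPERTY_X86_ISA_1_NEEDED property and CPU_ISA has a bit set
   for each ISA the processor supports.  Return 1 if every needed ISA
   is supported, 0 if some are missing, -1 if the property data is not
   a 4-byte integer.  */
static int
isa_can_run (const unsigned char *pr_data, uint32_t pr_datasz,
             uint32_t cpu_isa)
{
  uint32_t needed;

  assert (pr_data != NULL);
  if (pr_datasz != sizeof needed)
    return -1;
  memcpy (&needed, pr_data, sizeof needed);
  return (needed & ~cpu_isa) == 0;
}
```

GNU_PROPERTY_X86_ISA_1_USED would not be checked this way, since
support for those ISAs is optional.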