This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.


Re: ELF octets_per_byte

From: "Maciej W. Rozycki" <macro at linux-mips dot org>
To: dgisselq at ieee dot org
Cc: binutils at sourceware dot org
Date: Thu, 25 Feb 2016 01:47:12 +0000 (GMT)
Subject: Re: ELF octets_per_byte
References: <1456242622 dot 30661 dot 448 dot camel at jericho>

On Tue, 23 Feb 2016, Dan wrote:

> For the purpose of beginning a discussion, and based upon a reading of
> the ELF specification, I propose the following values be in units of
> "octets":
> 
> section size
> section header size
> section header offset
> 
> For the most part, these values *must* be in octets, or it will be
> impossible to read and process an ELF file.

 These express structures in a file as seen on a storage medium.  I think 
this pretty much mandates that they are expressed in octets, as file 
structure representation has to be consistent among ELF targets so that 
files can be handled in a portable manner, as you have correctly observed.
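
A minimal sketch of why these fields have to be in octets: a generic reader
that knows nothing about the target can only compare them against the file
length, which the host reports in octets.  The function name below is
hypothetical and the parameters merely mirror the sh_offset and sh_size
fields of a section header; this is not taken from BFD.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true if the section contents, i.e. the
   range [sh_offset, sh_offset + sh_size), lie entirely inside a file
   that is file_size octets long.  All three values are assumed to be
   in octets, whatever the width of a target byte is.  */
bool
section_fits_in_file (uint64_t sh_offset, uint64_t sh_size,
                      uint64_t file_size)
{
  if (sh_offset > file_size)
    return false;
  /* Written as a subtraction so that sh_offset + sh_size cannot wrap.  */
  return sh_size <= file_size - sh_offset;
}
```

If sh_size were counted in target bytes instead, this check would need the
target's byte width before the file could even be validated.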

 Please also note that the ELF gABI[1] is very explicit about a byte being 
8 bits wide:

"As described here, the object file format supports various processors 
with 8-bit bytes and either 32-bit or 64-bit architectures.  
Nevertheless, it is intended to be extensible to larger (or smaller) 
architectures.  Object files therefore represent some control data with a 
machine-independent format, making it possible to identify object files 
and interpret their contents in a common way.  Remaining data in an object 
file use the encoding of the target processor, regardless of the machine 
on which the file was created."

so whenever it refers to a "byte" I think it really means an octet, 
although I do see an ambiguity here as sometimes it uses the term to mean 
a target byte.

> I also propose that the following values are in units of target address
> space "bytes":
> 
> ELF header "entry" address
> section header address
> symbol value
> symbol size
> relocation offset
> relocation addend

 These express target addresses or are directly related to them (e.g. 
offsets) and therefore I'm sure they're best expressed in whatever format 
your target uses.  These IMHO certainly qualify as "remaining data" 
referred to in the gABI citation included above.

 So with the entry point, for example, I'd expect whatever representation a 
function pointer stored in memory would have on your target if the 
function pointed to was the intended entry point.  Likewise with VMAs and 
LMAs used in program headers, section headers, symbol tables, etc.
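
As a rough sketch of how the two kinds of unit would meet in a tool,
assuming the proposal quoted above (section address in target bytes,
section file offset in octets): locating the octets behind a target address
means scaling the address difference by the number of octets per target
byte.  The function and parameter names are hypothetical; octets_per_byte
stands for whatever the target defines.

```c
#include <stdint.h>

/* Hypothetical helper: file offset (in octets) of the target address
   addr, for a section loaded at sh_addr (target bytes) whose contents
   start at sh_offset (octets).  The caller is assumed to have checked
   that addr falls inside the section.  */
uint64_t
file_offset_of_address (uint64_t addr, uint64_t sh_addr,
                        uint64_t sh_offset, unsigned int octets_per_byte)
{
  return sh_offset + (addr - sh_addr) * octets_per_byte;
}
```

With 32-bit target bytes (octets_per_byte of 4), an address two units past
the start of the section lands eight octets past sh_offset.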

 These do not necessarily even have to be "proper" memory addresses; for 
example the MIPS processor encodes the execution mode in bit #0 of code 
addresses, so in certain cases the entry point in MIPS ELF binaries will 
have bit #0 set even though the memory location referred to will have this 
bit clear.  So it's really up to you to decide which encoding is the 
most appropriate for your architecture.
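
The MIPS case can be sketched as follows.  The helper names are
hypothetical and a 32-bit address is assumed purely for brevity; the only
fact relied upon is the one stated above, that bit #0 of a code address
carries the execution mode rather than being part of the memory location.

```c
#include <stdint.h>

/* Hypothetical helper: memory location a MIPS code address refers to,
   i.e. the address with the mode bit (bit #0) cleared.  */
uint32_t
code_memory_address (uint32_t code_addr)
{
  return code_addr & ~UINT32_C (1);
}

/* Hypothetical helper: execution mode encoded in bit #0 (0 or 1).  */
unsigned int
code_execution_mode (uint32_t code_addr)
{
  return code_addr & 1u;
}
```

An entry point of 0x00400101 would thus refer to the instruction stored at
0x00400100, to be executed with the mode bit set.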

 As to the symbol size, I think it needs to be set to whatever the 
C language's `sizeof' operator would return for a unit of storage of the 
same size.
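
A small sketch of the consequence, under the assumption that the symbol
size is kept in `sizeof' units (target bytes) while section contents are
stored in the file as octets.  The function name is hypothetical and
octets_per_byte again stands for the target's byte width in octets.

```c
#include <stdint.h>

/* Hypothetical helper: number of octets a symbol occupies in the
   section contents, given its size in target bytes (what `sizeof'
   would return on the target).  */
uint64_t
symbol_size_in_octets (uint64_t st_size, unsigned int octets_per_byte)
{
  return st_size * octets_per_byte;
}
```

On a target with 32-bit bytes, an object for which `sizeof' yields 1 would
have a symbol size of 1 and occupy 4 octets in the file.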

References:

[1] "System V Application Binary Interface" - DRAFT - 10 June 2013, 
    Section "Data Representation"
<http://www.sco.com/developers/gabi/latest/ch4.intro.html#data_representation>

 HTH,

  Maciej

