This is the mail archive of the
binutils@sourceware.org
mailing list for the binutils project.
Re: ELF octets_per_byte
- From: "Maciej W. Rozycki" <macro at linux-mips dot org>
- To: dgisselq at ieee dot org
- Cc: binutils at sourceware dot org
- Date: Thu, 25 Feb 2016 01:47:12 +0000 (GMT)
- Subject: Re: ELF octets_per_byte
- References: <1456242622 dot 30661 dot 448 dot camel at jericho>
On Tue, 23 Feb 2016, Dan wrote:
> For the purpose of beginning a discussion, and based upon a reading of
> the ELF specification, I propose the following values be in units of
> "octets":
>
> section size
> section header size
> section header offset
>
> For the most part, these values *must* be in octets, or it will be
> impossible to read and process an ELF file.
These express structures in a file as seen on a storage medium. I think
this pretty much mandates that they are expressed in octets, as file
structure representation has to be consistent among ELF targets so that
files can be handled in a portable manner, as you have correctly observed.
Please also note that the ELF gABI[1] is very explicit about a byte being
8 bits wide:
"As described here, the object file format supports various processors
with 8-bit bytes and either 32-bit or 64-bit architectures.
Nevertheless, it is intended to be extensible to larger (or smaller)
architectures. Object files therefore represent some control data with a
machine-independent format, making it possible to identify object files
and interpret their contents in a common way. Remaining data in an object
file use the encoding of the target processor, regardless of the machine
on which the file was created."
so whenever it refers to a "byte" I think it really means an octet,
although I do see an ambiguity here as sometimes it uses the term to mean
a target byte.
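
To make the file-side units concrete, here is a minimal sketch that packs
and unpacks an ELF32 little-endian file header using the field order given
in the gABI. The helper names (`parse_elf32_header`,
`section_header_offset`) and all sample values are made up for
illustration; they are not part of binutils. The point is only that
e_shoff, e_ehsize and e_shentsize are used directly as octet counts into
the file, whatever the target's byte width is.

```python
import struct

# ELF32 file header, little-endian, field order as in the gABI:
# e_ident[16], e_type, e_machine, e_version, e_entry, e_phoff,
# e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize,
# e_shnum, e_shstrndx
ELF32_EHDR = struct.Struct("<16sHHIIIIIHHHHHH")

def parse_elf32_header(data):
    """Hypothetical helper: return the file-layout fields, all in octets."""
    fields = ELF32_EHDR.unpack_from(data, 0)
    return {
        "e_shoff": fields[6],       # octet offset of the section header table
        "e_ehsize": fields[8],      # size of this header, in octets
        "e_shentsize": fields[11],  # size of one section header, in octets
        "e_shnum": fields[12],      # number of section headers
    }

def section_header_offset(hdr, index):
    """Octet offset in the file of section header number `index`."""
    return hdr["e_shoff"] + index * hdr["e_shentsize"]

# Made-up sample: 52-octet ELF header, section headers start at octet 0x200.
ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
sample = ELF32_EHDR.pack(ident, 2, 0, 1, 0x1000, 52, 0x200, 0,
                         52, 32, 1, 40, 3, 2)
hdr = parse_elf32_header(sample)
```

A reader can seek to `section_header_offset(hdr, i)` in the file without
knowing anything about the target processor, which is what keeps the
format portable.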
> I also propose that the following values are in units of target address
> space "bytes":
>
> ELF header "entry" address
> section header address
> symbol value
> symbol size
> relocation offset
> relocation addend
These express target addresses or are directly related to them (e.g.
offsets) and therefore I'm sure they're best expressed in whatever format
your target uses. These IMHO certainly qualify as "remaining data"
referred to in the gABI citation included above.
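
As a rough sketch of what this split implies for a tool, a relocation
offset kept in target bytes has to be scaled before it can index the
section contents as stored in the file. The function name and the
`octets_per_byte` parameter below are assumptions for illustration, as is
the example target with 32-bit bytes.

```python
def reloc_offset_to_octets(r_offset, octets_per_byte):
    """Map a relocation offset in target bytes to an octet index
    into the section contents as stored in the file."""
    if r_offset < 0 or octets_per_byte < 1:
        raise ValueError("offset must be >= 0 and octets_per_byte >= 1")
    return r_offset * octets_per_byte

# Hypothetical target with 32-bit bytes, i.e. 4 octets per target byte:
# target byte 3 of a section starts at octet 12 of its file contents.
patch_at = reloc_offset_to_octets(3, 4)
```

On a conventional target with 8-bit bytes the factor is 1 and the two
units coincide.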
So with the entry point, for example, I'd expect whatever representation a
function pointer stored in memory would have on your target if the
function pointed to were the intended entry point. Likewise with VMAs and
LMAs used in program headers, section headers, symbol tables, etc.
These do not necessarily even have to be "proper" memory addresses; for
example, the MIPS processor encodes the execution mode in bit #0 of code
addresses, so in certain cases the entry point in MIPS ELF binaries will
have bit #0 set even though the memory location referred to will have this
bit clear. So it's really up to you to decide which encoding is the
most appropriate for your architecture.
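
A minimal sketch of the MIPS case, with a made-up address and a
hypothetical helper name: the value stored in the ELF file carries the
mode in bit #0, and a consumer masks it off to get the memory location.

```python
ISA_MODE_BIT = 0x1  # bit #0 of a MIPS code address carries the execution mode

def split_mips_code_address(addr):
    """Return (memory_address, mode_bit) for an encoded code address."""
    return addr & ~ISA_MODE_BIT, addr & ISA_MODE_BIT

# Made-up entry point value with bit #0 set.
location, mode = split_mips_code_address(0x00400121)
```

Here `location` is 0x00400120 and `mode` is 1; an address with bit #0
clear comes back unchanged with a mode of 0.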
As to the symbol size, I think it needs to be set to whatever the
C language's `sizeof' operator would return for a unit of storage of the
same size.
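
Since `sizeof' counts in units of the target's `char', this suggestion
amounts to dividing the storage occupied in octets by the number of
octets per target byte. A small sketch under that reading; the function
name, the parameter names and the sample sizes are assumptions, not
binutils code.

```python
def symbol_size(storage_octets, octets_per_byte):
    """Size in target bytes of an object occupying `storage_octets` octets."""
    size, remainder = divmod(storage_octets, octets_per_byte)
    if remainder:
        raise ValueError("object is not a whole number of target bytes")
    return size

# An object occupying 8 octets of storage:
size_on_8bit_byte_target = symbol_size(8, 1)   # conventional target
size_on_32bit_byte_target = symbol_size(8, 4)  # hypothetical 32-bit-byte target
```

The same 8 octets of storage would thus be recorded as a size of 8 on a
conventional target and 2 on the hypothetical one.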
References:
[1] "System V Application Binary Interface" - DRAFT - 10 June 2013,
Section "Data Representation"
<http://www.sco.com/developers/gabi/latest/ch4.intro.html#data_representation>
HTH,
Maciej