This is the mail archive of the binutils@sources.redhat.com mailing list for the binutils project.



Re: Advice needed on when to synthesize <sym>.high_bound in ld


Nick Clifton <nickc@redhat.com> writes:

> Hmm, I am not entirely sure that synthesising new symbols is the right
> way to go. As I see it there are a couple of problems:
> 
>   * They will grow the symbol table.  Although presumably enabling
>     bounded pointer checking significantly grows the code being
>     produced by gcc, so the extra overhead of more symbols in the
>     symbol table is probably not that important.

Right now, code bloat runs 100%..200% (2x..3x).  Once gcc gets a
couple of new optimizations, bloat will drop to around 50% (1.5x).  Also,
high-bound symbols are only required for references to extern arrays
and variables with incomplete struct types, which are not the normal
case, so the symbol table should only grow by a small percentage of
the total data+bss symbol count.  So, in short, no, I don't expect that
symbol-table bloat will be significant.

>   * It "feels" wrong.

I agree, now that you mention it.

>     Presumably foo.high_bound is meant to refer to
>     the upper address of the space pointed to by symbol foo, so that
>     you can generate code like:
> 
> 	if (ptr > foo.high_bound)
>             bound_check_abort ();
> 
>     foo.high_bound is not really a symbol in its own right, but rather
>     a property of foo.

Yes, on all counts.

>     I think that the right way therefore would be
>     to express foo.high_bound as a reloc based on the symbol foo.

Fine.

>     With this scheme the compiler would generate code like this:
> 
>        if (ptr > _high_bound (foo))
>          bound_check_abort ();

I understand that this is probably pseudo code, since you can't really
mean that _high_bound is a function.  FYI, this could be expressed in
concrete code like so:

	if (ptr >= __ptrhigh &foo)
	  bound_check_abort ();

>        .word  _high_bound (foo)

This looks as though you're allocating initialized data with the value
of the high bound.  That shouldn't be necessary.

>     Where _high_bound() is an assembler pseudo op to generate a
>     R_BFD_HIGH_BOUND reloc for the named symbol.

Cool.  Is there precedent for this syntax?  The gas manual lists
only line-oriented `.<op>' style pseudos.
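
As I read it, the reloc idea quoted above amounts to roughly the
following on the linker side.  This is a toy sketch, not BFD code: the
structs, `apply_reloc', `R_PLAIN_ADDR' and the numeric reloc values are
all invented for illustration; only the name R_BFD_HIGH_BOUND comes
from the proposal, and "address plus size" is my assumption about what
it would compute.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reloc kinds.  Only the name R_BFD_HIGH_BOUND comes from
   the proposal; the numeric values are invented.  */
enum reloc_type { R_PLAIN_ADDR = 1, R_BFD_HIGH_BOUND = 2 };

/* Hypothetical, much simplified symbol: an address and a size.  */
struct reloc_sym
{
  unsigned long value;
  unsigned long size;
};

struct reloc
{
  size_t offset;                /* where in the section to patch */
  enum reloc_type type;
  const struct reloc_sym *sym;  /* symbol the reloc refers to */
};

/* Patch one word of SECTION as directed by R.  */
static void
apply_reloc (unsigned char *section, size_t section_size,
             const struct reloc *r)
{
  unsigned long word = r->sym->value;

  assert (r->offset + sizeof word <= section_size);
  if (r->type == R_BFD_HIGH_BOUND)
    word += r->sym->size;  /* a property of the symbol, not a new symbol */
  memcpy (section + r->offset, &word, sizeof word);
}
```

The point of the sketch is that no extra symbol-table entry is needed:
the high bound is derived from the existing symbol when the reloc is
applied.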

>     Of course using relocs would involve changing both the assembler
>     and linker (and probably involve target specific changes in almost
>     all of the ELF based GAS targets) which would mean more
>     opportunities for bugs to creep in, so you would have to consider
>     whether this approach is worthwhile.

It's a tough call.  I like the high_bound pseudo-op idea as
conceptually cleaner, but it touches more programs.  I don't like
faking symbols, but it's less trouble since only LD is involved.
Since you seem to appreciate the tradeoffs and haven't expressed a
strong preference for either one, I guess I'll just pick the
easiest-to-implement one (fake `*.gnu_bp_high_bound' symbols).  If
there's a compelling reason to shift to the _high_bound pseudo op we can
always do that later and phase out the fake symbols after a binutils
release cycle or two.  OK?
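
For the record, the fake-symbol scheme boils down to something like the
sketch below.  It is not actual ld code: the `sym' struct and
`synthesize_high_bound' are hypothetical, and it assumes the
synthesized value is simply the base symbol's address plus its size.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, much simplified symbol record; a real linker symbol
   table carries far more than this.  */
struct sym
{
  const char *name;
  unsigned long value;  /* address */
  unsigned long size;   /* size in bytes */
};

#define HIGH_BOUND_SUFFIX ".gnu_bp_high_bound"

/* If NAME is `<base>.gnu_bp_high_bound' and <base> is in TABLE, store
   the base symbol's address plus size in *RESULT and return 1.
   Otherwise leave *RESULT alone and return 0.  */
static int
synthesize_high_bound (const struct sym *table, size_t count,
                       const char *name, unsigned long *result)
{
  size_t name_len = strlen (name);
  size_t suffix_len = strlen (HIGH_BOUND_SUFFIX);
  size_t base_len, i;

  assert (result != NULL);
  if (name_len <= suffix_len
      || strcmp (name + name_len - suffix_len, HIGH_BOUND_SUFFIX) != 0)
    return 0;

  base_len = name_len - suffix_len;
  for (i = 0; i < count; i++)
    if (strlen (table[i].name) == base_len
        && strncmp (table[i].name, name, base_len) == 0)
      {
        *result = table[i].value + table[i].size;
        return 1;
      }
  return 0;
}
```

So an otherwise undefined reference to `foo.gnu_bp_high_bound' would
resolve only when `foo' itself is defined, and only the linker needs to
know about the suffix.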

> ...  Also I would recommend changing the name from foo.high_bound to
> something that makes it more obvious that this is a GNU extension,
> eg foo.gnu_bp_high_bound.

Very good.  I had wanted an answer to that specific question.

Thanks for your help,
Greg
