This is the mail archive of the
binutils@sources.redhat.com
mailing list for the binutils project.
Re: Advice needed on when to synthesize <sym>.high_bound in ld
- To: Nick Clifton <nickc at redhat dot com>
- Subject: Re: Advice needed on when to synthesize <sym>.high_bound in ld
- From: Greg McGary <greg at mcgary dot org>
- Date: 31 Aug 2000 15:17:21 -0700
- Cc: binutils at sources dot redhat dot com
- References: <200008302344.QAA05515@elmo.cygnus.com>
Nick Clifton <nickc@redhat.com> writes:
> Hmm, I am not entirely sure that synthesising new symbols is the right
> way to go. As I see it there are a couple of problems:
>
> * They will grow the symbol table. Although presumably enabling
> bounded pointer checking significantly grows the code being
> produced by gcc, so the extra overhead of more symbols in the
> symbol table is probably not that important.
Right now, code bloat runs 100%..200% (2x..3x). Once gcc gets a
couple of new optimizations, bloat will drop to around 50% (1.5x). Also,
high-bound symbols are only required for references to extern arrays
and variables with incomplete struct types, which are not the normal
case, so the symbol table should only grow by a small percentage of
the total data+bss symbol count. So, in short, no, I don't expect that
symbol-table bloat will be significant.
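To illustrate which references are affected, here is a minimal C sketch.
The names `known`, `unknown`, `opaque`, `handle` and `known_high_bound`
are hypothetical and chosen only for this example. The point it tries to
show is that a complete type lets the compiler compute the high bound
itself, while an incomplete type leaves the size to be supplied at link
time.

```c
#include <assert.h>
#include <stddef.h>

/* Complete type: the size is visible at the point of reference, so the
   compiler can compute the high bound without any help from the linker. */
extern int known[16];

/* Incomplete types: the size is fixed only where the object is defined,
   so a reference from here cannot compute the high bound on its own. */
extern int unknown[];          /* array of unknown length */
struct opaque;                 /* incomplete struct type */
extern struct opaque handle;

/* High bound of `known`: the address one past its last byte. */
const char *known_high_bound(void)
{
    const char *high = (const char *) known + sizeof known;
    assert(high == (const char *) &known[16]);
    return high;
}

/* At this point `sizeof unknown` and `sizeof handle` would not compile,
   which is why these references would need a high bound from elsewhere. */

/* Definitions. In a real program these would normally live in another
   translation unit; they are here only so the sketch links on its own. */
int known[16];
int unknown[8];
struct opaque { int a; int b; };
struct opaque handle;
```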
> * It "feels" wrong.
I agree, now that you mention it.
> Presumably foo.high_bound is meant to refer to
> the upper address of the space pointed to by symbol foo, so that
> you can generate code like:
>
>     if (ptr > foo.high_bound)
>       bound_check_abort ();
>
> foo.high_bound is not really a symbol in its own right, but rather
> a property of foo.
Yes, on all counts.
> I think that the right way therefore would be
> to express foo.high_bound as a reloc based on the symbol foo.
Fine.
> With this scheme the compiler would generate code like this:
>
>     if (ptr > _high_bound (foo))
>       bound_check_abort ();
I understand that this is probably pseudo code, since you can't really
mean that _high_bound is a function. FYI, this could be expressed in
concrete code like so:
    if (ptr >= __ptrhigh &foo)
      bound_check_abort ();
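The check can be emulated in standard C without the `__ptrhigh` operator.
The sketch below is only an approximation: `struct bounded_ptr` and the
`bp_*` helpers are hypothetical names, and it assumes the high bound is
the address one past the last byte of the object, which is why the
rejecting test is `>=` rather than `>`. Addresses are held as `uintptr_t`
so that an out-of-range value can be formed and compared safely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a bounded pointer: a value plus the bounds
   of the object it was derived from. */
struct bounded_ptr {
    uintptr_t value;  /* current address */
    uintptr_t low;    /* first byte of the object */
    uintptr_t high;   /* one past the last byte of the object */
};

/* Build a bounded pointer to the start of an object of `size` bytes. */
struct bounded_ptr bp_make(const void *obj, size_t size)
{
    struct bounded_ptr p;
    assert(obj != NULL);
    p.low = (uintptr_t) obj;
    p.high = p.low + size;
    p.value = p.low;
    return p;
}

/* Advance the pointer by `nbytes` without touching the bounds. */
struct bounded_ptr bp_add(struct bounded_ptr p, size_t nbytes)
{
    p.value += nbytes;
    return p;
}

/* Nonzero if the pointer may be dereferenced: low <= value < high. */
int bp_ok(struct bounded_ptr p)
{
    return p.value >= p.low && p.value < p.high;
}
```

With `int foo[4]` and `p = bp_make (foo, sizeof foo)`, advancing `p` by
`sizeof foo` bytes lands exactly on the high bound, and `bp_ok` rejects it.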
> .word _high_bound (foo)
This looks as though you're allocating initialized data with the value
of the high bound. That shouldn't be necessary.
> Where _high_bound() is an assembler pseudo op to generate a
> R_BFD_HIGH_BOUND reloc for the named symbol.
Cool. Is there precedent for this syntax? The gas manual lists
only line-oriented `.<op>' style pseudo-ops.
> Of course using relocs would involve changing both the assembler
> and linker (and probably involve target specific changes in almost
> all of the ELF based GAS targets) which would mean more
> opportunities for bugs to creep in, so you would have to consider
> whether this approach is worthwhile.
It's a tough call. I like the high_bound pseudo-op idea as
conceptually cleaner, but it touches more programs. I don't like
faking symbols, but it's less trouble since only LD is involved.
Since you seem to appreciate the tradeoffs and haven't expressed a
strong preference for either one, I guess I'll just pick the
easiest-to-implement one (fake `*.gnu_bp_high_bound' symbols). If
there's a compelling reason to shift to the _high_bound pseudo-op we can
always do that later and phase out the fake symbols after a binutils
release cycle or two. OK?
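As a rough illustration of the fake-symbol approach, the sketch below
derives a `foo.gnu_bp_high_bound' entry from an existing symbol. It is
not ld code: `struct toy_symbol` and `synthesize_high_bound` are
hypothetical, and it assumes the high bound is simply the symbol's
address plus its size.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define HIGH_BOUND_SUFFIX ".gnu_bp_high_bound"

/* Toy symbol-table entry; a real linker's structures differ. */
struct toy_symbol {
    char name[64];
    unsigned long value;  /* address of the symbol */
    unsigned long size;   /* size of the object in bytes */
};

/* Fill `out` with the synthesized high-bound symbol for `base`.
   Returns 0 on success, -1 if the combined name does not fit. */
int synthesize_high_bound(const struct toy_symbol *base,
                          struct toy_symbol *out)
{
    assert(base != NULL && out != NULL);

    int n = snprintf(out->name, sizeof out->name, "%s%s",
                     base->name, HIGH_BOUND_SUFFIX);
    if (n < 0 || (size_t) n >= sizeof out->name)
        return -1;

    out->value = base->value + base->size;  /* one past the object */
    out->size = 0;                          /* a bare address, no extent */
    return 0;
}
```

For a symbol `foo' at 0x1000 with size 40, this yields
`foo.gnu_bp_high_bound' with value 0x1028.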
> ... Also I would recommend changing the name from foo.high_bound to
> something that makes it more obvious that this is a GNU extension,
> eg foo.gnu_bp_high_bound.
Very good. I had wanted an answer to that specific question.
Thanks for your help,
Greg