This is the mail archive of the binutils@sources.redhat.com mailing list for the binutils project.



Re: Fix the .align bug with unwind info


In case it isn't clear to others, we need to defer emitting unwind info
until after relaxation.  Otherwise, we cannot correctly compute it in
some cases, e.g. in the presence of .align directives in the code stream
for aligning branches.

HJ is proposing putting the unwind info into a frag, which seems to work
nicely, except for the problem of estimating the size of the unwind
info.  We can't compute the size until after relaxation, and the worst
case size before relaxation is the size of the address space, which is
not a useful estimate for a frag size.  So we need to make a good
practical estimate on the maximum size, or we need to look for another
solution.

Using a variant frag is how the dwarf2 line number info solves the exact
same problem, which is why we are looking at it here.

On Mon, 2003-12-22 at 12:16, H. J. Lu wrote:
> If we have to estimate the size anyway, we don't need to add a bunch
> of new variant frags. We just initialize the variant frag with a
> reasonable size. The only thing which needs a limit is imask. What is
> its reasonable limit for prologue rlen?

I was thinking we could compute a theoretical limit because there is a
limit on how many registers we can save.  Worst case, we save 100 grs,
20 frs, 5 branch regs, and some misc regs like pfs, rp, lc, unat, and
the predicate registers.  Call that 130 registers, double it to account
for worst case inefficient address arithmetic and/or padding nops, and
we have 260 instructions.
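
As a quick check of that arithmetic, here is a small Python sketch.  The
counts are the ones given above; the count of 5 for the misc registers
(pfs, rp, lc, unat, predicates) is an assumption chosen so that the total
matches the "call that 130" figure, and the names are hypothetical.

```python
# Worst-case register saves in a prologue, using the counts from the text.
SAVED_REGISTERS = {
    "general (grs)": 100,
    "floating-point (frs)": 20,
    "branch": 5,
    # Assumption: pfs, rp, lc, unat and the predicates counted as 5.
    "misc": 5,
}

# Doubling allows for inefficient address arithmetic and padding nops.
PADDING_FACTOR = 2


def worst_case_prologue_insns(saved=SAVED_REGISTERS, factor=PADDING_FACTOR):
    """Return the estimated upper bound on prologue length in instructions."""
    return factor * sum(saved.values())


if __name__ == "__main__":
    print(sum(SAVED_REGISTERS.values()), worst_case_prologue_insns())
```

This only restates the estimate; it says nothing about what the
instruction scheduler may later move into the prologue.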

However, it seems that instruction scheduling makes this more
complicated.  Running readelf -u on all files in /usr/bin on a Debian
system, I see that the largest prologue is 288 instructions.  This one does
not save very many registers, and does not use an imask, but it appears
that instruction scheduling moved the rp register save into the
following block, making the prologue appear much larger than it is.  The
largest one that uses an imask is 160 instructions, and again, there was
movement of instructions into the prologue by the instruction
scheduler.  We might need to limit scheduling of the prologue to make
this work, which would be unfortunate.  Gcc doesn't provide any good way
to limit prologue scheduling without effectively disallowing any
scheduling at all.  Maybe we can be a bit more intelligent about the
unwind info that gcc emits?  I haven't looked into this.

I see quite a few examples that save all registers other than the 96
local grs, which means 34 registers.  This is probably due to the setjmp
register saving problem that I recently fixed.  This can be done in as
few as 65 instructions, and it is almost always done in fewer than 80
instructions, so a factor of 2 seems a reasonable margin.  The minor
differences here are presumably the result of instruction scheduling
moving a few instructions into the prologue.

Maybe we should ask the question here of whether we ever need to
estimate the size of a prologue region.  Instruction scheduling will
never move a branch into a prologue region, so it is probably the case
that we will never have to defer an imask size calculation until after
relaxation.  To be safe, we could give an error if we detect such a
case.  How about passing another argument to slot_index, which is the
unwind record type, and if the type is prologue or prologue_gr and there
is any kind of variable space allocation, we give an error?

Then we only have to worry about estimating the size of body region
lengths.  Each is a leb128-encoded slot count of at most (address
space / 16 * 3), i.e. three slots per 16-byte bundle, which is 9 bytes
worst case I believe.  There are only a small number of unwind records
that are variable size in this scenario, so it shouldn't be a problem to
assume worst case sizes for them.  This would allow us to handle
rs_space and rs_org properly, though I doubt that this is very
important.
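
The 9 byte figure can be checked with a short Python sketch.  It assumes
a 64-bit address space, 16-byte bundles holding 3 instruction slots each,
and the standard unsigned LEB128 encoding of 7 payload bits per byte; the
function and constant names are made up for illustration.

```python
ADDRESS_SPACE = 2 ** 64   # assumption: full 64-bit address space
BUNDLE_BYTES = 16         # one bundle is 16 bytes
SLOTS_PER_BUNDLE = 3      # each bundle holds 3 instruction slots


def uleb128_size(value):
    """Return the number of bytes needed to encode value as unsigned LEB128."""
    if value < 0:
        raise ValueError("unsigned LEB128 cannot encode negative values")
    size = 1
    # Each byte carries 7 bits of payload; keep going while bits remain.
    while value >= 0x80:
        value >>= 7
        size += 1
    return size


# Largest region length, in slots, that could ever need encoding.
MAX_SLOTS = ADDRESS_SPACE // BUNDLE_BYTES * SLOTS_PER_BUNDLE

if __name__ == "__main__":
    print(uleb128_size(MAX_SLOTS))
```

The maximum slot count is 3 * 2^60, which needs 62 bits, and 62 bits at
7 bits per byte rounds up to 9 bytes.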

I think this can fail if we have a second prologue section that occurs
after the first body.  There might be legitimate reasons for this, for
instance describing optimized tail calls.  We wouldn't need any records
with an imask in this case, though; we would only need the prologue
record.  We would need to handle this.  Maybe we need something a little
more complicated where slot_index sets a status indicator if it sees a
variable space allocation, and then we give an error only if we try to
emit a record that needs an imask field after a variable space
allocation has been seen.
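
To make the status indicator idea concrete, here is a rough Python
sketch of the bookkeeping.  It is not gas code: the class, the method
names, and the exception are hypothetical, and the set of frag types
treated as variable space allocation is only an assumption for
illustration.

```python
# Assumption: frag kinds whose final size is unknown until relaxation.
VARIABLE_FRAGS = frozenset({"rs_align", "rs_space", "rs_org"})


class VariableSpaceError(Exception):
    """Raised when an imask would have to be sized before relaxation."""


class UnwindState:
    """Tracks whether a variable space allocation has been seen so far."""

    def __init__(self):
        self.seen_variable_space = False

    def note_frag(self, frag_type):
        # Stands in for slot_index noticing a variable-size frag.
        if frag_type in VARIABLE_FRAGS:
            self.seen_variable_space = True

    def check_record(self, record_type, needs_imask):
        # Only records carrying an imask field are rejected; a plain
        # prologue record after the first body is still allowed.
        if needs_imask and self.seen_variable_space:
            raise VariableSpaceError(
                f"{record_type}: imask needed after variable space allocation"
            )
        return True
```

With this shape, a second prologue record after an .align is accepted,
and only an attempt to emit an imask at that point is diagnosed.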
-- 
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com

