This is the mail archive of the
mailing list for the binutils project.
Re: [1/9][RFC][DWARF] Reserve three DW_OP numbers in vendor extension space
On 15/11/16 16:18, Jakub Jelinek wrote:
On Tue, Nov 15, 2016 at 04:00:40PM +0000, Jiong Wang wrote:
Takes one signed LEB128 offset and retrieves the 8-byte contents from the address
calculated by CFA plus this offset; the contents are then authenticated as per the A
key for the instruction pointer, using the current CFA as salt. The result is pushed
onto the stack.
I'd like to point out that especially the vendor range of DW_OP_* is an
extremely scarce resource; we have only a couple of unused values, so taking
3 out of the remaining unused 12 for a single architecture is IMHO too much.
Can't you use just a single opcode and encode which of the 3 operations it is
in say the low 2 bits of a LEB 128 operand?
We'll likely need to do RSN some multiplexing even for the generic GNU
opcodes if we need just a few further ones (say 0xff as an extension,
followed by uleb128 containing the opcode - 0xff).
In the non-vendor area we still have 54 values left, so there is more space
for future expansion.
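The multiplexing idea above (0xff as an escape, followed by a uleb128 holding "the opcode - 0xff") can be sketched in a few lines. This is only an illustration of the proposal as worded in this thread, not an agreed encoding; the helper names `decode_uleb128` and `decode_opcode` are made up for the example.

```python
DW_OP_HI_USER = 0xFF  # last value of the DWARF vendor range (0xe0..0xff)


def decode_uleb128(data, pos=0):
    """Decode one unsigned LEB128 value; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):  # high bit clear: last byte of the value
            return result, pos
        shift += 7


def decode_opcode(data, pos=0):
    """Return (logical_opcode, next_pos).

    Assumed scheme: a plain byte is the opcode itself; 0xff is an escape
    and the following uleb128 holds (logical_opcode - 0xff).
    """
    op = data[pos]
    pos += 1
    if op != DW_OP_HI_USER:
        return op, pos
    ext, pos = decode_uleb128(data, pos)
    return DW_OP_HI_USER + ext, pos
```

With this scheme every extended opcode costs at least two bytes, which is the size trade-off discussed in the rest of the thread.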
Separate DWARF operations are introduced, instead of combining all of them into
one, mostly because these operations are going to be used for most
functions once return address signing is enabled. They are used for
describing frame unwinding, so they will go into the unwind table for C++ programs
or C programs compiled with -fexceptions, and the impact on unwind table size is
significant. So I was trying to lower the unwind table size overhead as much as
IMHO, three numbers is actually not that much for one architecture in the DWARF
operation vendor extension space, as vendors can overlap with each other. The
only painful thing, from my understanding, is that there are platform vendors, for example
"GNU" and "LLVM" etc., with which architecture vendors can't overlap.
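To make the size argument concrete, here is a rough comparison of the two encodings under discussion: one opcode per operation versus a single opcode whose SLEB128 operand carries the operation selector in its low 2 bits. The opcode value 0xea and the selector numbers are placeholders chosen for the example, not assigned values.

```python
def encode_sleb128(value):
    """Encode a signed integer as SLEB128 bytes."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7  # arithmetic shift, keeps the sign
        sign_bit = byte & 0x40
        if (value == 0 and not sign_bit) or (value == -1 and sign_bit):
            out.append(byte)
            return bytes(out)
        out.append(byte | 0x80)


def separate_encoding(opcode, offset=None):
    """One opcode per operation; the operand is present only if needed."""
    body = bytes([opcode])
    if offset is not None:
        body += encode_sleb128(offset)
    return body


def multiplexed_encoding(opcode, selector, offset=0):
    """Single opcode; selector (0..3) packed into the operand's low 2 bits."""
    return bytes([opcode]) + encode_sleb128((offset << 2) | selector)
```

In this sketch an operand-less operation grows from 1 byte to 2, and an offset outside -16..15 no longer fits in a single operand byte once two bits are taken by the selector.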
For DW_OP_*, there aren't two vendor ranges like e.g. in ELF; there is just
one range, so ideally the opcodes would be unique everywhere. If not, there
is just a single GNU vendor; there is no separate range for AArch64 that
can overlap with the range for x86_64, powerpc, etc.
Perhaps we could declare that certain opcode subrange for the GNU vendor is
architecture specific and document that the meaning of opcodes in that range
and count/encoding of their arguments depends on the architecture, but then
we should document how to figure out the architecture too (e.g. for ELF
base it on the containing EM_*). All the tools that look at DWARF (readelf,
objdump, eu-readelf, libdw, libunwind, gdb, dwz, ...) would need to agree on that
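As an illustration of what an architecture-specific subrange would mean for a consumer, the lookup below keys the meaning of an opcode on the ELF e_machine value. The EM_* numbers are the standard ELF ones; the subrange 0xea..0xec and the opcode assignments are purely hypothetical, and only the two operation names that appear in this thread are listed.

```python
EM_X86_64 = 62    # standard ELF e_machine values
EM_AARCH64 = 183

# Hypothetical subrange of the vendor space whose meaning depends on EM_*.
ARCH_SPECIFIC_RANGE = range(0xEA, 0xED)

# Per-architecture tables: opcode -> (name, number of operands).
# The opcode numbers are placeholders for this sketch.
ARCH_OPS = {
    EM_AARCH64: {
        0xEA: ("DW_OP_AARCH64_paciasp", 0),
        0xEB: ("DW_OP_AARCH64_paciasp_deref", 1),
    },
}


def describe_op(e_machine, opcode):
    """Return (name, operand_count) for an arch-specific opcode.

    Returns None when the opcode is outside the arch-specific subrange,
    and ("unknown", None) when this architecture does not define it.
    """
    if opcode not in ARCH_SPECIFIC_RANGE:
        return None
    table = ARCH_OPS.get(e_machine, {})
    return table.get(opcode, ("unknown", None))
```

The point of the sketch is that the same byte decodes differently, or not at all, depending on the containing file, so every tool needs the same table.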
I know nothing about the aarch64 return address signing, would all 3 or say
2 usually appear together without any separate pc advance, or are they all
going to appear frequently and at different pcs?
I think it's the latter: DW_OP_AARCH64_paciasp and
DW_OP_AARCH64_paciasp_deref are going to appear frequently and at different pcs.
For example, the following function prologue has three instructions,
at 0x0, 0x4 and 0x8.
After the first instruction at 0x0, LR/X30 will be mangled. "paciasp" always
mangles the LR register using SP as salt and writes the value back into LR. We then generate
DW_OP_AARCH64_paciasp to notify any unwinder that the original LR is mangled in this
way, so it can recover the original value properly.
After the second instruction at 0x4, the mangled value of LR/X30 will be pushed
onto the stack. Unlike the usual .cfi_offset, the unwind rule for LR/X30 becomes: first fetch the
mangled value from stack offset -16, then do whatever is needed to restore the original value
from the mangled value. This is represented by (DW_OP_AARCH64_paciasp_deref, offset).
0x0  paciasp (this instruction signs the return address register LR/X30)
       .cfi_val_expression 30, DW_OP_AARCH64_paciasp
0x4  stp x29, x30, [sp, -32]!
       .cfi_val_expression 30, DW_OP_AARCH64_paciasp_deref, -16
       .cfi_offset 29, -32
0x8  add x29, sp, 0
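A toy model of how an unwinder might apply the rules from the prologue above, depending on which pc it stopped at. Everything here is illustrative: `toy_sign` and `toy_auth` stand in for the real pointer authentication (they are not the actual algorithm), the rule table simply mirrors the listing, and the CFA is used as the salt as described for the proposed operations.

```python
PAC_SHIFT = 48
ADDR_MASK = (1 << PAC_SHIFT) - 1


def toy_sign(addr, salt):
    """Stand-in for paciasp: put a fake 8-bit code above bit 48."""
    pac = (addr ^ salt) & 0xFF  # NOT the real PAC computation
    return addr | (pac << PAC_SHIFT)


def toy_auth(signed, salt):
    """Stand-in for authentication: check the fake code and strip it."""
    addr = signed & ADDR_MASK
    if toy_sign(addr, salt) != signed:
        raise ValueError("authentication failed")
    return addr


# (first pc the rule applies to, rule) in the order of the listing above.
LR_RULES = [
    (0x0, ("same_value",)),          # before paciasp has executed
    (0x4, ("paciasp",)),             # LR is signed, still in the register
    (0x8, ("paciasp_deref", -16)),   # signed LR was stored relative to CFA
]


def recover_lr(pc, lr_reg, cfa, memory):
    """Return the caller's return address for a frame stopped at pc."""
    rule = LR_RULES[0][1]
    for start, candidate in LR_RULES:
        if pc >= start:
            rule = candidate
    if rule[0] == "same_value":
        return lr_reg
    if rule[0] == "paciasp":
        return toy_auth(lr_reg, cfa)
    return toy_auth(memory[cfa + rule[1]], cfa)
```

The three rules appear at three different pcs, which is why each one ends up as a separate entry in the unwind table.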
Perhaps if there is just 1
opcode and has all the info encoded just in one bigger uleb128 or something