This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.


Re: RFC: Prevent disassembly beyond symbolic boundaries

From: Nicholas Clifton <nickc at redhat dot com>
To: Tristan Gingold <gingold at adacore dot com>
Cc: binutils at sourceware dot org, gdb-patches at sourceware dot org
Date: Fri, 19 Jun 2015 12:41:34 +0100
Subject: Re: RFC: Prevent disassembly beyond symbolic boundaries
Authentication-results: sourceware.org; auth=none
References: <87lhfhynoz dot fsf at redhat dot com> <3D81F97D-90EA-4769-8381-514BB6E81E3F at adacore dot com>

Hi Tristan,

  This will disassemble as:

    0000000000000000 <foo>:
       0:   24 2f                   and    $0x2f,%al
       2:   83 0f ba                orl    $0xffffffba,(%rdi)

    0000000000000003 <bar>:
       3:   0f ba e2 03             bt     $0x3,%edx

  Note how the instruction decoded at address 0x2 has stolen two bytes
  from "bar", but these bytes are also decoded (correctly this time) as
  part of the first instruction of bar.

I am curious.  Why do you think it was a problem ?

Strangely enough, this actually causes regressions with the perf tool's
testsuite:

  https://bugzilla.redhat.com/show_bug.cgi?id=1054767

What happens is that perf test 21 runs objdump on a binary, *parses*
this output and compares that to the actual bytes in the binary.
Because of the overrun feature shown above you actually get more bytes
displayed in objdump's output than actually exist in the binary and so
the perf test fails.
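To illustrate the mismatch, here is a minimal Python sketch. It is not the perf test itself; the regular expression and the helper names are assumptions made for this example, and the listing is simply the one quoted above with its tabs replaced by spaces. It collects every byte the listing displays and compares that to the number of distinct addresses:

```python
import re
from collections import Counter

# Listing copied from the example above (tabs replaced by spaces).
LISTING = """\
0000000000000000 <foo>:
   0:   24 2f                   and    $0x2f,%al
   2:   83 0f ba                orl    $0xffffffba,(%rdi)

0000000000000003 <bar>:
   3:   0f ba e2 03             bt     $0x3,%edx
"""

# An instruction line is "<hex address>:" followed by hex byte pairs.
# Symbol header lines have no colon directly after the address, so they
# do not match.
INSN_RE = re.compile(r"^\s*([0-9a-f]+):\s+((?:[0-9a-f]{2} )+)")


def displayed_bytes(listing):
    """Return (address, byte value) for every byte the listing displays."""
    shown = []
    for line in listing.splitlines():
        m = INSN_RE.match(line)
        if m is None:
            continue  # symbol header or blank line
        start = int(m.group(1), 16)
        for offset, pair in enumerate(m.group(2).split()):
            shown.append((start + offset, int(pair, 16)))
    return shown


def repeated_addresses(shown):
    """Return the sorted addresses that appear more than once."""
    counts = Counter(addr for addr, _ in shown)
    return sorted(addr for addr, n in counts.items() if n > 1)


shown = displayed_bytes(LISTING)
print(len(shown), "bytes displayed,",
      len({addr for addr, _ in shown}), "bytes in the binary")
print("addresses shown twice:", [hex(a) for a in repeated_addresses(shown)])
```

For the listing above this reports 9 displayed bytes against 7 real ones, with addresses 0x3 and 0x4 shown twice, which is the kind of discrepancy a byte-for-byte comparison trips over.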

Even if there is a symbol in the middle of an instruction, I'd like
to understand what the processor will execute.

Except that even the currently displayed disassembly is not what the
processor would execute. In the example above the processor would
execute the ORL instruction starting at address 0x2, but it would not
continue on to execute the BT instruction at address 0x3. Instead it
would start decoding from address 0x5, whatever instruction that might be...
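The arithmetic can be spelled out in a few lines of Python. This is only a sketch: the table of instruction lengths is read off the listing above by hand (it is not produced by a real decoder), and the function names are hypothetical.

```python
# Instruction lengths in bytes, read off the listing above by hand.
INSN_AT = {
    0x0: ("and", 2),  # 24 2f
    0x2: ("orl", 3),  # 83 0f ba
    0x3: ("bt", 4),   # 0f ba e2 03
}


def fall_through(addr):
    """Address of the next instruction if execution simply falls through."""
    _, length = INSN_AT[addr]
    return addr + length


def linear_path(start, end):
    """Follow fall-through from start; return the visited addresses and
    the first address for which the table has no entry (or end)."""
    path = []
    addr = start
    while addr < end and addr in INSN_AT:
        path.append(addr)
        addr = fall_through(addr)
    return path, addr


path, stopped_at = linear_path(0x0, 0x7)
print([hex(a) for a in path], "then continues at", hex(stopped_at))
```

Starting from 0x0, the path visits 0x0 and 0x2 and then continues at 0x5, so the BT at 0x3 is never reached by falling through.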

Before the proposed
change, it was possible, but after it isn't easy anymore.

True - but this only matters if the processor would execute from that
piece of memory. What if the byte(s) are actually data ? (e.g. a
constant pool). Then it would make more sense to display the bytes as
just byte values.

The point being that if there is a symbol that is in the middle of an
instruction then something hinky is going on. Either the symbol is
misplaced or the instruction is not really an instruction or else an
assembly programmer is being extra super clever and hiding data inside
instructions.

How about a tweak to the patch then ? What if the -D option
(disassemble all) disables this feature, and so the disassembled
instruction is displayed as before, whilst the -d option (disassemble
code) leaves it enabled. Then if you want to see bytes as instructions
you can use the -D option (possibly combined with -j), but if you want
to see a more plausible version, in which only real instructions are
disassembled, then use the -d option. (Obviously the patch would need
to be extended with an update to the documentation too).
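As a rough sketch of the proposed rule (this is not the patch, and the function and parameter names are invented for illustration), the number of bytes handed to the decoder at a given address could be chosen like this:

```python
def bytes_to_decode(addr, insn_len, next_symbol, disassemble_all):
    """How many bytes the instruction at addr may consume.

    addr            -- address where decoding starts
    insn_len        -- length the decoder would use if unconstrained
    next_symbol     -- address of the next symbol, or None if there is none
    disassemble_all -- True for -D (feature off), False for -d (feature on)
    """
    if disassemble_all or next_symbol is None:
        return insn_len  # decode as before, even across a symbol
    # With -d, never read past the next symbol.
    return min(insn_len, next_symbol - addr)


# The ORL at 0x2 wants 3 bytes, but "bar" starts at 0x3.
print(bytes_to_decode(0x2, 3, 0x3, disassemble_all=False))  # -d
print(bytes_to_decode(0x2, 3, 0x3, disassemble_all=True))   # -D
```

With -d only the single byte at 0x2 would belong to "foo", while -D would still decode all three bytes as before. How the leftover byte gets displayed is a separate question that this sketch does not address.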


Cheers
  Nick

Follow-Ups:
- Re: RFC: Prevent disassembly beyond symbolic boundaries
  - From: Erik Christiansen
- Re: RFC: Prevent disassembly beyond symbolic boundaries
  - From: Tristan Gingold

References:
- RFC: Prevent disassembly beyond symbolic boundaries
  - From: Nick Clifton
- Re: RFC: Prevent disassembly beyond symbolic boundaries
  - From: Tristan Gingold
