[Bug gdb/22869] set architecture i8086 has no effect on disassembly

Tue Mar 19 17:06:00 GMT 2019

https://sourceware.org/bugzilla/show_bug.cgi?id=22869

--- Comment #5 from mat.matans at gmail dot com ---
I encountered this issue a few days ago and started investigating. This is what
I could come up with:

Disclaimer: I only tested this with recent versions of gdb (8+) against qemu; I
haven't tested it against a real machine running in real mode. I assume this is
the most common configuration nowadays.

The issue isn't purely gdb's fault; it's a combination of gdb's behavior and
qemu (or the target in general) declaring itself as i386. Specifically, when you
attach to qemu's gdbserver, it provides a qXfer response for the i386
architecture, even though the CPU starts in real mode. The response looks
something like this:

>> <?xml version="1.0"?>
>> <!DOCTYPE target SYSTEM "gdb-target.dtd">
>> <target>
>> <architecture>i386</architecture>
>> <xi:include href="i386-32bit.xml"/>
>> </target>
The important part is the architecture tag.

Normally you'd expect "set architecture ..." to overrule the target
information, but i8086 and i386 are considered compatible with each other, as
explained next.

The 'bfd_arch_info's for i8086 and i386 (bfd/cpu-i386.c) both use
'bfd_i386_compatible' as the 'compatible' function, which is a light wrapper
around 'bfd_default_compatible' with some extra handling to avoid mixing x86
and x86_64. 'bfd_default_compatible' looks something like this
(bfd/archures.c):

>>  if (a->arch != b->arch)
>>    return NULL;
>>
>>  if (a->bits_per_word != b->bits_per_word)
>>    return NULL;
>>
>>  if (a->mach > b->mach)
>>    return a;
>>
>>  if (b->mach > a->mach)
>>    return b;
>>
>>  return a;

The idea here is that if 2 architectures share the same 'arch' and are
"word-size-compatible", the one with the higher machine arch (mach) is taken as
a compatible superset of the other. This is mostly correct for i8086 and i386,
but the default operand size is different.

If we go back to gdb we can see where this "compatibility" is an issue
(choose_architecture_for_target in gdb/arch-utils.c):

>> /* BFD's 'A->compatible (A, B)' functions return zero if A and B are
>>      incompatible.  But if they are compatible, it returns the 'more
>>      featureful' of the two arches.  That is, if A can run code
>>      written for B, but B can't run code written for A, then it'll
>>      return A.
>>      Some targets (e.g. MIPS as of 2006-12-04) don't fully
>>      implement this, instead always returning NULL or the first
>>      argument.  We detect that case by checking both directions.  */
>> 
>>   compat1 = selected->compatible (selected, from_target);
>>   compat2 = from_target->compatible (from_target, selected);

This function gets called when you issue the "set architecture ..." command.
'selected' is the newly selected arch (i8086) and 'from_target' is the one
advertised by the target (qemu). Both "compat1" and "compat2" are i386 because
it is the superset.
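A rough sketch of how that two-direction check plays out is below. This is not
gdb's code: 'struct arch_info', 'compatible' and 'choose_arch' are hypothetical
names, the 'mach' values are illustrative, and the decision logic is a
simplified guess at the shape of the real function (it ignores warnings and
any handling of default architectures).

```c
#include <stddef.h>

/* Hypothetical, reduced stand-in for bfd_arch_info. */
struct arch_info {
    int arch;            /* architecture family */
    unsigned long mach;  /* machine variant; higher is treated as a superset */
    const char *name;
};

static const struct arch_info i8086 = { 1, 1, "i8086" };
static const struct arch_info i386  = { 1, 2, "i386" };

/* Simplified compatibility check: same family, higher mach wins. */
static const struct arch_info *
compatible(const struct arch_info *a, const struct arch_info *b)
{
    if (a->arch != b->arch)
        return NULL;
    return a->mach >= b->mach ? a : b;
}

/* Pick an architecture from the user's selection and the target's claim. */
static const struct arch_info *
choose_arch(const struct arch_info *selected,
            const struct arch_info *from_target)
{
    const struct arch_info *compat1, *compat2;

    if (from_target == NULL)
        return selected;     /* target said nothing: user's choice stands */
    if (selected == NULL)
        return from_target;

    /* Check both directions, as in the quoted gdb snippet. */
    compat1 = compatible(selected, from_target);
    compat2 = compatible(from_target, selected);

    if (compat1 == NULL && compat2 == NULL)
        return NULL;         /* genuinely incompatible */
    if (compat1 == NULL)
        return compat2;
    if (compat2 == NULL)
        return compat1;
    if (compat1 == compat2)
        return compat1;      /* both directions agree on the superset */
    return selected;
}
```

In this sketch 'choose_arch (&i8086, &i386)' returns '&i386', while
'choose_arch (&i8086, NULL)' returns '&i8086'. That matches what I see: the
user's selection only sticks when the target does not advertise a "bigger"
compatible architecture.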

The final piece of the puzzle is how bfd chooses the default operand size (it
is not the same as the word size in 'bfd_arch_info'). This happens in
"opcodes/i386-dis.c":

>>   ...
>>   else if (info->mach == bfd_mach_i386_i8086)
>>    {
>>      address_mode = mode_16bit;
>>      priv.orig_sizeflag = 0;
>>    }
>>   ...

So only 'bfd_mach_i386_i8086' gets the special default 16-bit operand size, and
since we lost this information back in 'choose_architecture_for_target', we use
32-bit by default. This is actually the only reference to i8086 in the
disassembler (i386-dis.c).
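To show why the default operand size matters so much for the output, here is a
tiny hypothetical length decoder for a single instruction form, "mov reg, imm"
(opcodes 0xB8..0xBF), with an optional 0x66 operand-size prefix. It has nothing
to do with the real i386-dis.c; 'enum mode' and 'mov_imm_length' are names I
made up for the example.

```c
#include <stddef.h>

enum mode { MODE_16BIT, MODE_32BIT };

/* Length in bytes of "mov reg, imm" (opcodes 0xB8..0xBF), optionally
   preceded by one 0x66 operand-size prefix.  Returns 0 if the bytes are
   not that instruction form or the buffer is too short.  */
static size_t
mov_imm_length(const unsigned char *code, size_t len, enum mode default_mode)
{
    size_t pos = 0;
    int wide = (default_mode == MODE_32BIT);  /* 32-bit immediate? */

    /* The 0x66 prefix toggles the operand size relative to the default. */
    if (pos < len && code[pos] == 0x66) {
        wide = !wide;
        pos++;
    }

    if (pos >= len || code[pos] < 0xB8 || code[pos] > 0xBF)
        return 0;
    pos++;

    pos += wide ? 4 : 2;  /* imm32 or imm16 */
    return pos <= len ? pos : 0;
}
```

For the bytes 'b8 34 12 bb 78 56', a 16-bit default gives a 3-byte instruction
(mov ax, 0x1234, followed by a second mov), while a 32-bit default consumes 5
bytes as one instruction. Every instruction boundary after that point shifts,
which is why real-mode code looks like garbage when disassembled as i386.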

bfd actually provides a few mechanisms to use a 16-bit default operand size
under i386, via the "i8086", "addr16" and "data16" flags (-M... in objdump).
Unfortunately, gdb does not allow disassembler options under i386.

----

I can't decide who is at fault here:
* GDB seems to do the sane thing and take the superset arch, but shouldn't 'set
architecture' overrule everything?
* qemu is indeed emulating an i386 processor, even though it starts in
real mode.
* bfd - are i8086 and i386 really all that compatible? If they are, I wouldn't
expect differences.

----

I actually found a workaround for the issue: if you make your own 'target.xml'
and set the architecture to i8086, GDB will keep this setting.
This is the one I'm using:
https://gist.github.com/MatanShahar/1441433e19637cf1bb46b1aa38a90815

----

I haven't had the chance to test it on gdb <7.9 to see whether the issue is
there or not, but I doubt it's new; most of the code I went through is 9 years
old. I will hopefully find some time later today.
