This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.



Re: x86: AT&T syntax operand size defaults


(re-adding list to Cc)

>>> On 15.11.17 at 14:30, <hjl.tools@gmail.com> wrote:
> On Wed, Nov 15, 2017 at 5:05 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 15.11.17 at 13:46, <hjl.tools@gmail.com> wrote:
>>> On Wed, Nov 15, 2017 at 2:32 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>> On 16.10.17 at 13:24, <hjl.tools@gmail.com> wrote:
>>>>> On Mon, Oct 16, 2017 at 3:09 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>>>> On 13.10.17 at 23:51, <hjl.tools@gmail.com> wrote:
>>>>>>> On 10/13/17, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>>> All,
>>>>>>>>
>>>>>>>> according to the only reasonable document about AT&T assembler
>>>>>>>> syntax (Solaris's / Oracle's "x86 Assembly Language Reference
>>>>>>>> Manual"), operand size is supposed to default to "long".
>>>>>>>>
>>>>>>>> However, of these two
>>>>>>>>
>>>>>>>>      add     $1, (%eax)
>>>>>>>>      add     $0x1234, (%eax)
>>>>>>>>
>>>>>>>> the first indeed defaults to "long" (except in 16-bit mode, but I think
>>>>>>>> that's fine despite what that doc says) while the second causes an
>>>>>>>> error. That's because of
>>>>>>>>
>>>>>>>>        if (i.tm.opcode_modifier.w)
>>>>>>>>          {
>>>>>>>>            as_bad (_("no instruction mnemonic suffix given and "
>>>>>>>>                      "no register operands; can't size instruction"));
>>>>>>>>            return 0;
>>>>>>>>          }
>>>>>>>>
>>>>>>>> in process_suffix(): The pattern for the 8-bit sign extended
>>>>>>>> immediate does not have W set, while most other instructions
>>>>>>>> allowing for no register operands at all have it set. I'm of the
>>>>>>>> strong opinion that the behavior of the assembler should at least
>>>>>>>> be consistent, i.e. in particular it should not depend on the value
>>>>>>>> of an immediate.
>>>>>>>>
>>>>>>>> Which way to make it consistent, though, I'm not sure about:
>>>>>>>> It could be made to match Intel syntax behavior, where an error is
>>>>>>>> being flagged whenever multiple operand sizes are permitted for
>>>>>>>> a mnemonic (that's imo the model most helpful to the programmer),
>>>>>>>> or it could be made to match that doc by simply removing the as_bad()
>>>>>>>> invocation above (which is the model accepting the widest set of
>>>>>>>> originally non-gas sources). Of course it would be possible to have
>>>>>>>> the user select between the two by command line option and/or
>>>>>>>> directive, but even then we would need to settle on what default
>>>>>>>> behavior should be.
>>>>>>>
>>>>>>> I agree that AT&T syntax is poorly documented.   As for this specific
>>>>>>> case, I am OK with either option as long as it doesn't break existing
>>>>>>> code.
>>>>
>>>> So there's one first fundamental roadblock here: Quite a few FPU
>>>> insns specify Dword|Qword|Unspecified. It seems rather obvious to me
>>>> that this can't be right (Unspecified should only be used when only a
>>>> single operand size is permitted), but even the disassembler doesn't
>>>> add an 's' suffix for at least the integer form ones. Would you agree
>>>> with fixing the disassembler independently of that other work?
>>>>
>>>
>>> Will this require changing assembler testcase inputs?  Will the disassembler
>>> output match the input?
>>
>> Well, I'd do the change in steps (in part depending on your feedback):
>> As a first step I'd like to fix the disassembler output, perhaps without
>> touching the input files (but bringing them out of sync, yet as there
>> are numerous examples where input and output don't fully match, I
> 
> By "match", I mean I can tell assembler output is correct.  It is OK that
> the disassembler output isn't exactly the same as the assembler input.

Oh, perhaps you've assigned stronger meaning to "exact" than I
had meant: Of course there can be cosmetic differences. What I
mean here (and what is the case elsewhere) are differences in
suffix between input and disassembler output.

>> would have thought this is no problem; let me know your preference).
>> As a second step I'd like to change all test cases (inputs, and where
>> necessary outputs) where we wrongly rely on ambiguous behavior.
> 
> Sure.
> 
>> The 3rd and final step then would be to actually change the assembler.
>>
>> As you want me to have checked a Linux build against an assembler
>> changed in this way (and I'd imply a gcc bootstrap would be helpful to
>> do too, plus perhaps also using that resulting gcc for said Linux build),
>> this last step will take me a while. I would nevertheless hope that you
>> could agree with doing the first two steps earlier.
> 
> Such assembler change must pass Linux kernel build for both i386 and x86-64
> as well as GCC and glibc build/test.

As said, I did imply gcc and Linux. I don't think I'm in the position to
try glibc - I've never built that, and I wasn't planning to. You also
didn't make this a requirement back when I had first started this
thread.

Jan

