This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.



Re: x86: AT&T syntax operand size defaults


On Wed, Nov 15, 2017 at 6:03 AM, Jan Beulich <JBeulich@suse.com> wrote:
> (re-adding list to Cc)
>
>>>> On 15.11.17 at 14:30, <hjl.tools@gmail.com> wrote:
>> On Wed, Nov 15, 2017 at 5:05 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>> On 15.11.17 at 13:46, <hjl.tools@gmail.com> wrote:
>>>> On Wed, Nov 15, 2017 at 2:32 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>>> On 16.10.17 at 13:24, <hjl.tools@gmail.com> wrote:
>>>>>> On Mon, Oct 16, 2017 at 3:09 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>>>>> On 13.10.17 at 23:51, <hjl.tools@gmail.com> wrote:
>>>>>>>> On 10/13/17, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>>>> All,
>>>>>>>>>
>>>>>>>>> according to the only reasonable document about AT&T assembler
>>>>>>>>> syntax (Solaris's / Oracle's "x86 Assembly Language Reference
>>>>>>>>> Manual"), operand size is supposed to default to "long".
>>>>>>>>>
>>>>>>>>> However, of these two
>>>>>>>>>
>>>>>>>>>      add     $1, (%eax)
>>>>>>>>>      add     $0x1234, (%eax)
>>>>>>>>>
>>>>>>>>> the first indeed defaults to "long" (except in 16-bit mode, but I think
>>>>>>>>> that's fine despite what that doc says) while the second causes an
>>>>>>>>> error. That's because of
>>>>>>>>>
>>>>>>>>>        if (i.tm.opcode_modifier.w)
>>>>>>>>>          {
>>>>>>>>>            as_bad (_("no instruction mnemonic suffix given and "
>>>>>>>>>                      "no register operands; can't size instruction"));
>>>>>>>>>            return 0;
>>>>>>>>>          }
>>>>>>>>>
>>>>>>>>> in process_suffix(): The pattern for the 8-bit sign extended
>>>>>>>>> immediate does not have W set, while most other instructions
>>>>>>>>> allowing for no register operands at all have it set. I'm of the
>>>>>>>>> strong opinion that the behavior of the assembler should at least
>>>>>>>>> be consistent, i.e. in particular it should not depend on the value
>>>>>>>>> of an immediate.
>>>>>>>>>
>>>>>>>>> Which way to make it consistent, though, I'm not sure about:
>>>>>>>>> It could be made to match Intel syntax behavior, where an error is
>>>>>>>>> being flagged whenever multiple operand sizes are permitted for
>>>>>>>>> a mnemonic (that's imo the model most helpful to the programmer),
>>>>>>>>> or it could be made to match that doc by simply removing the as_bad()
>>>>>>>>> invocation above (which is the model accepting the widest set of
>>>>>>>>> originally non-gas sources). Of course it would be possible to have
>>>>>>>>> the user select between the two by command line option and/or
>>>>>>>>> directive, but even then we would need to settle on what default
>>>>>>>>> behavior should be.
>>>>>>>>
>>>>>>>> I agree that AT&T syntax is poorly documented.   As for this specific
>>>>>>>> case, I am OK with either option as long as it doesn't break existing
>>>>>>>> code.
>>>>>
>>>>> So there's one first fundamental roadblock here: Quite a few FPU
>>>>> insns specify Dword|Qword|Unspecified. It seems rather obvious to me
>>>>> that this can't be right (Unspecified should only be used when only a
>>>>> single operand size is permitted), but even the disassembler doesn't
>>>>> add an 's' suffix for at least the integer form ones. Would you agree
>>>>> with fixing the disassembler independently of that other work?
>>>>>
>>>>
>>>> Will this require changing assembler testcase inputs?  Will the disassembler
>>>> output match the input?
>>>
>>> Well, I'd do the change in steps (in part depending on your feedback):
>>> As a first step I'd like to fix the disassembler output, perhaps without
>>> touching the input files (but bringing them out of sync, yet as there
>>> are numerous examples where input and output don't fully match, I
>>
>> By "match", I mean I can tell assembler output is correct.  It is OK that
>> the disassembler output isn't exactly the same as the assembler input.
>
> Oh, perhaps you've assigned stronger meaning to "exact" than I
> had meant: Of course there can be cosmetic differences. What I
> mean here (and what is the case elsewhere) are differences in
> suffix between input and disassembler output.
>
>>> would have thought this is no problem; let me know your preference).
>>> As a second step I'd like to change all test cases (inputs, and where
>>> necessary outputs) where we wrongly rely on ambiguous behavior.
>>
>> Sure.
>>
>>> The 3rd and final step then would be to actually change the assembler.
>>>
>>> As you want me to have checked a Linux build against an assembler
>>> changed in this way (and I'd imply a gcc bootstrap would be helpful to
>>> do too, plus perhaps also using that resulting gcc for said Linux build),
>>> this last step will take me a while. I would nevertheless hope that you
>>> could agree with doing the first two steps earlier.
>>
>> Such an assembler change must pass Linux kernel builds for both i386 and
>> x86-64 as well as GCC and glibc build/test.
>
> As said, I did imply gcc and Linux. I don't think I'm in a position to
> try glibc - I've never built that, and I wasn't planning to. You also
> didn't make this a requirement back when I had first started this
> thread.

You check kernel and GCC.  I will check glibc after your patch is merged.

Thanks.


-- 
H.J.
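
The inconsistency described at the start of the thread can be sketched as a
small C model. This is only an illustration under stated assumptions, not the
gas implementation: the struct, the two template objects, and both function
names are hypothetical. The only facts taken from the thread are that the
pattern for the sign-extended 8-bit immediate lacks W, that the other pattern
has it set, and that process_suffix() reports an error when W is set and
there is neither a suffix nor a register operand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for an instruction template. */
struct insn_template {
    const char *name;
    bool w;              /* models i.tm.opcode_modifier.w */
};

enum size_result { SIZE_LONG, SIZE_ERROR };

/* Two assumed templates for "add $imm, mem": per the thread, the
   sign-extended 8-bit immediate form lacks W, the wider form has it. */
static const struct insn_template add_imm8s = { "add imm8s, r/m", false };
static const struct insn_template add_imm   = { "add imm, r/m",   true  };

/* Assumption: the immediate's value alone selects the template. */
static const struct insn_template *match_template(long imm)
{
    return (imm >= -128 && imm <= 127) ? &add_imm8s : &add_imm;
}

/* Models the quoted check for an insn with no suffix and no register
   operands: W set means "can't size instruction", otherwise "long". */
static enum size_result size_without_suffix(const struct insn_template *t)
{
    assert(t != NULL);
    return t->w ? SIZE_ERROR : SIZE_LONG;
}
```

In this model, size_without_suffix(match_template(1)) yields SIZE_LONG while
size_without_suffix(match_template(0x1234)) yields SIZE_ERROR, which mirrors
the complaint that the result depends on the value of the immediate.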

