This is the mail archive of the
binutils@sourceware.org
mailing list for the binutils project.
Re: [PATCH 4/4] x86: fold certain AVX and AVX2 templates
On Fri, Dec 15, 2017 at 8:32 AM, Jan Beulich <jbeulich@suse.com> wrote:
>>>> "H.J. Lu" <hjl.tools@gmail.com> 12/15/17 2:10 PM >>>
>>On Fri, Dec 15, 2017 at 2:35 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>> Just like for instructions acting on GPRs, there's no need to have separate
>>> templates for otherwise identical insns acting on XMM or YMM registers
>>> (or memory of the same size).
>>>
>>> gas/
>>> 2017-12-15 Jan Beulich <jbeulich@suse.com>
>>>
>>> * config/tc-i386.c (regymm, regzmm): Delete.
>>> (operand_type_register_match): Extend comment. Also handle some
>>> memory operands here. Extend to cover .regsimd.
>>> (build_vex_prefix): Derive vector_length from actual operand
>>> size.
>>> (process_operands, build_modrm_byte): Use .regsimd.
>>>
>>> opcodes/
>>> 2017-12-15 Jan Beulich <jbeulich@suse.com>
>>>
>>> * i386-gen.c (operand_type_init): Delete OPERAND_TYPE_REGYMM and
>>> OPERAND_TYPE_REGZMM entries.
>>> * i386-opc.h (enum of opcode modifiers): Extend comment.
>>> * i386-opc.tbl (vaddpd, vaddps, vaddsubpd, vaddsubps, vandnpd,
>>> vandnps, vandpd, vandps, vblendpd, vblendps, vblendvpd,
>>> vblendvps, vbroadcastss, vcmpeq_ospd, vcmpeq_osps, vcmpeqpd,
>>> vcmpeqps, vcmpeq_uqpd, vcmpeq_uqps, vcmpeq_uspd, vcmpeq_usps,
>>> vcmpfalse_ospd, vcmpfalse_osps, vcmpfalsepd, vcmpfalseps,
>>> vcmpge_oqpd, vcmpge_oqps, vcmpgepd, vcmpgeps, vcmpgt_oqpd,
>>> vcmpgt_oqps, vcmpgtpd, vcmpgtps, vcmple_oqpd, vcmple_oqps,
>>> vcmplepd, vcmpleps, vcmplt_oqpd, vcmplt_oqps, vcmpltpd,
>>> vcmpltps, vcmpneq_oqpd, vcmpneq_oqps, vcmpneq_ospd,
>>> vcmpneq_osps, vcmpneqpd, vcmpneqps, vcmpneq_uspd, vcmpneq_usps,
>>> vcmpngepd, vcmpngeps, vcmpnge_uqpd, vcmpnge_uqps, vcmpngtpd,
>>> vcmpngtps, vcmpngt_uqpd, vcmpngt_uqps, vcmpnlepd, vcmpnleps,
>>> vcmpnle_uqpd, vcmpnle_uqps, vcmpnltpd, vcmpnltps, vcmpnlt_uqpd,
>>> vcmpnlt_uqps, vcmpordpd, vcmpordps, vcmpord_spd, vcmpord_sps,
>>> vcmppd, vcmpps, vcmptruepd, vcmptrueps, vcmptrue_uspd,
>>> vcmptrue_usps, vcmpunordpd, vcmpunordps, vcmpunord_spd,
>>> vcmpunord_sps, vcvtdq2ps, vcvtpd2dq, vcvtpd2ps, vcvtps2dq,
>>> vcvttpd2dq, vcvttps2dq, vdivpd, vdivps, vdpps, vhaddpd, vhaddps,
>>> vhsubpd, vhsubps, vlddqu, vmaskmovpd, vmaskmovps, vmaxpd,
>>> vmaxps, vminpd, vminps, vmovapd, vmovaps, vmovdqa, vmovdqu,
>>> vmovmskpd, vmovmskps, vmovntdq, vmovntpd, vmovntps, vmovshdup,
>>> vmovsldup, vmovupd, vmovups, vmulpd, vmulps, vorpd, vorps,
>>> vpermilpd, vpermilps, vptest, vrcpps, vroundpd, vroundps,
>>> vrsqrtps, vshufpd, vshufps, vsqrtpd, vsqrtps, vsubpd, vsubps,
>>> vtestpd, vtestps, vunpckhpd, vunpckhps, vunpcklpd, vunpcklps,
>>> vxorpd, vxorps, vpblendd, vpbroadcastb, vpbroadcastd,
>>> vpbroadcastw, vpbroadcastq, vpmaskmovd, vpmaskmovq, vpsllvd,
>>> vpsllvq, vpsravd, vpsravq, vpsrlvd, vpsrlvq): Fold 128- and
>>> 256-bit forms. Use CheckRegSize instead of IgnoreSize where
>>> appropriate. Drop Xmmword and Ymmword from the results where
>>> possible.
>>> * i386-tbl.h: Re-generate.
>>> ---
>>> For some yet to be understood reason folding the memory forms of
>>> vcvtpd2ps doesn't work (some Intel mode ymmword ptr forms produce
>>> 128-bit insns).
>>
>>Integer extension instructions also take 2 register operands of different
>>sizes. How are they handled?
>
> As per the list of changed insns, conversions to/from scalar int aren't being
> folded, so their handling doesn't change. And quite obviously so, since no
> matter what the GPR size, the other side is an xmmword (register or
> memory), while here I'm folding only templates where one used xmmword
> and the other ymmword.
>
>>> Similarly I didn't figure out yet the reason for an anomaly when the
>>> "unspecified" checks in operand_type_register_match() are missing: In
>>> that case I've observed errors on vaddsubp{s,d}, but not on e.g.
>>> vaddp{s,d} with identical operands.
>>
>>Please open a bug with a testcase.
>
> You perhaps misunderstood: I've observed this issue while putting together
> the patch here. I'm not aware of an issue without the patch applied, nor with
> the patch in its current form applied. I'm merely pointing out that there is
> a _possible_ issue pointed out by this anomaly. This could e.g. be the result
> of some latent bug somewhere which was triggered by the not-yet-correct
> patch. I'm intending to investigate this, but I can't predict when this will be;
> I've put the note here in case the observation triggers something for you or
> anyone else who reads this, which might then help me save some time
> needlessly investigating what's going on there.
>
Patch is OK then.
Thanks.
--
H.J.
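
The folding idea discussed in this thread (one template covering both the 128- and 256-bit forms, with the vector length derived from the operands actually written) can be illustrated with a minimal sketch. This is not code from the patch: the types and function names below (simd_size, operand, simd_sizes_match, vex_l_bit) are hypothetical stand-ins for illustration, not the real gas/opcodes data structures, and the sketch only covers insns whose SIMD operands all have the same size. The one fact taken as given is the VEX encoding itself, where VEX.L = 0 selects the 128-bit form and VEX.L = 1 the 256-bit form.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for illustration only; these are
   not the actual gas/opcodes data structures.  */
enum simd_size { SIMD_NONE = 0, SIMD_XMM = 128, SIMD_YMM = 256 };

struct operand
{
  enum simd_size size;  /* SIMD_NONE for non-SIMD operands.  */
};

/* Size check in the spirit of CheckRegSize: every SIMD operand of a
   folded template must have the same size.  Returns 1 on a match.  */
static int
simd_sizes_match (const struct operand *ops, size_t n)
{
  enum simd_size seen = SIMD_NONE;

  for (size_t i = 0; i < n; i++)
    {
      if (ops[i].size == SIMD_NONE)
        continue;
      if (seen != SIMD_NONE && ops[i].size != seen)
        return 0;
      seen = ops[i].size;
    }
  return 1;
}

/* Derive VEX.L from the operands actually written rather than from
   the template: 0 selects the 128-bit form, 1 the 256-bit form.  */
static int
vex_l_bit (const struct operand *ops, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (ops[i].size == SIMD_YMM)
      return 1;
  return 0;
}
```

With a single folded template, an all-XMM operand list and an all-YMM operand list both pass the size check and differ only in the derived length bit, whereas a list mixing XMM and YMM operands is rejected by the check in this simplified model.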