This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [x86-64 psABI]: Extend x86-64 psABI to support AVX-512
- From: Ondřej Bílka <neleai at seznam dot cz>
- To: Richard Henderson <rth at twiddle dot net>
- Cc: Richard Biener <richard dot guenther at gmail dot com>, "H.J. Lu" <hjl dot tools at gmail dot com>, GNU C Library <libc-alpha at sourceware dot org>, GCC Development <gcc at gcc dot gnu dot org>, Binutils <binutils at sourceware dot org>, "Girkar, Milind" <milind dot girkar at intel dot com>, "Kreitzer, David L" <david dot l dot kreitzer at intel dot com>
- Date: Wed, 24 Jul 2013 20:52:33 +0200
- Subject: Re: [x86-64 psABI]: Extend x86-64 psABI to support AVX-512
- References: <CAMe9rOrvMxSLj3LcYBs71tVdw6C0vJFKD2HxvnoHc13UamftwA at mail dot gmail dot com> <ddab98c2-bb3b-4d02-b403-e7d5690cfe00 at email dot android dot com> <51F01C0A dot 5050101 at twiddle dot net>
On Wed, Jul 24, 2013 at 08:25:14AM -1000, Richard Henderson wrote:
> On 07/24/2013 05:23 AM, Richard Biener wrote:
> > "H.J. Lu" <hjl.tools@gmail.com> wrote:
> >
> >> Hi,
> >>
> >> Here is a patch to extend x86-64 psABI to support AVX-512:
> >
> > Afaik avx 512 doubles the amount of xmm registers. Can we get them callee saved please?
>
> Having them callee saved pre-supposes that one knows the width of the register.
>
> There's room in the instruction set for avx1024. Does anyone believe that is
> not going to appear in the next few years?
>
It would be a mistake for Intel to focus on avx1024. You hit diminishing
returns, and only a few workloads would benefit from loading 128 bytes at once.
The problem with vectorization is that it becomes memory bound, so you will
not gain much, because performance is dominated by cache throughput.
You would get a bigger speedup from more effective pipelining, more
fusion...