This is the mail archive of the
mailing list for the binutils project.
Re: [x86-64 psABI]: Extend x86-64 psABI to support AVX-512
- From: Rich Felker <dalias at aerifal dot cx>
- To: Ondřej Bílka <neleai at seznam dot cz>
- Cc: Jakub Jelinek <jakub at redhat dot com>, Richard Biener <richard dot guenther at gmail dot com>, "H.J. Lu" <hjl dot tools at gmail dot com>, GNU C Library <libc-alpha at sourceware dot org>, GCC Development <gcc at gcc dot gnu dot org>, Binutils <binutils at sourceware dot org>, "Girkar, Milind" <milind dot girkar at intel dot com>, "Kreitzer, David L" <david dot l dot kreitzer at intel dot com>
- Date: Sat, 27 Jul 2013 12:12:57 -0400
- Subject: Re: [x86-64 psABI]: Extend x86-64 psABI to support AVX-512
- References: <CAMe9rOrvMxSLj3LcYBs71tVdw6C0vJFKD2HxvnoHc13UamftwA at mail dot gmail dot com> <ddab98c2-bb3b-4d02-b403-e7d5690cfe00 at email dot android dot com> <CAMe9rOpxErCVtE-PDZ3Yb9mL+4E+XQ-are9Df4YBbEioj+MmZA at mail dot gmail dot com> <b9c5d467-834a-4b57-b48c-ac4bb450c9e5 at email dot android dot com> <20130725030655 dot GL14138 at laptop dot redhat dot com> <20130725065538 dot GA18427 at domone dot kolej dot mff dot cuni dot cz> <20130725165053 dot GJ4284 at brightrain dot aerifal dot cx> <20130727154405 dot GA25725 at domone dot kolej dot mff dot cuni dot cz>
On Sat, Jul 27, 2013 at 05:44:05PM +0200, Ondřej Bílka wrote:
> On Thu, Jul 25, 2013 at 12:50:53PM -0400, Rich Felker wrote:
> > On Thu, Jul 25, 2013 at 08:55:38AM +0200, Ondřej Bílka wrote:
> > > On Thu, Jul 25, 2013 at 05:06:55AM +0200, Jakub Jelinek wrote:
> > > > On Wed, Jul 24, 2013 at 07:36:31PM +0200, Richard Biener wrote:
> > > > > >Make them callee saved means we need to change ld.so to
> > > > > >preserve them and we need to change unwind library to
> > > > > >support them. It is certainly doable.
> > > > >
> > > > > IMHO it was a mistake to not have any callee saved xmm register in the
> > > > > original abi - we should fix this at this opportunity. Loops with
> > > > > function calls are not that uncommon.
> > > >
> > > > I've raised that earlier already. One issue with that beyond having to
> > > > teach unwinders about this (dynamic linker if you mean only for the lazy PLT
> > > > resolving is only a matter of whether the dynamic linker itself has been
> > > > built with a compiler that would clobber those registers anywhere) is that
> > > > as history shows, the vector registers keep growing over time.
> > > > So if we reserve now either 8 or all 16 zmm16 to zmm31 registers as call
> > > > saved, do we save them as 512 bit registers, or say 1024 bit already?
> > >
> > > We shouldn't save them all as we would often need to unnecessarily save
> > > register in leaf function. I am fine with 8. In practice 4 should be
> > > enough for most use cases.
> >
> > You can't add call-saved registers without breaking the ABI, because
> > they need to be saved in the jmp_buf, which does not have space for
> > them.
>
> Well you can. Use versioning, structure will not change and layout for
> old setjmp/longjmp is unchanged. For new setjmp we set jump address to
> jmp_buf address to distinguish it from first case. Then for each thread
> we keep a stack with extra space needed to save additional registers.
> When setjmp/longjmp is called we prune frames from exited functions.

This requires unbounded storage, which does not exist. From a practical
standpoint you would either have to reserve a huge amount of storage
(e.g. double the allocated thread stack size and use half of it as
reserved space for jmp_buf) or make the calling program crash when the
small, reasonable amount of reserved space is exhausted. The latter is
highly unacceptable since the main purpose (IMO) of jmp_buf is to
work around bad library code that can't handle resource exhaustion by
replacing its 'xmalloc' type functions with ones that longjmp to a
thread-local jmp_buf set by the caller (e.g. this is the only way to
use glib robustly).
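
A minimal sketch of that pattern, for illustration only. Every name in
it (xmalloc, alloc_limit, oom_env, library_work, guarded_library_work)
is hypothetical and not taken from glib or any other library, and
alloc_limit exists only so that exhaustion can be simulated. Note that
the jmp_buf is ordinary fixed-size storage in the caller's frame, which
is exactly why it has no room for additional registers. Memory the
library allocated before the jump is leaked here; a real wrapper would
have to deal with that.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* All names here are hypothetical; none come from glib or another library. */

/* Test knob (assumption): requests above this size are treated as failed,
 * so exhaustion can be simulated without actually running out of memory. */
size_t alloc_limit = SIZE_MAX;

/* Recovery point installed by the caller for the current thread, or NULL. */
static _Thread_local jmp_buf *oom_env;

/* Replacement for an abort-on-failure allocator: on failure, jump back to
 * the caller's recovery point instead of terminating the process. */
void *xmalloc(size_t n)
{
    void *p = (n <= alloc_limit) ? malloc(n) : NULL;
    if (!p) {
        if (oom_env)
            longjmp(*oom_env, 1);
        abort();                /* no recovery point installed */
    }
    return p;
}

/* Stand-in for library code that never checks for allocation failure.
 * Expects n >= 1. */
static void library_work(size_t n)
{
    char *buf = xmalloc(n);
    assert(buf != NULL);        /* xmalloc never returns NULL */
    buf[0] = 0;
    free(buf);
}

/* Returns 0 on success, -1 if the "library" ran out of memory. */
int guarded_library_work(size_t n)
{
    jmp_buf env;                /* fixed-size storage in the caller's frame */
    jmp_buf *saved = oom_env;   /* remember the outer recovery point */
    int rc = 0;

    oom_env = &env;
    if (setjmp(env) == 0)
        library_work(n);        /* normal path */
    else
        rc = -1;                /* reached via longjmp from xmalloc */
    oom_env = saved;
    return rc;
}
```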

By the way, I do have another horrible idea for how you could do it.
glibc's jmp_buf is actually a sigjmp_buf and contains 120 wasted bytes
of sigset_t for nonexistent HURD signals. So you could store a few
registers after the actually-used part of the sigset_t.
> > Also, unless you add them at the same time the registers are added to
> > the machine (so there's no existing code using those registers),
> > you'll have ABI problems like this: function using the new call-saved
> > registers calls qsort, which calls application code, which assumes the
> > registers are call-clobbered and clobbers them; after qsort returns,
> > the original caller's state is gone.
>
> What are you talking about? Do you mean that user wrongly marked qsort
> as a function that does not clobber arguments?

OK, you're obviously thinking of some kind of special way of tagging
individual functions as preserving new registers, rather than whole
object or shared library files, in which case it's plausible that you
can make this part work.