This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: New optimized string routines for Intel and alignment of stack.
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: Florian Weimer <fweimer at redhat dot com>
- Cc: "Carlos O'Donell" <carlos at redhat dot com>, GNU C Library <libc-alpha at sourceware dot org>
- Date: Tue, 7 Jun 2016 06:18:19 -0700
- Subject: Re: New optimized string routines for Intel and alignment of stack.
- References: <57566200 dot 2040203 at redhat dot com> <dea8c68f-cc02-9427-4e54-acd795a930cf at redhat dot com>
On Tue, Jun 7, 2016 at 2:52 AM, Florian Weimer <fweimer@redhat.com> wrote:
> On 06/07/2016 07:56 AM, Carlos O'Donell wrote:
>>
>> H.J.,
>>
>> We have had several users that have built legacy applications
>> for 32-bit x86 with stack alignment that does not match the
>> ABI.
>
>
> Let's say the GNU project broke the i386 ABI, which is more accurate. The
> stack pointer alignment requirement is a recent change.
The psABI change happened when SSE support was added to GCC
almost 20 years ago. The i386 psABI published last year made
the GCC change official :-(.
>> In all of these cases it has to do with the application
>> having been compiled with -falign-stack=assume-4-byte which
>> violates the ABI, usually with icc. However, if you're careful
>> it all just works.
>
>
> It will get worse with increased vectorization and GCC 6. We already saw
> this on x86_64 with the non-compliant malloc in tcsh, where GCC 6 used
> vector instructions to copy a struct dirstream object. I assume this could
> easily happen with any stack-to-stack copy with SSE2 enabled.
>
> Currently, GCC does not seem to exploit the fact that it knows the alignment
> of stack objects. I played with this:
>
> struct fields
> {
>   double a, b;
> };
>
> struct fields get (void);
> void put (struct fields *, struct fields *);
>
> void
> copy (void)
> {
>   struct fields f1 = get ();
>   struct fields f2 = f1;
>   put (&f1, &f2);
> }
>
> And: gcc -m32 -O3 -msse2 -march=westmere -mtune=westmere -o- -S
> stack-align.c
>
> I expected to see an SSE load/store for the copy, but that's not what I got.
>
> I think we need to decide if we want to roll back the ABI change before GCC
> learns about this optimization because eventually, it will not just be a
> matter of string routines. Any glibc code optimized for 32-bit x86 CPUs
> with SSE2 enabled could be affected.
>
We can compile i686 glibc with -mstackrealign:
    '-mstackrealign'
        Realign the stack at entry. On the x86, the '-mstackrealign'
        option generates an alternate prologue and epilogue that realigns
        the run-time stack if necessary. This supports mixing legacy codes
        that keep 4-byte stack alignment with modern codes that keep
        16-byte stack alignment for SSE compatibility. See also the
        attribute 'force_align_arg_pointer', applicable to individual
        functions.
GCC 4.4 or later should generate quite decent code with this option.
But other libraries still require 16-byte stack alignment.
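
For individual entry points, the 'force_align_arg_pointer' attribute
mentioned in the documentation above can be sketched as follows. This
is only an illustration, not glibc code: the names REALIGN_STACK and
sum_fields are made up for the example, and the struct is borrowed
from Florian's test case. It assumes GCC (or a compatible compiler) in
C11 mode on x86.

```c
#include <stdint.h>

/* REALIGN_STACK is a hypothetical helper macro for this sketch.  The
   attribute itself is the GCC x86 function attribute named in the
   -mstackrealign documentation.  */
#if defined (__GNUC__) && (defined (__i386__) || defined (__x86_64__))
# define REALIGN_STACK __attribute__ ((force_align_arg_pointer))
#else
# define REALIGN_STACK
#endif

struct fields
{
  double a, b;
};

/* Hypothetical entry point that a legacy caller with a 4-byte-aligned
   stack might invoke.  The attribute asks the compiler to realign the
   stack in the prologue, so the 16-byte-aligned local below is safe
   for aligned SSE loads and stores.  */
REALIGN_STACK double
sum_fields (const struct fields *in)
{
  _Alignas (16) struct fields tmp = *in;

  /* Sanity check: report a misaligned local instead of faulting.  */
  if ((uintptr_t) &tmp % 16 != 0)
    return -1.0;

  return tmp.a + tmp.b;
}
```

Like -mstackrealign, this only protects the functions it is applied
to; callees built with the 16-byte assumption are still relying on
the annotated function to hand them an aligned stack.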
--
H.J.