This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] posix_spawn: use a larger min stack for -fstack-check [BZ #21253]


On 03/20/2017 10:04 AM, Mike Frysinger wrote:
> On 20 Mar 2017 09:56, Florian Weimer wrote:
>> On 03/16/2017 10:52 PM, Mike Frysinger wrote:
>>> i picked 32 KiB as a trade off of it being more than big enough and
>>> being a nice multiple of 4, 8, and 16, while not overtly allocating
>>> too much otherwise unused.
>>
>> Depending on what constant page size GCC uses, this will not fix the
>> -fstack-check bug on all architectures.
>
> do you know of one today ?  i grepped through the latest gcc source to
> come up with the numbers that i used, and i think 32 KiB should cover
> them all.

GCC stack probing assumes a 4 KiB page size, which should be a conservative assumption almost everywhere. But the first probe is off by a few bytes. This seems to be a common issue across many targets, e.g. with GCC 6 on aarch64:

void external(void *);

void
int1 (void)
{
  int a;
  external (&a);
}


        .text
        .align  2
        .p2align 3,,7
        .global int1
        .type   int1, %function
int1:
.LFB0:
        .cfi_startproc
        sub     x9, sp, #8192
        str     xzr, [x9, 4064]
        stp     x29, x30, [sp, -32]!
        .cfi_def_cfa_offset 32
        .cfi_offset 29, -32
        .cfi_offset 30, -24
        add     x29, sp, 0
        .cfi_def_cfa_register 29
        add     x0, x29, 28
        bl      external
        ldp     x29, x30, [sp], 32
        .cfi_restore 30
        .cfi_restore 29
        .cfi_def_cfa 31, 0
        ret
        .cfi_endproc
.LFE0:
        .size   int1, .-int1

Or ppc64le:

        .abiversion 2
        .section        ".text"
        .align 2
        .p2align 4,,15
        .globl int1
        .type   int1, @function
int1:
.LCF0:
0:      addis 2,12,.TOC.-.LCF0@ha
        addi 2,2,.TOC.-.LCF0@l
        .localentry     int1,.-int1
        mflr 0
        std 0,-16432(1)
        std 0,16(1)
        stdu 1,-48(1)
        addi 3,1,32
        bl external
        nop
        addi 1,1,48
        ld 0,16(1)
        mtlr 0
        blr
        .long 0
        .byte 0,0,0,1,128,0,0,0
        .size   int1,.-int1


Or x86_64:

        .text
        .p2align 4,,15
        .globl  int1
        .type   int1, @function
int1:
.LFB0:
        .cfi_startproc
        subq    $4152, %rsp
        orq     $0, (%rsp)
        addq    $4128, %rsp
        .cfi_def_cfa_offset 32
        leaq    12(%rsp), %rdi
        call    external
        addq    $24, %rsp
        .cfi_def_cfa_offset 8
        ret
        .cfi_endproc
.LFE0:
        .size   int1, .-int1
        .p2align 4,,15

I know x86-64 best, so I'm going to explain the issue there. Suppose that at the time of the call instruction, %rsp (the stack pointer) ends with …3010. The call pushes the 8-byte return address, so on function entry, %rsp == …3008. Decrementing %rsp by 4152 results in …1FD0. The 8-byte stack probe touches the bytes from …1FD0 to …1FD7. This skips over the guard page at …2000 (which extends to …2FFF).

The other problem is that GCC does not take into account the stack probe implied by the call instruction, or other forms of linear stack access. This leads to grossly inefficient code with -fstack-check.

I actually think that compiling with -fstack-check is a good idea, but GCC really needs fixes first. We do not know the root cause of these issues (note the 3 * 4096 probe offset on POWER), and just throwing random workarounds into glibc does not seem right.

Thanks,
Florian

