This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH] posix_spawn: use a larger min stack for -fstack-check [BZ #21253]
On 20/03/2017 10:26, Florian Weimer wrote:
> On 03/20/2017 10:04 AM, Mike Frysinger wrote:
>> On 20 Mar 2017 09:56, Florian Weimer wrote:
>>> On 03/16/2017 10:52 PM, Mike Frysinger wrote:
>>>> i picked 32 KiB as a trade off of it being more than big enough and
>>>> being a nice multiple of 4, 8, and 16, while not overtly allocating
>>>> too much otherwise unused.
>>>
>>> Depending on what constant page size GCC uses, this will not fix the
>>> -fstack-check bug on all architectures.
>>
>> do you know of one today ? i grepped through the latest gcc source to
>> come up with the numbers that i used, and i think 32 KiB should cover
>> them all.
>
> GCC stack probing assumes a 4 KiB page size, which should be a conservative assumption almost everywhere. But the first probe is off by a few bytes. This seems to be a common issue across many targets, e.g. on GCC 6 for aarch64:
>
> void external(void *);
>
> void
> int1 (void)
> {
>   int a;
>   external (&a);
> }
>
>
>         .text
>         .align 2
>         .p2align 3,,7
>         .global int1
>         .type int1, %function
> int1:
> .LFB0:
>         .cfi_startproc
>         sub x9, sp, #8192
>         str xzr, [x9, 4064]
>         stp x29, x30, [sp, -32]!
>         .cfi_def_cfa_offset 32
>         .cfi_offset 29, -32
>         .cfi_offset 30, -24
>         add x29, sp, 0
>         .cfi_def_cfa_register 29
>         add x0, x29, 28
>         bl external
>         ldp x29, x30, [sp], 32
>         .cfi_restore 30
>         .cfi_restore 29
>         .cfi_def_cfa 31, 0
>         ret
>         .cfi_endproc
> .LFE0:
>         .size int1, .-int1
>
> Or ppc64le:
>
>         .abiversion 2
>         .section ".text"
>         .align 2
>         .p2align 4,,15
>         .globl int1
>         .type int1, @function
> int1:
> .LCF0:
> 0:      addis 2,12,.TOC.-.LCF0@ha
>         addi 2,2,.TOC.-.LCF0@l
>         .localentry int1,.-int1
>         mflr 0
>         std 0,-16432(1)
>         std 0,16(1)
>         stdu 1,-48(1)
>         addi 3,1,32
>         bl external
>         nop
>         addi 1,1,48
>         ld 0,16(1)
>         mtlr 0
>         blr
>         .long 0
>         .byte 0,0,0,1,128,0,0,0
>         .size int1,.-int1
>
>
> Or x86_64:
>
>         .text
>         .p2align 4,,15
>         .globl int1
>         .type int1, @function
> int1:
> .LFB0:
>         .cfi_startproc
>         subq $4152, %rsp
>         orq $0, (%rsp)
>         addq $4128, %rsp
>         .cfi_def_cfa_offset 32
>         leaq 12(%rsp), %rdi
>         call external
>         addq $24, %rsp
>         .cfi_def_cfa_offset 8
>         ret
>         .cfi_endproc
> .LFE0:
>         .size int1, .-int1
>         .p2align 4,,15
>
> I know x86-64 best, so I'm going to explain the issue there. Suppose that at the time of the call instruction, %rsp (the stack pointer) ends with …3010. The call pushes the 8-byte return address, so on function entry, %rsp == …3008. Decrementing %rsp by 4152 results in …1FD0. The 8-byte stack probe touches the bytes from …1FD0 to …1FD7. This skips over the guard page at …2000 (which extends to …2FFF).
>
> The other problem is that GCC does not take the implied stack probe by the call instruction into account, or other forms of linear stack access. This leads to grossly inefficient code with -fstack-check.
>
> I actually think that compiling with -fstack-check is a good idea, but GCC really needs fixes first. We do not know the root cause of these issues (note the 3 * 4096 probe offset on POWER), and just throwing random workarounds into glibc does not seem right.
>
> Thanks,
> Florian
Should -fstack-check also check for probe value underflow/overflow? Such cases should be
rare (if valid), but GCC has at least one bug report about it [1].
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66479