This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] pthread_once hangs when init routine throws an exception [BZ #18435]


On 07/08/2015 07:00 AM, Szabolcs Nagy wrote:
> On 06/07/15 18:08, Szabolcs Nagy wrote:
>> On 06/07/15 17:33, Szabolcs Nagy wrote:
>>> On 06/07/15 15:58, Adhemerval Zanella wrote:
>>>> On 06-07-2015 11:16, Martin Sebor wrote:
>>>>>> this broke
>>>>>>
>>>>>> nptl/tst-join5
>>>>>> nptl/tst-once3
>>>>>>
>>>>>> tests on aarch64.
>>>>>>
>>>>>> the cleanup handlers of the pthread_once and pthread_join
>>>>>> implementations don't run when the threads are canceled.
>>>>>
>>>>> I'll look into it as soon as I get access to an aarch64 machine.
>>>>>
>>>>> Martin
>>>>>
>>>>
>>>> And I see a regression with
>>>>
>>>> nptl/tst-once3
>>>>
>>>> for armhf.
>>>>
>>>
>>> in case of aarch64 the bug is somewhere in __pthread_unwind
>>> (called from __do_cancel) so probably a libgcc issue.
>>>
>>
>> the problem seems to be that gcc on x86_64 turns on
>> -fasynchronous-unwind-tables by default, but not on
>> aarch64 or arm.
>>
>> now i added -fasynchronous-unwind-tables to the cflags
>> of the relevant tests, will send a patch if they pass.
>>
> 
> This uncovered a serious issue that affects other archs too.

Thanks.

> Both test failures are caused by glibc switching the internal
> mechanism of pthread cancellation clean up handling to use
> __attribute__((cleanup(f))) and -fexceptions, but the two test
> failures are independent:
> 
> (1) Should -fasynchronous-unwind-tables be on by default in gcc?
> 
> nptl/tst-once3 fails because the callback passed to pthread_once
> now has to be compiled with -fasynchronous-unwind-tables which
> is not on by default on arm and aarch64 gcc.  So does glibc
> expect the users to use this flag correctly or does glibc
> require the compiler to have it on by default?

This is bad.

> (My understanding: posix conforming c code cannot observe the
> presence of -fasynchronous-unwind-tables without invoking UB, but
> the glibc implementation of cancellation cleanup and backtrace
> from signal handlers makes this detail observable.  Any function
> which may be canceled needs this flag to make cleanup work, so
> glibc seems to impose this as a requirement on the compiler: the
> user may not be in control of all the code that may be canceled).
 
We already require that all code called in a cancelable context be
cancel-safe, and it might not be unless all of it uses cancellation
handlers to clean up during cancellation. This would be another
requirement, imposing -fasynchronous-unwind-tables on cancellation
users. However, it is a new requirement and old code can't be fixed,
so we have a problem that requires versioning and documentation.
All for the purpose of implementing C++ std::call_once via
pthread_once, which looks like it is going to be problematic.
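
For reference, the scenario behind the tst-once3 failure looks roughly
like the sketch below. This is not the actual test source: the names
(once_cancel_demo, init_routine, entered, init_calls) are made up for
illustration. A thread is canceled while blocked inside the init
routine, and a later pthread_once call on the same control variable
should run the init routine again. That only happens if the cleanup
handler inside pthread_once runs during cancellation.

```c
#include <pthread.h>
#include <semaphore.h>
#include <unistd.h>

/* All names below are hypothetical; this is not the glibc test. */
static pthread_once_t demo_once = PTHREAD_ONCE_INIT;
static sem_t entered;
static int init_calls;

/* First call: announce entry, then block in pause(), which is a
   cancellation point.  Later calls return immediately. */
static void init_routine(void)
{
    if (++init_calls == 1) {
        sem_post(&entered);
        for (;;)
            pause();
    }
}

static void *once_worker(void *arg)
{
    (void)arg;
    pthread_once(&demo_once, init_routine);
    return NULL;
}

/* Returns how many times init_routine ran, or -1 on a setup error. */
int once_cancel_demo(void)
{
    pthread_t th;
    void *res;

    if (sem_init(&entered, 0, 0) != 0)
        return -1;
    if (pthread_create(&th, NULL, once_worker, NULL) != 0)
        return -1;
    sem_wait(&entered);              /* worker is inside init_routine */
    pthread_cancel(th);
    if (pthread_join(th, &res) != 0 || res != PTHREAD_CANCELED)
        return -1;
    /* If the cleanup handler did not reset demo_once, this call hangs. */
    pthread_once(&demo_once, init_routine);
    return init_calls;
}
```

Built with -pthread, once_cancel_demo() should return 2 when the
cleanup handler runs. When the handler is silently skipped, the second
pthread_once call blocks forever instead.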
 
> (2) Should gcc support exceptions from async signal handlers?

No. I don't think you can support it safely.

> nptl/tst-join5 failure is more problematic: it fails because gcc
> does not seem to implement -fexceptions with the assumption that
> signal handlers can throw, in particular it assumes inline asm
> does not throw exceptions.  If the syscall that is a cancellation
> point appears between pthread_cleanup_push and pthread_cleanup_pop
> in glibc internal code, the cleanup handler may not get run on
> cancellation depending on where gcc moved the syscall inline asm.
> (It is free to move it outside the code range that is marked for
> exception handling, this is what happens on aarch64 in pthread_join).
> This affects all archs, but some may get lucky.

Ah! That's truly a terrible scenario.
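
To make the shape of the problem concrete, here is a minimal sketch of
a cancellation point sitting between pthread_cleanup_push and
pthread_cleanup_pop. It uses the public macros from user code with
made-up names (cleanup_demo, mark_cleanup, in_region), so it only
illustrates the region in question. It is not glibc's internal
pthread_join code and it does not reproduce the aarch64 code motion.

```c
#include <pthread.h>
#include <semaphore.h>
#include <unistd.h>

/* All names below are hypothetical, for illustration only. */
static sem_t in_region;
static int cleanup_ran;

/* Cleanup handler: record that it was invoked. */
static void mark_cleanup(void *arg)
{
    *(int *)arg = 1;
}

static void *blocking_worker(void *arg)
{
    (void)arg;
    pthread_cleanup_push(mark_cleanup, &cleanup_ran);
    sem_post(&in_region);
    for (;;)
        pause();          /* cancellation point inside the push/pop pair */
    pthread_cleanup_pop(0);
    return NULL;
}

/* Returns 1 if the handler ran on cancellation, 0 if it was skipped,
   or -1 on a setup error. */
int cleanup_demo(void)
{
    pthread_t th;
    void *res;

    if (sem_init(&in_region, 0, 0) != 0)
        return -1;
    if (pthread_create(&th, NULL, blocking_worker, NULL) != 0)
        return -1;
    sem_wait(&in_region);            /* handler is registered by now */
    pthread_cancel(th);
    if (pthread_join(th, &res) != 0 || res != PTHREAD_CANCELED)
        return -1;
    return cleanup_ran;
}
```

The expected result is 1. The failure described above corresponds to
the handler never being invoked because the blocking call ended up
outside the range covered by the exception tables.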

> (My understanding: gcc must be very strict about how it marks
> the code range for exception handling and assume any instruction
> may throw if it wants -fexceptions -fasynchronous-unwind-tables to
> work from signal handlers.  Current compilers do not seem to support
> this so glibc internal code should not rely on it, which means the
> cancellation mechanism should not rely on exception handling at
> least not when the exception is thrown from the cancel signal
> handler.  I think the gnu toolchain should not try to make pthread
> cancellation interoperate with C++ exceptions nor to make
> exceptions work from signal handlers: no standard requires this
> behaviour and it seems to cause problems).

No, we just need to revert this patch and have C++ implement
std::call_once by itself.
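
As a rough illustration of what a once primitive that does not depend
on pthread_once could look like, here is a sketch in C. It is not the
libstdc++ implementation, and every name in it (retry_once_t,
retry_once, flaky_init, retry_once_demo) is hypothetical. Since C has
no exceptions, a nonzero return from the init function stands in for a
thrown exception: the flag stays clear and a later caller retries.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical type and functions; not glibc or libstdc++ API. */
typedef struct {
    pthread_mutex_t lock;
    bool done;
} retry_once_t;

#define RETRY_ONCE_INIT { PTHREAD_MUTEX_INITIALIZER, false }

/* Runs init() under the lock unless an earlier call already succeeded.
   A nonzero return from init() models an exceptional exit: the flag is
   left clear so that the next caller runs init() again. */
static int retry_once(retry_once_t *once, int (*init)(void))
{
    int rc = 0;

    pthread_mutex_lock(&once->lock);
    if (!once->done) {
        rc = init();
        if (rc == 0)
            once->done = true;
    }
    pthread_mutex_unlock(&once->lock);
    return rc;
}

static int attempts;

/* Fails on the first attempt, succeeds afterwards. */
static int flaky_init(void)
{
    return ++attempts == 1 ? -1 : 0;
}

/* Returns the number of times flaky_init was actually called. */
int retry_once_demo(void)
{
    static retry_once_t once = RETRY_ONCE_INIT;

    retry_once(&once, flaky_init);   /* fails, flag stays clear */
    retry_once(&once, flaky_init);   /* retried, succeeds */
    retry_once(&once, flaky_init);   /* already done, init not called */
    return attempts;
}
```

retry_once_demo() returns 2. The sketch holds the mutex while init()
runs, so a recursive call on the same control object would deadlock,
and it makes no attempt to handle cancellation of the init function.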
 
> Both issues cause silent omission of cleanup handlers running
> on cancellation, leaving libc internal state inconsistent.
> 
> The second issue may be worth discussing on the gcc list.
> 

Cheers,
Carlos.

