This is the mail archive of the
libc-help@sourceware.org
mailing list for the glibc project.
Re: Question about tst-stack4
- From: Szabolcs Nagy <szabolcs dot nagy at arm dot com>
- To: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>, libc-help at sourceware dot org
- Cc: nd at arm dot com
- Date: Wed, 16 Aug 2017 16:36:15 +0100
- Subject: Re: Question about tst-stack4
- References: <1502831028.3962.208.camel@cavium.com> <53bf7be9-b339-fc62-7585-76312a5e5bad@linaro.org>
On 16/08/17 14:08, Adhemerval Zanella wrote:
> On 15/08/2017 18:03, Steve Ellcey wrote:
>> I have a question about nptl/tst-stack4. In the 2.25 release notes,
>> this shows up as a failure on a number of platforms. One references a
>> defect report (https://sourceware.org/bugzilla/show_bug.cgi?id=19329)
>> and the others all reference a glibc checkin that caused the failure:
>>
>> commit 17af5da98cd2c9ec958421ae2108f877e0945451
>> Author: Alexandre Oliva <aoliva@redhat.com>
>> Date: Wed Sep 21 22:01:16 2016 -0300
>>
>> [PR19826] fix non-LE TLS in static programs
>>
this introduced a regression that got fixed in
commit d675eaf7d99096a952c1d140abfed82c939fb259
Author: Alexandre Oliva <aoliva@redhat.com>
AuthorDate: 2017-02-03 20:35:16 -0500
Commit: Carlos O'Donell <carlos@systemhalted.org>
CommitDate: 2017-02-03 21:34:14 -0500
Bug 20915: Do not initialize DTV of other threads.
so that commit should not be an issue any more
(but it might have made a data race easier to trigger).
>>
>> In the 2.26 release notes the platforms where it fails all just
>> mention a timeout.
>>
>> Currently, with top-of-tree sources this test fails for me on
>> aarch64 with the new ILP32 ABI but passes with the LP64 ABI.
>> The failure I see in nptl/tst-stack4.out is:
>>
>> Didn't expect signal from child: got `Segmentation fault'
>>
>> Is this the error that the other failing platforms are seeing?
>> (ARM, Microblaze, MIPS, Nios II, PowerPC [32 bit soft-float])
>>
>> Are there any other platforms that are getting a Segfault?
>> Given how often this shows up in other platforms' failing lists,
>> I am not sure if there is an aarch64 ILP32-specific issue here
>> or a more generic bug that has been around for a while.
>
> My understanding is that this is a long-standing GLIBC issue that
> started with BZ#17918 and is now being tracked by BZ#19329,
> as you indicated. And since it is a race issue, the failure
> is not consistent and relies on some factors (load, memory
> consistency, etc).
>
> I have seen this on ppc64le, powerpc32, and on i686 as well
> (on some heavily loaded machines). I also think that this is not
> an ILP32-specific issue, and although I do not recall it failing on
> AArch64 LP64, I think it can potentially fail there as well.
>
> Szabolcs Nagy may have more information about the current status
> of the issue, since he was the one trying to fix it.
>
yes, aarch64 lp64 sometimes fails too.
this test may hit a number of race conditions
between dlopen and pthread_create; on lp64 i see:
  Inconsistency detected by ld.so: dl-tls.c: 488: _dl_allocate_tls_init:
  Assertion `listp->slotinfo[cnt].gen <= GL(dl_tls_generation)' failed!
i think segfault is also possible on accessing
listp->next, but if it's consistently a segfault
that may be a different issue.
i had a test that exercises the race more,
you may try to run that or something similar
to see if it fails the same way:
https://sourceware.org/ml/libc-alpha/2016-12/msg00456.html
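
for illustration only, here is a minimal sketch of "something similar":
one thread repeatedly calls dlopen/dlclose while the calling thread keeps
creating and joining threads. this is not the test from the link above and
not tst-stack4 itself. the names run_dlopen_create_race, loader_thread,
noop_thread and RACE_DEMO_MAIN are made up for this example, and the
default library "libm.so.6" is only a placeholder: to exercise the TLS
slotinfo paths discussed here you would presumably need to pass a module
that defines TLS variables.

```c
#include <assert.h>
#include <dlfcn.h>
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>

/* Arguments for the (hypothetical) loader thread. */
struct loader_args {
    const char *lib;  /* shared object to load and unload */
    int rounds;       /* number of dlopen/dlclose cycles */
    int loaded;       /* successful dlopen calls, read after join */
};

/* Repeatedly load and unload the library. */
static void *loader_thread(void *p)
{
    struct loader_args *a = p;
    for (int i = 0; i < a->rounds; i++) {
        void *h = dlopen(a->lib, RTLD_NOW);
        if (h != NULL) {
            a->loaded++;
            dlclose(h);
        }
    }
    return NULL;
}

/* Thread body that does nothing; only thread creation matters here. */
static void *noop_thread(void *p)
{
    return p;
}

/* Run dlopen/dlclose in one thread while the caller creates and joins
   `rounds` threads.  Returns the number of threads created and joined,
   or -1 if the loader thread could not be started.  If loads_out is
   not NULL it receives the number of successful dlopen calls. */
int run_dlopen_create_race(const char *lib, int rounds, int *loads_out)
{
    assert(lib != NULL && rounds >= 0);

    struct loader_args a = { lib, rounds, 0 };
    pthread_t loader;
    int created = 0;

    if (pthread_create(&loader, NULL, loader_thread, &a) != 0)
        return -1;

    for (int i = 0; i < rounds; i++) {
        pthread_t t;
        if (pthread_create(&t, NULL, noop_thread, NULL) == 0) {
            pthread_join(t, NULL);
            created++;
        }
    }

    pthread_join(loader, NULL);
    if (loads_out != NULL)
        *loads_out = a.loaded;
    return created;
}

#ifdef RACE_DEMO_MAIN
/* Standalone driver: pass the path of a module as the first argument. */
int main(int argc, char **argv)
{
    const char *lib = argc > 1 ? argv[1] : "libm.so.6";
    int loads = 0;
    int created = run_dlopen_create_race(lib, 1000, &loads);
    printf("threads created: %d, successful dlopens: %d\n", created, loads);
    return created < 0;
}
#endif
```

build it with -DRACE_DEMO_MAIN to get a standalone program (on glibc
older than 2.34 also link with -lpthread -ldl). a clean run does not
prove the race is absent; it may need many rounds and a loaded machine
to show up, if it shows up at all.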