This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] nptl: Fix racy pipe closing in tst-cancel{20,21}
- From: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- To: GNU C Library <libc-alpha at sourceware dot org>
- Date: Mon, 19 Oct 2015 17:01:53 -0200
- Subject: Re: [PATCH] nptl: Fix racy pipe closing in tst-cancel{20,21}
- Authentication-results: sourceware.org; auth=none
- References: <1444849395-18800-1-git-send-email-adhemerval dot zanella at linaro dot com> <20151014195832 dot GY8645 at brightrain dot aerifal dot cx>
Ping.
On 14-10-2015 16:58, Rich Felker wrote:
> On Wed, Oct 14, 2015 at 04:03:15PM -0300, Adhemerval Zanella wrote:
>> The tst-cancel20 open two pipes and creates a thread which blocks
>> reading the first pipe. It then issues a signal to activate an
>> handler which also blocks reading the second pipe. Finally the
>> cancellation cleanup-up handlers are tested by first closing the
>> all the pipe ends and issuing a pthread_cancel. The tst-cancel21
>> have a similar behavior, but use an extra fork after the test itself.
>>
>> The race condition occurs if the cancellation handling acts after the
>> pipe close: in this case read will return EOF (indicating side-effects)
>> and thus the cancellation must not act. However current GLIBC
>> cancellation behavior acts regardless the syscalls returns with
>> sid-effects.
>>
>> This patch adjust the test by moving the pipe closing after the
>> cancellation handling. This avoid spurious cancellation for the
>> case described.
>>
>> Checked on x86_64 and i386.
>
> I was involved in the discussion of this and believe that the fix is
> correct. The only reason the tests "worked" before was that
> cancellation was wrongly being acted upon after read succeeded in
> reading EOF.
>
> Note that, with this change, the tests will now timeout if read fails
> to act on cancellation, rather than exiting with a reportable error.
> This could be fixed with some very complicated machinery involving an
> additional signal handler and AS-safe synchronization mechanisms to
> control the ordering of close with respect to interruption of read,
> but as long as timeout is an acceptable way of detecting test failure,
> I see no reason to complicate the test logic like that.
>
> Rich
>