This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [RFC] Propose fix for race conditions in pthread cancellation (bz#12683)
- From: Adhemerval Zanella <azanella at linux dot vnet dot ibm dot com>
- To: libc-alpha at sourceware dot org
- Date: Wed, 10 Sep 2014 19:11:27 -0300
- Subject: Re: [RFC] Propose fix for race conditions in pthread cancellation (bz#12683)
- Authentication-results: sourceware.org; auth=none
- References: <5410C70E dot 70207 at linux dot vnet dot ibm dot com> <20140910220042 dot GL23797 at brightrain dot aerifal dot cx>
On 10-09-2014 19:00, Rich Felker wrote:
> On Wed, Sep 10, 2014 at 06:47:58PM -0300, Adhemerval Zanella wrote:
>> Hi all,
>>
>> I have summarized in [1] the current issues with the GLIBC pthread cancellation system,
>> the current GLIBC implementation, and the solution proposed by Rich Felker, with the
>> adjustments required to enable it in GLIBC.
>>
>> It is still heavily WIP and I'm still planning to add more content, so any questions,
>> comments, or advice are welcome.
>>
>> The GLIBC adjustment to the proposed solution is in fact the work I'm currently doing to
>> rewrite the pthread cancellation subsystem [2]. My code still needs a *lot* of cleanup,
>> but initial results are promising. It builds on both powerpc64 and x86_64
>> (it won't build on other platforms, basically because I rewrote the way cancelable
>> syscalls are done).
>>
>> All current NPTL testcases pass except:
>>
>> FAIL: nptl/tst-cancel-wrappers
>> FAIL: nptl/tst-cancel20
>> FAIL: nptl/tst-cancel21-static
>> FAIL: nptl/tst-cancel4
>> FAIL: nptl/tst-cancel5
>> FAIL: nptl/tst-cancelx20
>> FAIL: nptl/tst-cancelx21
>> FAIL: nptl/tst-cancelx4
>> FAIL: nptl/tst-cancelx5
>> FAIL: nptl/tst-detach1
>>
>> The 'nptl/tst-cancel-wrappers' failure is expected since I got rid of the
>> enable_asynccancel/disable_asynccancel functions, but the others are due to the fact that
>> cancellation now *will not* be acted upon in one important case:
>>
>> * syscall is blocked but with some side effects already having taken place (for
>> instance partial read/write/send/etc.)
> It's important that cancellation NOT be acted upon in these cases. The
> side effects for them are not equivalent to EINTR (EINTR is only
> allowed when no data was transferred) and thus acting on cancellation
> would violate the rule that the side effects on cancellation must be
> as if the call terminated with EINTR.
>
> It is desirable that the partial read/write immediately return in this
> case, rather than sitting around waiting for more data to be
> transferred, and unless you go out of your way to get a different
> behavior, that should come for free with most natural implementations
> anyway. Then cancellation of course remains pending and will be acted
> upon as soon as a cancellation point is called again. The important
> thing is that the application has now had the ability to record what
> side effects were completed, and which ones remain incomplete, so that
> it has a consistent state when cancellation is acted upon.
I agree that cancellation should not be acted upon in the cases described. My
plan is in fact to adjust the testcases to check for a partial read and then call
pthread_testcancel to act on any pending cancellation.
However, this changes the behavior GLIBC currently expects (which I do think is not
correct given the issues described), so I would like to know whether the maintainers
consider it reasonable to change this behavior.
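
To make the intended testcase pattern concrete, here is a minimal sketch of the
"record the partial read, then call pthread_testcancel" idea. It is not taken from
the nptl tests: struct demo_state, worker, and run_partial_read_demo are hypothetical
names, and the busy-wait handshake is only a device that guarantees the cancellation
request is already pending when pthread_testcancel runs. It does not reproduce a
cancel arriving while the thread is blocked inside the syscall.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <unistd.h>

/* Hypothetical shared state for this sketch; not part of glibc or its tests. */
struct demo_state {
    int read_fd;                /* read end of the pipe */
    atomic_long bytes_read;     /* side effect recorded by the worker */
    atomic_int progress_saved;  /* worker -> main: progress has been recorded */
    atomic_int cancel_sent;     /* main -> worker: pthread_cancel was called */
};

static void *worker(void *arg)
{
    struct demo_state *st = arg;
    char buf[16];

    /* Ask for 16 bytes; only 4 are in the pipe, so read returns a short count.
       No cancellation request is pending yet, so the call completes normally. */
    ssize_t n = read(st->read_fd, buf, sizeof buf);
    assert(n > 0);  /* the pipe was pre-filled, so this read cannot fail here */

    /* Record which side effects completed before cancellation can be acted upon. */
    atomic_store(&st->bytes_read, (long)n);
    atomic_store(&st->progress_saved, 1);

    /* Wait for the cancel request without passing through a cancellation point
       (sched_yield and atomic loads are not cancellation points). */
    while (!atomic_load(&st->cancel_sent))
        sched_yield();

    /* The pending request is acted upon here, with the state already consistent. */
    pthread_testcancel();
    return NULL;  /* not reached when a cancellation request is pending */
}

/* Returns 0 if the worker was cancelled after recording its 4-byte partial read. */
int run_partial_read_demo(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    if (write(fds[1], "abcd", 4) != 4)
        return -1;

    struct demo_state st = { .read_fd = fds[0] };
    pthread_t t;
    if (pthread_create(&t, NULL, worker, &st) != 0)
        return -1;

    /* Cancel only after the worker has recorded its progress. */
    while (!atomic_load(&st.progress_saved))
        sched_yield();
    pthread_cancel(t);
    atomic_store(&st.cancel_sent, 1);

    void *res = NULL;
    pthread_join(t, &res);
    close(fds[0]);
    close(fds[1]);

    return (res == PTHREAD_CANCELED && atomic_load(&st.bytes_read) == 4) ? 0 : 1;
}
```

Calling run_partial_read_demo() from a test's main and checking for a 0 return would
confirm both that the byte count was recorded and that the thread ended with
PTHREAD_CANCELED.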
>
>> This is the case for tst-cancel[4/5], which check cancelable write and send:
>> the way the test is coded, the kernel IP address seen from the signal handler is *after*
>> the syscall, indicating a partial read/send. A similar case occurs for tst-cancel[20|21],
>> where the read returns after the syscall in pipe reading. I'm still checking
>> nptl/tst-detach1.
> Yes, that's exactly how it's supposed to work.
>
> Rich
>