This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: 2.25 freeze status
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: "Carlos O'Donell" <carlos at redhat dot com>
- Cc: "libc-alpha at sourceware dot org" <libc-alpha at sourceware dot org>, Florian Weimer <fweimer at redhat dot com>, Phil Blundell <pb at pbcl dot net>, Siddhesh Poyarekar <siddhesh at gotplt dot org>
- Date: Wed, 1 Feb 2017 09:04:05 -0800
- Subject: Re: 2.25 freeze status
On Tue, Jan 31, 2017 at 12:45 PM, Carlos O'Donell <carlos@redhat.com> wrote:
> On 01/31/2017 02:02 PM, H.J. Lu wrote:
>> On Tue, Jan 31, 2017 at 10:59 AM, Carlos O'Donell <carlos@redhat.com> wrote:
>>> On 01/31/2017 10:57 AM, H.J. Lu wrote:
>>>> On Mon, Jan 30, 2017 at 1:39 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>>> On Mon, Jan 30, 2017 at 12:41 PM, Carlos O'Donell <carlos@redhat.com> wrote:
>>>>>> On 01/30/2017 03:38 PM, H.J. Lu wrote:
>>>>>>> On Mon, Jan 30, 2017 at 12:22 PM, Carlos O'Donell <carlos@redhat.com> wrote:
>>>>>>>> On 01/30/2017 02:39 PM, H.J. Lu wrote:
>>>>>>>>> On Mon, Jan 30, 2017 at 11:17 AM, Carlos O'Donell <carlos@redhat.com> wrote:
>>>>>>>>>> On 01/30/2017 02:04 PM, H.J. Lu wrote:
>>>>>>>>>>>> H.J.,
>>>>>>>>>>>>
>>>>>>>>>>>> Could you please back out the fix for bug 20019?
>>>>>>>>>>>>
>>>>>>>>>>>> We will continue to try and fix this in 2.26 with a solution that moves
>>>>>>>>>>>> IFUNC design towards a better documented set of semantics.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Since calling longjmp will segfault in this case, shouldn't it be
>>>>>>>>>>> fixed first by reverting IFUNC implementation in libpthread?
>>>>>>>>>>
>>>>>>>>>> I believe there is insufficient time to test such a change and verify that
>>>>>>>>>> it does not have other unintended consequences from changing a symbol from
>>>>>>>>>> IFUNC to non-IFUNC.
>>>>>>>>>>
>>>>>>>>>> The minimal fix is to revert the changes for bug 20019, and allow programs
>>>>>>>>>> to start up and run as expected in the cases where they do not call longjmp.
>>>>>>>>>>
>>>>>>>>>> I would like to see the minimum amount of reversion required to get us
>>>>>>>>>> back to a state where applications run again.
>>>>>>>>>>
>>>>>>>>>> We have only a few days until the release deadline and I do not wish
>>>>>>>>>> to extend that date.
>>>>>>>>>>
>>>>>>>>>> I understand your desire to fix this correctly, and we will continue this
>>>>>>>>>> discussion once master reopens, possibly with a reversion of the IFUNC
>>>>>>>>>> change to libpthread.
>>>>>>>>>
>>>>>>>>> I don't think knowingly allowing a program to segfault at random without any
>>>>>>>>> warning is appropriate. Can't we turn the fatal error into a non-fatal warning?
>>>>>>>>
>>>>>>>> What is or is not appropriate right now must be in the context of the upcoming
>>>>>>>> release.
>>>>>>>>
>>>>>>>> The reversal of the patch is the simplest and most conservative move which
>>>>>>>> restores the behaviour that allows programs to start.
>>>>>>>>
>>>>>>>
>>>>>>> I am not against allowing the bad programs to start. But silently allowing the
>>>>>>> bad programs to crash at random isn't a conservative fix to me.
>>>>>>
>>>>>> It isn't a fix at all. We have run out of time to address the issue, and for
>>>>>> the upcoming 2.25 release it would be better that the applications continue
>>>>>> with their existing behaviour, rather than new behaviour that we know we
>>>>>> have to change again.
>>>>>
>>>>> I have a couple of questions:
>>>>>
>>>>> 1. Will this change ever be in 2.26?
>>>>> 2. Will this change be backported to 2.24?
>>>>>
>>>>
>>>> I don't think we should change anything for 2.25. We know that when the
>>>> function is called, the program will crash. If the programmer is confident that
>>>> the function won't be called, he/she can create a private copy which calls
>>>> abort. The program will start and abort if the function is called.
>>>
>>> I disagree, and the consensus so far appears to be to revert your patch until we
>>> can resolve this discussion.
>>>
>>> Is your position an objecting position? Would you object to the reversion of
>>> your patch for 2.25 until we can work out a better solution?
>>>
>>> In favor of reverting:
>>> Carlos O'Donell, Florian Weimer, Phil Blundell.
>>>
>>> Not in favor of reverting:
>>> H.J. Lu (non-objecting?)
>>>
>>
>> I object.
>
> https://sourceware.org/glibc/wiki/Consensus
>
> Let me ask again:
>
> Do you have a sustained (consensus-blocking) objection to reverting
> the patch for bug 20019 for the glibc 2.25 release which is scheduled to
> go out in 24 hours?
>
> Consensus need not imply unanimity.
>
> If your objection is non-sustained then it will be noted that you objected,
> but consensus carries that the patch should be reverted. And we continue to
> work on the issue _after_ the release.
>
> If your objection is sustained then we need to work through why you have a
> sustained objection.
>
> Success Criteria:
>
> * Release of glibc 2.25 without the side effect caused by the fix for 20019
> which prevents potentially valid applications from starting.
>
> Technical Problems:
>
> * Allow certain applications that use libpthread and longjmp to operate
> correctly.
>
> Solutions:
>
> (a) Revert the changes to libpthread which introduced the longjmp IFUNC.
>
> (b) Revert the fix for bug 20019 which stops the affected applications from
> starting.
It just silently ignores the potential crash when longjmp is called. I won't
call it a solution.
> (c) Implement IFUNC relocation ordering such that the applications work
> correctly in the presence of the libpthread longjmp IFUNC.
>
> Florian Weimer has stated that (c) is not ready for glibc 2.25 release which
> is tomorrow.
(d) Remove IFUNC from libpthread.so. The requirement that the symbol
definition at run time must come from the same shared object as at link time
is questionable.
> I argue that, given that examples exist of programs using libpthread and longjmp
> which would now fail to start when in the past they operated fine, the fix for
> bug 20019 is not sufficiently conservative for a project such as glibc. And so I
> propose we adopt (b) and wait for (c) (along with a discussion of our design goals
> around IFUNC and what we do want to support).
>
> I also argue that (a) has unknown risks that we have not evaluated and will not
> easily be evaluated before the release date of 2017-02-01 (24 hours from now).
> I argue we don't have time to add back the longjmp code to libpthread and audit
> that (yes I'm spending time here arguing this).
>
> H.J., Could you please explain your position and why you object to (b)?
>
> Your present position as I understand it is:
>
> * It is safer that the user application does not start given the IFUNC
> defect in libpthread relocation handling order.
>
> * If a user really wants their application to start they should preload
> a DSO which provides a working longjmp.
>
> I find this position not sufficiently conservative for glibc.
>
> While failing safe appears to be a conservative position, you can't know if the
> entire system fails safe because this one application fails to start. We can't
> argue a fail safe position without knowing a broader context.
>
> Therefore I continue to express my opinion that we should revert the fix for bug
> 20019 and continue with the implementation as it is for 2.25, and look for a new fix
> that actually resolves the problem by (1) refining the IFUNC design and
> (2) implementing the required relocation ordering to support #1.
>
> Given this explanation do you still have a sustained objection to (b) above?
>
> --
> Cheers,
> Carlos.
--
H.J.
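
Editorial sketch (not part of the original message): the thread contrasts three ways of handling a reference to an IFUNC symbol, such as longjmp in libpthread, whose defining object has not yet been relocated: refuse to start (the behaviour of the fix for bug 20019), start with a non-fatal warning (H.J.'s suggestion), or start silently (the proposed revert). The toy model below only illustrates those three outcomes. It is not glibc's dynamic loader, and every name in it (bind_symbol, the policy and result enums, the struct types, the message texts) is hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model only: none of these names exist in glibc. */
enum ifunc_policy { POLICY_IGNORE, POLICY_WARN, POLICY_FATAL };
enum bind_result { BIND_OK, BIND_UNSAFE, BIND_REFUSED };

struct object_model {
    const char *name;
    bool relocated;  /* have this object's own relocations been processed? */
};

struct symbol_model {
    const char *name;
    bool is_ifunc;                      /* resolved by running a resolver */
    const struct object_model *definer; /* object that defines the symbol */
};

/* Decide what happens when a reference to sym is bound.  An IFUNC whose
   defining object is not yet relocated is the problematic case: in this
   model its resolver cannot be trusted to run correctly. */
enum bind_result bind_symbol(const struct symbol_model *sym,
                             enum ifunc_policy policy)
{
    if (!sym->is_ifunc || sym->definer->relocated)
        return BIND_OK;

    switch (policy) {
    case POLICY_FATAL:
        /* Models the fix for bug 20019: the program does not start. */
        fprintf(stderr, "fatal: IFUNC %s in %s is not relocated yet\n",
                sym->name, sym->definer->name);
        return BIND_REFUSED;
    case POLICY_WARN:
        /* Models the proposed non-fatal warning: the program starts. */
        fprintf(stderr, "warning: IFUNC %s in %s may not work\n",
                sym->name, sym->definer->name);
        return BIND_UNSAFE;
    case POLICY_IGNORE:
    default:
        /* Models the revert: the program starts with no diagnostic. */
        return BIND_UNSAFE;
    }
}

/* The program starts unless binding was refused. */
bool program_starts(enum bind_result r) { return r != BIND_REFUSED; }

/* Calling the symbol is safe only if binding was fully successful. */
bool call_is_safe(enum bind_result r) { return r == BIND_OK; }
```

In this model, option (c) from the thread corresponds to ensuring that the definer's relocated flag is already true before bind_symbol is reached, so that every policy returns BIND_OK.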