This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: glibc 2.23 --- Starting soft/slush freeze
- From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
- To: Torvald Riegel <triegel@redhat.com>
- Cc: GNU C Library <libc-alpha@sourceware.org>, Carlos O'Donell <codonell@redhat.com>, David Miller <davem@davemloft.net>
- Date: Wed, 13 Jan 2016 12:05:58 -0200
- Subject: Re: glibc 2.23 --- Starting soft/slush freeze
- Authentication-results: sourceware.org; auth=none
- References: <5694161C.1010200@linaro.org> <1452601558.26597.282.camel@localhost.localdomain> <5694F34D.8070301@linaro.org> <1452610148.26597.309.camel@localhost.localdomain> <56954792.2030601@linaro.org> <1452631627.26597.326.camel@localhost.localdomain>
On 12-01-2016 18:47, Torvald Riegel wrote:
> On Tue, 2016-01-12 at 16:36 -0200, Adhemerval Zanella wrote:
>>
>> On 12-01-2016 12:49, Torvald Riegel wrote:
>>> On Tue, 2016-01-12 at 10:36 -0200, Adhemerval Zanella wrote:
>>>>
>>>> On 12-01-2016 10:25, Torvald Riegel wrote:
>>>> The only bit I am not sure about is sparc, which you said might
>>>> break with the new implementation. Do you think we can get the sparc
>>>> parts done and tested by the end of the week?
>>>
>>> The only thing that makes sparc different is that pre-v9 sparc32 has no
>>> real support for atomics. The atomic operations we have are supported,
>>> but CAS or other read-modify-write operations are only atomic within a
>>> process (the hardware only has a test-and-set, so things need to be
>>> lock-based essentially).
>>>
>>> For all but pre-v9 sparc, I'd just use the new barrier implementation.
>>> For pre-v9 sparc, I could try to hack up a version that uses an internal
>>> lock for all the read-modify-write atomic ops used by the new barrier
>>> implementation. But I can't test this.
>>>
>>> Dave, could you take care of adapting to pre-v9 sparc?
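
(For reference, a minimal sketch of the lock-based emulation described
above, in generic C using GCC builtins rather than the actual glibc sparc
code; on real pre-v9 hardware the lock would be taken with ldstub:)

#include <stdbool.h>

/* Sketch only: a compare-and-swap emulated with a spinlock built on
   test-and-set, which is all pre-v9 sparc provides.  Because the lock
   lives in the process's own memory, the resulting "CAS" is atomic
   only within that process, which is why process-shared synchronization
   cannot be implemented this way.  */

static char cas_lock;

static bool
emulated_compare_and_swap (int *mem, int oldval, int newval)
{
  bool swapped;

  /* Spin until the lock is free; __atomic_test_and_set stands in for
     an ldstub-based lock here.  */
  while (__atomic_test_and_set (&cas_lock, __ATOMIC_ACQUIRE))
    ;

  swapped = (*mem == oldval);
  if (swapped)
    *mem = newval;

  __atomic_clear (&cas_lock, __ATOMIC_RELEASE);
  return swapped;
}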
>>
>> Right, I also do not feel compelled to delay this feature because a very
>> specific architecture lacks support. I recall Dave saying he only builds
>> and tests sparc near the release date, so that gives him roughly a
>> month to either help adjust the code for pre-v9 sparc or help you with
>> testing. Any objections?
>
> No objections from me. What would you prefer for the patch?
> (1) Remove all existing sparc code. This effectively breaks pre-v9
> process-shared barriers in the sense of not working at runtime.
> (2) Don't touch sparc at all. I believe this breaks builds on sparc.
> (3) Try the approach outlined below but leave any testing to Dave.
I would prefer both to remove all the existing sparc code, since it is
not correct w.r.t. the new implementation, and to issue an explicit build
error on pre-v9 saying this is not supported at runtime, to avoid silent
breakage.
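
Something along these lines in the sparc32 bits would be enough (a sketch
only; the exact location and the __sparc_v9__/__arch64__ macro test are
assumptions):

/* Sketch: turn the lack of process-shared barrier support on pre-v9
   sparc into a hard build error instead of a silent runtime break.  */
#if defined __sparc__ && !defined __sparc_v9__ && !defined __arch64__
# error "process-shared barriers are not supported on pre-v9 sparc"
#endif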
>
>>>
>>> It also feels like we might need a better process to handle such pre-v9
>>> sparc issues. It comes up whenever we change concurrent code that is
>>> supposed to work in a process-shared setting (it will come up for
>>> condvars again). Any suggestions?
>>>
>>> I think one relatively clean way to support pre-v9 sparc on
>>> process-shared would be if we could set/unset a lock that is to be used for
>>> subsequently executed atomic operations. The set/unset would be a noop
>>> for all but pre-v9 sparc. On pre-v9 sparc, we could put this address
>>> into a thread-local variable, and have the atomic RMW ops check whether
>>> such a lock is set or not; if it is, it will be used instead of one of
>>> the 64 per-process locks used for non-process-shared atomics.
>>> This way, we probably wouldn't need custom pre-v9 sparc code anymore.
>>> Performance may be somewhat slower, but it will be slower anyway due to
>>> having to use a lock in the first place instead of using a comparatively
>>> light-weight native atomic operation.
>>> Thoughts?
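
A rough sketch of that scheme, with purely hypothetical names (none of
this is existing glibc code): process-shared code registers, through a
thread-local pointer, a lock that lives in the shared mapping, and the
lock-based atomic RMW operations prefer it over the per-process hashed
locks.

#include <stdint.h>

#define ATOMIC_NLOCKS 64
static int atomic_process_locks[ATOMIC_NLOCKS];

/* NULL unless a process-shared lock has been registered by the caller.  */
static __thread int *atomic_pshared_lock;

/* On anything but pre-v9 sparc these two would compile to nothing.  */
static inline void
atomic_pshared_lock_set (int *lock)
{
  atomic_pshared_lock = lock;
}

static inline void
atomic_pshared_lock_unset (void)
{
  atomic_pshared_lock = NULL;
}

/* Pick the lock a lock-based atomic RMW operation on MEM should use:
   the registered process-shared lock if one is set, otherwise one of
   the 64 per-process locks selected by hashing the address.  */
static inline int *
atomic_lock_for (void *mem)
{
  if (atomic_pshared_lock != NULL)
    return atomic_pshared_lock;
  return &atomic_process_locks[((uintptr_t) mem >> 4) % ATOMIC_NLOCKS];
}

A lock-based CAS or fetch-and-add would then take *atomic_lock_for (mem)
around its read-modify-write, and a process-shared barrier would bracket
its atomic accesses with the set/unset calls, pointing them at a lock
placed in the same shared mapping as the barrier.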
>>
>> This seems a reasonable approach, especially the part about getting rid
>> of the custom pre-v9 sparc code.
>
> OK.
>
>> I would also aim first for conformance and code simplicity,
>> and only later try to optimize this locking mechanism, if possible.
>
> Agreed.
>
>>>
>>>>>
>>>>> Also, what about the list of desirable features? I had assumed they are
>>>>> candidates for late commits too but will, unlike blockers, not block the
>>>>> release.
>>>>>
>>>>
>>>> I see only the new barrier and condvar implementations as the desirable
>>>> features for this release. From your previous messages I see that the
>>>> condvar is still showing some issues in your internal tests, so I take it
>>>> we might delay it to 2.24 (correct me if I am wrong).
>>>
>>> I'm behind schedule on the condvar fix, but I could try to finish it
>>> this week too. Would this make the new condvar acceptable for 2.23?
>>> I don't want to rush this, but I worry about delaying to 2.24, still not
>>> getting any reviews, and being in the same position when the 2.24
>>> release comes around.
>>> Are there volunteers who would be willing and able to review the new
>>> condvar, provided the fix gets posted upstream by the end of this week?
>>>
>>
>> How confident are you about the condvar implementation? I recall you saying
>> you tested it on Fedora Rawhide, but that it led to some issues. I would
>> prefer not to rush it into 2.23 and instead work on getting it done in the
>> months after master opens again (so we have plenty of time to fix issues
>> before the next release).
>
> I identified the issue a while ago, and I think I understand it well.
> It was an oversight on my part: I didn't consider that spurious futex
> wake-ups can happen, which can, under certain specific uses of a
> condvar, lead to signals getting lost. In the original implementation,
> I had already implemented a countermeasure for essentially the same
> situation that can happen on cancellation, but the performance cost is
> too high to make this practical for the common case.
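
(As a generic illustration of the hazard, not the new condvar code: a futex
waiter has to re-check the value it waits on and loop, because FUTEX_WAIT
can return even when no wake was intended for this waiter; code that treats
every return as a genuine signal can end up losing wake-ups.)

#include <errno.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Wait until *FUTEX_WORD no longer equals EXPECTED, tolerating
   spurious returns from FUTEX_WAIT by re-checking the value.  */
static void
futex_wait_until_changed (int *futex_word, int expected)
{
  while (__atomic_load_n (futex_word, __ATOMIC_ACQUIRE) == expected)
    {
      long ret = syscall (SYS_futex, futex_word, FUTEX_WAIT, expected,
                          NULL, NULL, 0);
      if (ret == -1 && errno != EAGAIN && errno != EINTR)
        break;        /* Unexpected error; give up rather than spin.  */
      /* A return of 0 may be a genuine wake or a spurious one, so loop
         and re-check instead of assuming we were signalled.  */
    }
}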
>
> I have a conceptual fix for it, which adds some complexity. I'll try to
> implement it this week, and then assess again how confident I am about
> it.
>
Fair enough, I will wait for more input on the new condvar, thanks.