This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH 1/2] Add futex wrappers with error checking
- From: Torvald Riegel <triegel at redhat dot com>
- To: Roland McGrath <roland at hack dot frob dot com>
- Cc: GLIBC Devel <libc-alpha at sourceware dot org>
- Date: Mon, 05 Jan 2015 23:53:22 +0100
- Subject: Re: [PATCH 1/2] Add futex wrappers with error checking
- Authentication-results: sourceware.org; auth=none
- References: <1417726487 dot 22797 dot 48 dot camel at triegel dot csb> <20141205003310 dot 3A8B52C3A73 at topped-with-meat dot com> <1417804656 dot 22797 dot 107 dot camel at triegel dot csb> <20141212011840 dot 2B99A2C3ADB at topped-with-meat dot com> <1418774081 dot 7165 dot 71 dot camel at triegel dot csb> <20141217010728 dot 13CEC2C2446 at topped-with-meat dot com> <1418835610 dot 25358 dot 24 dot camel at triegel dot csb> <20141217233441 dot 81ACF2C3ABF at topped-with-meat dot com>
On Wed, 2014-12-17 at 15:34 -0800, Roland McGrath wrote:
> > It is. I do want to have the low-level locks check the return values of
> > their futex calls. To me, that doesn't conflict with having another
> > interface that is just what the underlying kernels/... provide. We can
> > also get rid of this other interface and do error checking in each of
> > the per kernel/... implementations. Does that clarify what I meant?
>
> Yes, that makes sense. On further reflection, I think it does make sense
> to have the error-checking layer be outside the OS-encapsulation layer so
> that we have foolproof consistency of the expected error protocol of the OS
> layer.
OK.
> > What would this mean for the new semaphore I posted? Are you okay with
> > it using the current lll_futex_* operations and custom error checking?
>
> If that is fixing some real, observable bug independent of the futex error
> handling question, then I think it's fine as an intermediate step to do
> now.
It is fixing a correctness bug.
Carlos said he's currently reviewing that, so I'll try to focus on doing
more of the futex clean-up now, hoping that I get enough of it done
before the freeze (and with time for you to review) to make adding the
new internal futex interface fine. Sounds good?
I'm traveling this week and thus don't have lots of time to spend on
this, but I'll see how far I can get.
> > I would also hope that we can remove all of them. But actually doing
> > that may be a bigger project. The futex-using assembly files I see are
> > on x86, x86_64, and sh. Its the low-level lock stuff (including
> > robust), pthread barrier, pthread condvar, pthread rwlock, pthread
> > semaphore, cancellation.
>
> The 6-argument syscall issue on i386 is the only blocker we know of, right?
> But indeed we wanted to be conservative about perturbing performance issues
> for i386/x86_64, and just establishing confidence about that will take some
> time. I had hoped someone like you would have done that during this cycle,
> but since actual focus on it is only beginning about now, it might well
> need to wait at least until right after the release.
No, I haven't, sorry.
i386 6-argument syscall is a blocker in terms of using Linux-generic
lowlevellock-futex.h everywhere. The other obstacle is that I haven't
heard back from the hppa, microblaze, and ia64 maintainers; it seems
okay to remove the custom lowlevellock.h on these archs, but I can't
test that easily. Without that removal, I can't do certain follow-up
cleanups like adding a futex_wait that accepts absolute timeouts in just
Linux-generic lowlevellock-futex.h.
> > Cancellation may be something that ends up doing futex calls from
> > assembly.
>
> Can you point to particular code you're thinking of?
AFAIR the plan we had for how to change cancellation, we wanted it to be
based on checking whether the instruction pointer falls into a certain
code region, not using the enable/disable calls we have now. We'd put
the syscall into a tightly-controlled code region whose addresses we'd
know, and then would be able to check whether cancellation happened in
there or not. I don't have pointers to the emails handy, but I'll look
for it.
OTOH, maybe that could just use a cancellable version of
INTERNAL_SYSCALL, and we'd be fine.
> > One thing we need to agree on is whether we need this interface in the
> > end at all, or whether we make the interface from 2) (see below) the one
> > that each OS implements. What do you think?
>
> On the one hand I want to avoid extra layers when possible. On the other
> hand, I am adamant about having any sanity-check enforcement of error
> protocols be done in code that is OS-independent so that we do not risk
> skew between OS-specific implementations of the sanity checks.
OK.
> Perhaps that right balance is to have some common code that does the error
> protocol assertions for each piece of the internal interface, but as
> utility code that the OS-specific layer calls rather than as an extra layer
> around it. Either of those two approaches should be fairly easy to
> refactor into the other later.
OK, and I agree that such refactoring seems straightforward. I'd start
with having the layering that I have now (with error checking in the
futex_* wrappers in my patches, that call the lll_futex_* functions for
now), if that's fine for you. This just seems to be easier to implement
incrementally. Also, we get to move uses off from the lll_futex_* names
into just futex_*, whcih seems right.
The next patches I plan to prepare are splitting the current
futex_timed_wait into two variants for relative and absolute timeouts,
respectively (absolute is used in pthread_*, relative is in aio_suspend,
for example). That needs an additional lll_futex_timed_wait variant,
but the upside is that we can use this to consolidate much duplicated
code in pthread_*, and start using the CLOCK_REALTIME optimization with
futex_timed_wait_bitset everywhere.
Next, I plan to transition more uses of lll_futex_* to futex_*.
If any of that sounds wrong to you, please let me know soon.