This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Thread-, Signal- and Cancellation-safety documentation


On Fri, 2013-06-07 at 17:41 -0300, Alexandre Oliva wrote:
> On Jun  7, 2013, Torvald Riegel <triegel@redhat.com> wrote:
> 
> > It's a discussion, not a movie; so yes, it can branch out.  But what's
> > the problem?  And it's an difficult topic, so why the surprise?
> 
> The main problems are loss of context and of time in pursuit of
> distractions and obsoleted branches.  Lack of a clear connection between
> the branches and the overall goal are the main reason for that.
> 
> > that's a reason to keep on discussing, not stop the discussion.
> 
> Maybe there's something in the discussion for you.  For me, there was,
> but there no longer is.  I'm happy with my understanding of thread
> safety under posix.  You aren't happy with yours, but that's your issue,
> not mine.

No.  You aren't happy with *my* understanding, and vice versa.  If you
don't care, that's your choice.  But I do care about how we deal with
such topics and what we present to users; therefore, I also do care
about your opinion.

> > I've tried hard to stick to the terminology that's common in shared
> > memory synchronization in memory models.

Note that there's an "and" missing in this sentence, between
synchronization and the memory models.

> 
> That's part of the problem.  There seems to be an underlying assumption
> that memory models are somehow connected with the thread safety
> requirements in posix.

How does that related to the terminology point I made?  It's all in the
field of shared memory synchronization -- it's about concurrent threads
after alll.

>   It's an appealing claim, which is why I took
> your word on it at first, but it proved to be false.  Let me demonstrate
> why:
> 
> P1) posix defines a memory model with explicit synchronization primitives
> 
> P2) user-exposed memory is serialized by the requirement of absence of
>     data races in the memory model
> 
> P3) (implementations of specified) interfaces are not required to use
>     memory unlses it is user-exposed and so subject to P2

So P3 is related to P2.  Otherwise, P2 wouldn't be in this sentence.

> 
> P4) thread safety under posix is a requirement applied to
>     (implementations of specified) interfaces
> 
> See?  The memory model (P1) already covers everything it could cover
> (P2) and nothing it couldn't possibly (P3), and thread safety (P4)
> covers precisely this part that the memory model couldn't possibly cover
> because it needs not use memory.

The model is related to thread safety too because the model puts
constraints on what applications are allowed to do.  See P3 above being
related to P2, and so on.
P4 specified that some functions (the thread-safe ones) give additional
guarantees that a program could not get if using non-thread-safe
functions.  Thus, if you want to look at the layers of models and
concepts here, thread-safe functions can be considered at one level
above or on the same level as synchronization facilities such as locks.
Both put constraints on how values are computed in a concurrent
execution.

The memory model is what defines how this works, and how this works
jointly.  For example, it sets up the initial definitions of per-thread
program order, something like a happens-before, etc.  We need those
concepts to be able to reason about allowed behavior in a concurrent
execution, especially if we want to derive a concurrent-execution
specification from the initial sequential specifications of functions.
If you would try to separate thread safety from an existing memory
model, you'd have to introduce those concepts again to be able to come
with a proper definition.  Why duplicate them?

Furthermore, thread-safe functions do need to consider the constraints
imposed by other things in the memory model such as synchronization with
locks (remember the discussion about whether we can cache in a thread or
not -- you said that we could if we would update the cache on all
synchronization calls).  So, even if you had the duplicates, they'd need
to be compatible with the memory model's constraints -- at which point
it should become clear that putting the thread-safety discussion on top
of the memory model is the right thing to do (ie, the memory model's
base concepts, but not necessarily any of the synchronization constructs
it also defines).

> Now, given that the thread safety specs are about internal
> implementation details, the only thing to go by is the specification of
> expected behavior.
> 
> So there, memory model is not missing and covers everything it could
> cover, and the specifications of behavior are there.  What is missing?

What about a formal (or in prose but at least precise) definition of
thread safety?

> Concrete examples rather than hand waving in the wrong abstraction
> layer, please!
> 
> > If we stop the discussion now, we indeed have wasted a significant
> > amount of time.
> 
> I didn't.  As I wrote, every bit in which I participated was useful.  I
> asked you to provide formal rather than vague claims to proceed, but you
> didn't.  At that point it ceases to be useful for me, and I step away
> until there's something in it for me again.

So you'd like to see a formal proof that a vague wording is incomplete?
To make a formal proof about something (or show an inconsistency
formally), you need to start with having something formal.  I asked you
about a formal definition of Posix thread safety, and you didn't provide
one.  I cannot come up with this formal definition because the standard
is vague from my perspective.  You claim that there is a definition,
essentially, that could be derived from the standard.

> > Fact is that we are NOT compliant with this model
> 
> That's irrelevant to the discussion of what thread safety means.  That's
> where the multiple branches of conversation become a problem.
> 
> > I really suggest comparing the difference between the partial ordering
> > we ensure in the current implementation and the total order that seems
> > to be required in the standard, and understand what and how other
> > memory model are specified, before you really drop that discussion.
> 
> Why would that be useful for me?

Because that should help to understand the subject matter.

> I have my own work to do, I don't want to do mine and yours.
> 
> > the internal implementation should very well have synchronization
> > that's compatible with the synchronization implementations used in the
> > language.
> 
> I'm very glad you agree.

I'm not sure you do, because that wouldn't match what you said
previously.  It would mean that the constraints on the implementation of
thread-safe functions have to consider the constraints and rules in the
programming language's memory model. Hence, that both are related.

> Now please realize that posix deliberately refrains from specifying
> internal implementation!

But it puts constraints on implementations, so it does specify
properties that implementations must ensure.

> It specifies interfaces and required behavior.
> 
> >> please start by demonstrating, based on the wording of the current
> >> standard, how come it doesn't define a total order among
> >> synchronization primitives.
> 
> > See above.  I never said that, so why should I?
> 
> You said it didn't, so it shouldn't be hard for you to provide a counter
> example.
> 
> > If you want an example for why the implementation doesn't, just look
> > at the locks.
> 
> The implementation of the synchronization primitives themselves is
> precisely what's supposed to establish this total order under the
> standard.  Their internal implementation details are not mandated by the
> standard.

Again: the current implementations DO NOT establish such a total order.

I mentioned what they the implementations do internally to EXPLAIN why
that makes sense, and why they do not establish the total order in
general.

> The counter example I'm asking for is about how the synchronization
> primitives, as defined in the standard, fail to establish a total order
> among invocations thereof (if that's what you claimed), or a
> clarification on what you actually meant when you said that no such
> total order was established by the standard.

I said that the implementations that we have do not establish this order
in general.  Some of them do, but often that's coincidence.  And this
makes complete sense and I can't

> > Yeah, I used the atomicity terminology that's common in shared-memory
> > synchronization
> 
> See, that's part of the problem.

Are you saying that it's a problem to stick to the terminology that's
common in the field of computer science that Posix threading clearly
falls into?

> Are you sure it means the same in posix?

Do you have a bumper sticker saying "NIH, and proud of it!" ? ;)

Seriously, it doesn't help anyone to deviate from the definitions common
in the field.  It especially doesn't help users.  The standard can of
course deviate in its terminology, but then it should better precisely
define what it means.  It doesn't define "atomic" AFAIK, and the uses of
"atomic" that I've seen so far do not seem to conflict with the

> > we can break up the atomicity of functions where necessary.
> 
> Sure.  Every sequential compound action can be broken up into multiple
> atomic actions.  That's tautologically correct, but completely useless.

Can I again point out that sequential specifications of what functions
do specify atomic state transitions?

> Point is, atomicity is not a requirement for thread safety.

So, for example, when we have two concurrently executing thread-safe
functions that both are specified to modify the same byte of state (ie,
they have a precondition and a specified transition for the state), does
that mean that one function can just see a few bits of the state
transition of the other?  This would be the case if atomicity would be
completely useless.  Or should they at least read the byte atomically?

Or are the thread-safe functions not supposed to make sense in cases
where they can access their state atomically?  IOW, they don't give
programmers more than non-thread-safe functions?

> So, atomicity is not a requirement, memory model is not a requirement.
> Those are the only two points you trying to bring into the meaning of
> thread safety to show it was incomplete.  They're both demonstrably not
> there.

No.  My arguments are across the whole thread and some of them in this
email again.  But I'm running out of ways of how to explain them, so I
don't really know how to continue if you think that they are
"demonstrably not there".

Perhaps others could chime in at this point, and say whether the
arguments make sense for them -- or not.  I believe that it is important
for the project that we understand the subject matter and understand
each other -- without that, I believe we won't be able to give clear
explanations and documentation to users.

> What now?  What can you *show* that's both missing and necessary for
> consistency of the standard?
> 
> > I've asked you a couple of times for a proper definition of thread
> > safety,
> 
> Function F is thread safe IFF it won't deviate from the specified
> behavior

Again, they have specified *sequential* behavior, expressed in terms of
an atomic transition from a set of states (ie, the preconditions) to
another (set of) state(s) (ie, the effects).  Thus, you need to do
*something more* to make sense of this in a concurrent execution, in
which state transitions aren't atomic per se.  You cannot just rely on
the program to provide the ability for the function to make this atomic
state transition, because we want to have thread-safe and not just
sequential functions.

These are all points I raised at the beginning of this discussion.

> , nor cause other thread-safe functions to deviate from their
> specified behavior, even in the presence of other threads, as long as
> they all refrain from invoking undefined behavior or calling
> thread-unsafe functions.
> 
> Your distractions in different abstraction layers have caused me to
> attempt to translate that into your layer.  I failed at that.
> 
> So, enough of my playing in your field, because it's not helping.  It's
> time for you to come to my field and try to demonstrate that the above
> is not consistent with the standard, or not self-consistent.
> 
> > I pointed out some of the issues already.
> 
> You only mentioned things that you wished were covered but aren't,
> and that I've proved weren't even supposed to be covered.
> 
> > If you wish to see a formal proof, why don't you (1) provide a proper
> > definition and (2) *prove* that it doesn't have loopholes that render
> > it useless?
> 
> Because I'm not the one who's claiming there's a problem in the
> standard.  The burden of showing that is yours, not mine.

It seems indeed like I can't explain the issues that I see in the
standard's definition to you.  It would be helpful if at this point in
the discussion, others can chime.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]