This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: Consensus on MT-, AS- and AC-Safety docs.


On Tue, 2013-11-26 at 18:28 -0200, Alexandre Oliva wrote:
> On Nov 25, 2013, Torvald Riegel <triegel@redhat.com> wrote:
> 
> > It's not a double-standard if we try to make things better in areas in
> > which we can do things differently (e.g., new terms for MT-Safety).
> 
> We evidently disagree as to what is better in this regard.
> 
> > That we have this other kind of naming on different entities in a
> > different context (i.e., one is function names, the other is terms used
> > in documentation)
> 
> Both are terms that are fully defined in the documentation.  Short
> non-English words are extremely common in documents meant to document
> and teach the meaning of terms that people might be interested in, and
> they're defined precisely elsewhere.  Surely you are familiar with
> the concept of a dictionary.  Here's what an on-line dictionary has to
> say about the term itself:
> 
>   Dictionary \Dic"tion*a*ry\, n.; pl. {Dictionaries}.
>   [Cf. F. dictionnaire. See {Diction}.]  1. A book containing the words
>   of a language, arranged alphabetically, with explanations of their
>   meanings; [...] 2. [...]
> 
> There, “n.”, “pl.”, “Cf.”, “F.” are all non-English terms used for
> convenience throughout the dictionary, and defined precisely elsewhere.
> Whoever first looks up a word in a dictionary might be unfamiliar with
> these terms, and have to look up their definition until they get
> sufficiently acquainted and comfortable with them.
> 
> It's obviously not the case that dictionary users look up words because
> they want to become familiar with the abbreviations used in the
> dictionary; they want to learn the meanings of words they're not
> familiar with.  But that the dictionary uses non-English forms, without
> linking to them at every use, doesn't seem to have ever been a problem.
> 
> Why should that be a problem for us?

I'm not going to reply to this one detail (obviously, are we building a
dictionary here? No. Is space as scarce as in a traditional dictionary?
No. Et cetera...).  I feel like this has become a whack-a-mole contest,
where you bring up unrelated cases (e.g., warning signs, function names)
elsewhere and then argue that because these other cases either work or
just exist, we should do the same thing here.  I'll continue to
participate in the discussion when it's actually about what is best in
our case.

> >> Exactly!  They want to learn *exactly* what those odd combinations of
> >> letters mean!  Somehow when it's a function name among other made-up
> >> terms from C, that's perfectly readable,
> 
> > Why do you think those are perfectly readable?
> 
> Linguistics: because we're already used to them.

That we're already used to them, and learned to deal with them, doesn't
mean at all that they are perfectly readable.

> Which is not something
> we can say about the target audience of the manual, who wants to read it
> *precisely* because they're not already familiar with these terms.
> 
> >> but when it's a one-line table
> >> among other equally made up terms, then it can't work, and you're
> >> willing to sacrifice precision to avoid using them?!?
> 
> > We are not sacrificing precision.  Even your made-up names aren't
> > self-contained definitions, and you still need to read the definition,
> > so where is the lack of precision?
> 
> The difference is that made-up terms pretty much require people to go
> look up the definition, and since they're made up, there won't be other
> meanings already associated with them.  This cannot be said of existing
> words: since they already have a meaning attached to them, readers may
> get the idea that they do not have to look them up to find out what they
> mean in that context.

I disagree, and I already explained why.

As a counter-example to what you said, you replied to Joseph's
suggestion of "environment" with, basically "What about 'environment'?".
That seems exactly the case of an existing word used as term/keyword,
yet people will look up what it means.

> >> As a last attempt to come to an agreement, before I decide we're at an
> >> impasse, here's a list of keywords I'd be willing to adopt.
> 
> > (I don't think an ultimatum is a good end to a discussion.  Did others
> > choose this option?)
> 
> It's not an ultimatum.  Think of it as knocking on a door or calling
> someone and hearing the signals corresponding to rings on the other
> side, and not getting any answer.  If you think “ok, just one more
> knock/ring”, it's not an ultimatum, it's just a realization that the
> attempt to communicate is not working, and that there's no point in
> insisting on it any further.
> 
> Our present situation, although not characterized by a complete lack of
> response, amounts to each party of the discussion restating their
> unmovable position, without any progress whatsoever and without any
> perspective of progress.  Therefore, a similar assessment that there's
> no point in insisting on it makes just as much sense.

It sounded as if you planned to drop out of the work if we can't get
unanimous consent for this one.  If I misinterpreted that, sorry.

> > Are these the keywords shown to the user, or the keywords used
> > internally?
> 
> The ones I propose are to be shown to users.
> 
> >> Current         Proposed
> >> staticbuf       race
> 
> > I'd prefer datarace for this.
> 
> This might give users incorrect ideas.  getpwent_r, for example, would
> be marked with this note, even though there's no actual data race in
> place; what there is is a potential race between threads to modify the
> internal iterator maintained (guarded by a lock) by the pwent-walking
> machinery.  There's no data race, and no inconsistency, but multiple
> threads iterating over the same list will be racing for entries, and
> multiple threads each attempting to run a separate iteration will
> step on each other's toes, in spite of all changes to the iterator being
> properly guarded by the functions that access and modify them.

Okay.  I see the point, but that opens up more questions for me than it
indicates that "race" is the best name.  I would classify the
example you gave as a high-level data race, because it would give
undefined behavior in the sense that it doesn't match the allowed
behavior for a sequential execution (ie, "step on each other's toes").
Also, I suppose programmers would have to guard against the iterator
problem in the same way that they would guard against any data race?

> >> lockleak        lock
> >> selfdeadlock    lock
> 
> > Merging these two seems fine from a user POV (but see further comments
> > below).
> 
> >> asynconsist     corrupt
> >> incansist       corrupt
> 
> > Do we have a better word for these inconsistencies?
> 
> This was the best (as far as I'm concerned) I could come up with.
> Inconsistent is too long, and partially-updated is even worse.

I would prefer inconsistent.  I could live with 4 more characters ;)
But corrupt would work too I guess.

> >> asmalloc        malloc
> >> asi18n          i18n
> 
> > Given that they will be used in the AS context, I suppose that's fine.
> 
> Actually, I was thinking of moving these out of the AS context, for they
> also imply AC problems.  I'm not sure how to deal with that yet.  The
> same applies to these two:
> 
> >> shlimb          dlopen
> >> uplugin         plugin
> 
> > No opinion so far.
> 
> >> oncesafe        init
> >> 1stcall         init
> 
> > Might be fine from a user POV (but see below).
> 
> >> uunguard        const
> >> xguargs         race
> >> tempchwd        cwd
> >> tempsig         sig
> >> tempterm        term
> >> stimer          timer
> >> glocale         locale
> 
> > No opinion so far (I wanted to reply quickly to the other points).
> 
> Thanks a lot for your concern, it's really appreciated.  I wish we could
> come to an agreement about this as quickly as possible, because I'm
> otherwise blocked, and I hope this can make 2.19.
> 
> >> envromt         env
> 
> > Out of curiosity, why not "environment"?  Was that just to save a few
> > characters?
> 
> Yup.  env is a pretty standard shortening for environment, and I hope
> the one-line table remains one-line ;-)

As a user, I wouldn't mind having to look at a second line at all.  What
do others think?

> >> fdleak          fd
> >> memleak         leak
> 
> > I don't quite understand the reasons behind the renaming for these two.
> > If both are leaks, why not keep fdleak and memleak?
> 
> Shortening and avoidance of non-English or non-standard terms.  fd is a
> very commonly used shortening for file descriptor, which is what this
> issue is about, and that vacated the term leak so that it could be used
> in its most frequent sense.  That said, I suppose using mem or heap
> instead of both malloc and leak would work, too.

I agree on "mem" and "fd" being widely understood, so I thought adding
"leak" to qualify them a little further would be the right thing to do.
At least for "mem", there could be more issues than leaking allocations.

> >> unposix         posix
> >> 
> >> Can we all agree on these?
> 
> > I think it would be unfortunate to lose some of the insight you have
> > been gathering (e.g., lose the distinction between self-deadlock and
> > potentially missing lock releases).
> 
> No loss there.  selfdeadlock was AS-only and lockleak was AC-only, so
> the distinction would remain.  Indeed, I'd probably still use different
> macros for them, just to make it easier to set them apart mechanically,
> should we decide to do so at a later time.

The above was meant as a general comment to all those categories, sorry
if that wasn't clear.

> The only classes that would actually become indistinguishable would be
> staticbuf and xguargs, and maybe some or all cases of uunguard.  The
> more I thought about these, the more I concluded the definitions were
> mostly hair-splitting; adding sub-notes that indicate what the race is
> about will make up for any apparent loss.
> 
> > can we keep the insights documented in detail, and show the users only
> > a reduced set of annotations (e.g., with some annotations merged into
> > one class as in your mapping)?
> 
> I'm not yet convinced we don't want to tell users what the safety
> problems are, so I'm definitely keeping the keywords and macros
> (possibly expanding to nothing), so deciding that is not at the top of
> my priority list; deciding the keywords is.

Good, thanks.  For internal use, I agree that the keywords don't really
matter right now.

