This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [MTASCsft PATCH 07/??] MT-, AS- and AC-Safety docs: manual/ctype.texi


On 01/23/2014 01:48 PM, Alexandre Oliva wrote:
> This patch has much of the vestigial records that glocale (now
> @mtslocale) was regarded as an MT-Safety problem.  The isw* functions,
> that access the current locale object in potentially unsafe ways (but
> that are optimized into perfectly safe accesses) are marked wtih
> @mtslocale, whereas the is[^w]* functions, that access the locale object
> safely, aren't marked at all, even though some other functions that use
> them within loops or so were (correctly) marked with @mtslocale, as
> their callers could behave inconsistently should the locale object
> change.
> 
> Although it wouldn't be unreasonable to leave the ctype safety marks as
> they are, we might want to mark even functions that access the locale
> object atomically, for the benefit of callers that might want to
> introduce coordination between functions marked with @mtslocale and
> @mtasurace:@mtslocale.
> 
> I'm not sure whether to just use @mtslocale, or to introduce a variant
> that indicates the function itself can be called safely even
> concurrently with locale changes, but that *compound* behavior may be
> surprising without coordination with @mtasurace:@mtslocale.
> 
> I've signaled this with @mtslocale^ (meaning @mtslocale affects the
> caller, but not this function) and even @mtslocale^^ (@mtslocale two
> levels up) in some comments in other files (not yet posted in this
> final-candidate round), but I'm planning on editing the â^âs out in this
> final round of reviews.

You did, which is fine.

> 
> Thoughts?
 
One nits. See below. We can't rely on optimization IMO.
 
> for ChangeLog
> 
> 	* manual/ctype.texi: Document MTASC-safety properties.
> ---
>  manual/ctype.texi |   66 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 66 insertions(+)
> 
> diff --git a/manual/ctype.texi b/manual/ctype.texi
> index 3d13571..5bfbcec 100644
> --- a/manual/ctype.texi
> +++ b/manual/ctype.texi
> @@ -66,6 +66,13 @@ These functions are declared in the header file @file{ctype.h}.
>  @comment ctype.h
>  @comment ISO
>  @deftypefun int islower (int @var{c})
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +@c The is* macros call __ctype_b_loc to get the ctype array from the
> +@c current locale, and then index it by c.  __ctype_b_loc reads from
> +@c thread-local memory the (indirect) pointer to the ctype array, which
> +@c may involve one word access to the global locale object, if that's
> +@c the active locale for the thread, and the array, being part of the
> +@c locale data, is undeletable, so there's no thread-safety issue.
>  Returns true if @var{c} is a lower-case letter.  The letter need not be
>  from the Latin alphabet, any alphabet representable is valid.
>  @end deftypefun
> @@ -74,6 +81,7 @@ from the Latin alphabet, any alphabet representable is valid.
>  @comment ctype.h
>  @comment ISO
>  @deftypefun int isupper (int @var{c})
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
>  Returns true if @var{c} is an upper-case letter.  The letter need not be
>  from the Latin alphabet, any alphabet representable is valid.
>  @end deftypefun
> @@ -82,6 +90,7 @@ from the Latin alphabet, any alphabet representable is valid.
>  @comment ctype.h
>  @comment ISO
>  @deftypefun int isalpha (int @var{c})
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
>  Returns true if @var{c} is an alphabetic character (a letter).  If
>  @code{islower} or @code{isupper} is true of a character, then
>  @code{isalpha} is also true.
> @@ -97,6 +106,7 @@ additional characters.
>  @comment ctype.h
>  @comment ISO
>  @deftypefun int isdigit (int @var{c})
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
>  Returns true if @var{c} is a decimal digit (@samp{0} through @samp{9}).
>  @end deftypefun
>  
> @@ -104,6 +114,7 @@ Returns true if @var{c} is a decimal digit (@samp{0} through @samp{9}).
>  @comment ctype.h
>  @comment ISO
>  @deftypefun int isalnum (int @var{c})
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
>  Returns true if @var{c} is an alphanumeric character (a letter or
>  number); in other words, if either @code{isalpha} or @code{isdigit} is
>  true of a character, then @code{isalnum} is also true.
> @@ -113,6 +124,7 @@ true of a character, then @code{isalnum} is also true.
>  @comment ctype.h
>  @comment ISO
>  @deftypefun int isxdigit (int @var{c})
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
>  Returns true if @var{c} is a hexadecimal digit.
>  Hexadecimal digits include the normal decimal digits @samp{0} through
>  @samp{9} and the letters @samp{A} through @samp{F} and
> @@ -123,6 +135,7 @@ Hexadecimal digits include the normal decimal digits @samp{0} through
>  @comment ctype.h
>  @comment ISO
>  @deftypefun int ispunct (int @var{c})
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
>  Returns true if @var{c} is a punctuation character.
>  This means any printing character that is not alphanumeric or a space
>  character.
> @@ -132,6 +145,7 @@ character.
>  @comment ctype.h
>  @comment ISO
>  @deftypefun int isspace (int @var{c})
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
>  Returns true if @var{c} is a @dfn{whitespace} character.  In the standard
>  @code{"C"} locale, @code{isspace} returns true for only the standard
>  whitespace characters:
> @@ -161,6 +175,7 @@ vertical tab
>  @comment ctype.h
>  @comment ISO
>  @deftypefun int isblank (int @var{c})
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
>  Returns true if @var{c} is a blank character; that is, a space or a tab.
>  This function was originally a GNU extension, but was added in @w{ISO C99}.
>  @end deftypefun
> @@ -169,6 +184,7 @@ This function was originally a GNU extension, but was added in @w{ISO C99}.
>  @comment ctype.h
>  @comment ISO
>  @deftypefun int isgraph (int @var{c})
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
>  Returns true if @var{c} is a graphic character; that is, a character
>  that has a glyph associated with it.  The whitespace characters are not
>  considered graphic.
> @@ -178,6 +194,7 @@ considered graphic.
>  @comment ctype.h
>  @comment ISO
>  @deftypefun int isprint (int @var{c})
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
>  Returns true if @var{c} is a printing character.  Printing characters
>  include all the graphic characters, plus the space (@samp{ }) character.
>  @end deftypefun
> @@ -186,6 +203,7 @@ include all the graphic characters, plus the space (@samp{ }) character.
>  @comment ctype.h
>  @comment ISO
>  @deftypefun int iscntrl (int @var{c})
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
>  Returns true if @var{c} is a control character (that is, a character that
>  is not a printing character).
>  @end deftypefun
> @@ -194,6 +212,7 @@ is not a printing character).
>  @comment ctype.h
>  @comment SVID, BSD
>  @deftypefun int isascii (int @var{c})
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
>  Returns true if @var{c} is a 7-bit @code{unsigned char} value that fits
>  into the US/UK ASCII character set.  This function is a BSD extension
>  and is also an SVID extension.
> @@ -227,6 +246,10 @@ These functions are declared in the header file @file{ctype.h}.
>  @comment ctype.h
>  @comment ISO
>  @deftypefun int tolower (int @var{c})
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +@c The to* macros/functions call different functions that use different
> +@c arrays than those of__ctype_b_loc, but the access patterns and
> +@c thus safety guarantees are the same.
>  If @var{c} is an upper-case letter, @code{tolower} returns the corresponding
>  lower-case letter.  If @var{c} is not an upper-case letter,
>  @var{c} is returned unchanged.
> @@ -235,6 +258,7 @@ lower-case letter.  If @var{c} is not an upper-case letter,
>  @comment ctype.h
>  @comment ISO
>  @deftypefun int toupper (int @var{c})
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
>  If @var{c} is a lower-case letter, @code{toupper} returns the corresponding
>  upper-case letter.  Otherwise @var{c} is returned unchanged.
>  @end deftypefun
> @@ -242,6 +266,7 @@ upper-case letter.  Otherwise @var{c} is returned unchanged.
>  @comment ctype.h
>  @comment SVID, BSD
>  @deftypefun int toascii (int @var{c})
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
>  This function converts @var{c} to a 7-bit @code{unsigned char} value
>  that fits into the US/UK ASCII character set, by clearing the high-order
>  bits.  This function is a BSD extension and is also an SVID extension.
> @@ -250,6 +275,7 @@ bits.  This function is a BSD extension and is also an SVID extension.
>  @comment ctype.h
>  @comment SVID
>  @deftypefun int _tolower (int @var{c})
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
>  This is identical to @code{tolower}, and is provided for compatibility
>  with the SVID.  @xref{SVID}.@refill
>  @end deftypefun
> @@ -257,6 +283,7 @@ with the SVID.  @xref{SVID}.@refill
>  @comment ctype.h
>  @comment SVID
>  @deftypefun int _toupper (int @var{c})
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
>  This is identical to @code{toupper}, and is provided for compatibility
>  with the SVID.
>  @end deftypefun
> @@ -303,6 +330,15 @@ This type is defined in @file{wctype.h}.
>  @comment wctype.h
>  @comment ISO
>  @deftypefun wctype_t wctype (const char *@var{property})
> +@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
> +@c Although the source code of wctype contains multiple references to
> +@c the locale, that could each reference different locale_data objects
> +@c should the global locale object change while active, the compiler can
> +@c and does combine them all into a single dereference that resolves
> +@c once to the LCTYPE locale object used throughout the function, so it
> +@c is safe in practice, if not in theory.  Ideally we'd explicitly save
> +@c the resolved locale_data object to make it visibly safe instead of
> +@c safe only under compiler optimizations.

We support building glibc at various optimization levels including -Os
which may not do the kind of optimizations you expect. If it's only safe
because of the compiler then we need to mark it as unsafe to be
conservative. While it's impossible to build *all* of glibc at -O0, you
could conceivably compile parts of it a -O0 and the safety marks should
remain correct.

>  The @code{wctype} returns a value representing a class of wide
>  characters which is identified by the string @var{property}.  Beside
>  some standard properties each locale can define its own ones.  In case
> @@ -331,6 +367,8 @@ the @w{ISO C} standard defines a completely new function.
>  @comment wctype.h
>  @comment ISO
>  @deftypefun int iswctype (wint_t @var{wc}, wctype_t @var{desc})
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +@c The compressed lookup table returned by wctype is read-only.
>  This function returns a nonzero value if @var{wc} is in the character
>  class specified by @var{desc}.  @var{desc} must previously be returned
>  by a successful call to @code{wctype}.
> @@ -350,6 +388,15 @@ standard classes.
>  @comment wctype.h
>  @comment ISO
>  @deftypefun int iswalnum (wint_t @var{wc})
> +@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
> +@c The implicit wctype call in the isw* functions is actually an
> +@c optimized version because the category has a known offset, but the
> +@c wctype is equally safe when optimized, unsafe if not optimized.
> +@c Since it's not a macro, and we always optimize, it's fine.  The test
> +@c whether wc is ASCII to use the non-wide is* macro/funciton doesn't
> +@c bring any other safety issues: the test does not depend on the
> +@c locale, and each path after the decision resolves the locale object
> +@c only once.
>  This function returns a nonzero value if @var{wc} is an alphanumeric
>  character (a letter or number); in other words, if either @code{iswalpha}
>  or @code{iswdigit} is true of a character, then @code{iswalnum} is also
> @@ -370,6 +417,7 @@ It is declared in @file{wctype.h}.
>  @comment wctype.h
>  @comment ISO
>  @deftypefun int iswalpha (wint_t @var{wc})
> +@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
>  Returns true if @var{wc} is an alphabetic character (a letter).  If
>  @code{iswlower} or @code{iswupper} is true of a character, then
>  @code{iswalpha} is also true.
> @@ -394,6 +442,7 @@ It is declared in @file{wctype.h}.
>  @comment wctype.h
>  @comment ISO
>  @deftypefun int iswcntrl (wint_t @var{wc})
> +@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
>  Returns true if @var{wc} is a control character (that is, a character that
>  is not a printing character).
>  
> @@ -412,6 +461,7 @@ It is declared in @file{wctype.h}.
>  @comment wctype.h
>  @comment ISO
>  @deftypefun int iswdigit (wint_t @var{wc})
> +@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
>  Returns true if @var{wc} is a digit (e.g., @samp{0} through @samp{9}).
>  Please note that this function does not only return a nonzero value for
>  @emph{decimal} digits, but for all kinds of digits.  A consequence is
> @@ -442,6 +492,7 @@ It is declared in @file{wctype.h}.
>  @comment wctype.h
>  @comment ISO
>  @deftypefun int iswgraph (wint_t @var{wc})
> +@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
>  Returns true if @var{wc} is a graphic character; that is, a character
>  that has a glyph associated with it.  The whitespace characters are not
>  considered graphic.
> @@ -461,6 +512,7 @@ It is declared in @file{wctype.h}.
>  @comment ctype.h
>  @comment ISO
>  @deftypefun int iswlower (wint_t @var{wc})
> +@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
>  Returns true if @var{wc} is a lower-case letter.  The letter need not be
>  from the Latin alphabet, any alphabet representable is valid.
>  
> @@ -479,6 +531,7 @@ It is declared in @file{wctype.h}.
>  @comment wctype.h
>  @comment ISO
>  @deftypefun int iswprint (wint_t @var{wc})
> +@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
>  Returns true if @var{wc} is a printing character.  Printing characters
>  include all the graphic characters, plus the space (@samp{ }) character.
>  
> @@ -497,6 +550,7 @@ It is declared in @file{wctype.h}.
>  @comment wctype.h
>  @comment ISO
>  @deftypefun int iswpunct (wint_t @var{wc})
> +@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
>  Returns true if @var{wc} is a punctuation character.
>  This means any printing character that is not alphanumeric or a space
>  character.
> @@ -516,6 +570,7 @@ It is declared in @file{wctype.h}.
>  @comment wctype.h
>  @comment ISO
>  @deftypefun int iswspace (wint_t @var{wc})
> +@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
>  Returns true if @var{wc} is a @dfn{whitespace} character.  In the standard
>  @code{"C"} locale, @code{iswspace} returns true for only the standard
>  whitespace characters:
> @@ -555,6 +610,7 @@ It is declared in @file{wctype.h}.
>  @comment wctype.h
>  @comment ISO
>  @deftypefun int iswupper (wint_t @var{wc})
> +@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
>  Returns true if @var{wc} is an upper-case letter.  The letter need not be
>  from the Latin alphabet, any alphabet representable is valid.
>  
> @@ -573,6 +629,7 @@ It is declared in @file{wctype.h}.
>  @comment wctype.h
>  @comment ISO
>  @deftypefun int iswxdigit (wint_t @var{wc})
> +@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
>  Returns true if @var{wc} is a hexadecimal digit.
>  Hexadecimal digits include the normal decimal digits @samp{0} through
>  @samp{9} and the letters @samp{A} through @samp{F} and
> @@ -597,6 +654,7 @@ characters as well.
>  @comment wctype.h
>  @comment ISO
>  @deftypefun int iswblank (wint_t @var{wc})
> +@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
>  Returns true if @var{wc} is a blank character; that is, a space or a tab.
>  This function was originally a GNU extension, but was added in @w{ISO C99}.
>  It is declared in @file{wchar.h}.
> @@ -691,6 +749,8 @@ This type is defined in @file{wctype.h}.
>  @comment wctype.h
>  @comment ISO
>  @deftypefun wctrans_t wctrans (const char *@var{property})
> +@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
> +@c Similar implementation, same caveats as wctype.
>  The @code{wctrans} function has to be used to find out whether a named
>  mapping is defined in the current locale selected for the
>  @code{LC_CTYPE} category.  If the returned value is non-zero, you can use
> @@ -713,6 +773,8 @@ These functions are declared in @file{wctype.h}.
>  @comment wctype.h
>  @comment ISO
>  @deftypefun wint_t towctrans (wint_t @var{wc}, wctrans_t @var{desc})
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +@c Same caveats as iswctype.
>  @code{towctrans} maps the input character @var{wc}
>  according to the rules of the mapping for which @var{desc} is a
>  descriptor, and returns the value it finds.  @var{desc} must be
> @@ -730,6 +792,9 @@ for them.
>  @comment wctype.h
>  @comment ISO
>  @deftypefun wint_t towlower (wint_t @var{wc})
> +@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
> +@c Same caveats as iswalnum, just using a wctrans rather than a wctype
> +@c table.
>  If @var{wc} is an upper-case letter, @code{towlower} returns the corresponding
>  lower-case letter.  If @var{wc} is not an upper-case letter,
>  @var{wc} is returned unchanged.
> @@ -749,6 +814,7 @@ This function is declared in @file{wctype.h}.
>  @comment wctype.h
>  @comment ISO
>  @deftypefun wint_t towupper (wint_t @var{wc})
> +@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
>  If @var{wc} is a lower-case letter, @code{towupper} returns the corresponding
>  upper-case letter.  Otherwise @var{wc} is returned unchanged.
>  
> 

OK to checkin, if you adjust the 1 or 2 markers here that rely on optimization
for being safe.

Cheers,
Carlos.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]