This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [MTASCsft PATCH 04/??] MT-, AS- and AC-safety docs: manual/charset.texi
- From: "Carlos O'Donell" <carlos at redhat dot com>
- To: Alexandre Oliva <aoliva at redhat dot com>, codonell at redhat dot com
- Cc: libc-alpha at sourceware dot org
- Date: Wed, 29 Jan 2014 00:49:36 -0500
- Subject: Re: [MTASCsft PATCH 04/??] MT-, AS- and AC-safety docs: manual/charset.texi
- References: <ortxelb5zd dot fsf at livre dot home> <or4n4uoncj dot fsf at livre dot home> <ormwimn3cc dot fsf_-_ at livre dot home>
On 01/23/2014 10:08 AM, Alexandre Oliva wrote:
> There's some uncertainty here about the harmless race in mbsinit, and
> about how to mark the safety issue in it and in the other functions
> that take an optional mbstate_t: they use an internal static buffer if
> the passed-in state is NULL (thus /!ps, and MT- and AS-Unsafe), but I
> haven't marked them with @mtsrace{:ps} in addition, for I figured it
> would be noisy and somewhat redundant.
OK to check in if you add more comments.
> for ChangeLog
>
> * manual/charset.texi: Document MTASC-safety properties.
> ---
> manual/charset.texi | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 88 insertions(+)
>
> diff --git a/manual/charset.texi b/manual/charset.texi
> index a3e2577..46cee77 100644
> --- a/manual/charset.texi
> +++ b/manual/charset.texi
> @@ -504,6 +504,8 @@ sequence points. Communication protocols often require this.
> @comment wchar.h
> @comment ISO
> @deftypefun int mbsinit (const mbstate_t *@var{ps})
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +@c ps is dereferenced once, unguarded. Potential harmless data race.
Please add more comments explaining why this data race is or is not
harmless, and why we don't mark everything up with @mtsrace{:ps}. We will
treat mbsinit's comment as the central place to discuss all of this.
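
To make the /!ps condition concrete, here is a minimal sketch of a caller
that avoids the internal static buffer by always passing its own state.
The function name count_chars is made up for illustration and is not part
of the patch or of glibc; the sketch assumes the state object is not
shared with any other thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: count the multibyte characters in S using a
   caller-owned conversion state.  Returns (size_t) -1 if S contains an
   invalid or incomplete sequence, or ends outside the initial state.  */
size_t
count_chars (const char *s)
{
  assert (s != NULL);

  mbstate_t st;
  memset (&st, 0, sizeof st);   /* A zero-filled state is an initial state.  */

  size_t left = strlen (s);
  size_t count = 0;

  while (left > 0)
    {
      /* Passing &st rather than NULL keeps mbrtowc away from its
         internal static state, which is what the /!ps condition in the
         annotations refers to.  */
      size_t n = mbrtowc (NULL, s, left, &st);
      if (n == (size_t) -1 || n == (size_t) -2)
        return (size_t) -1;
      if (n == 0)               /* Embedded NUL; not expected here.  */
        break;
      s += n;
      left -= n;
      count++;
    }

  /* mbsinit only reads the caller's own state here, so nothing in this
     function races with another thread that uses a different state.  */
  return mbsinit (&st) ? count : (size_t) -1;
}
```

Calling count_chars ("abc") in the default "C" locale walks three
single-byte characters. Whether an unguarded read in mbsinit is harmless
when the state *is* shared is exactly the question the comment should
answer.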
> The @code{mbsinit} function determines whether the state object pointed
> to by @var{ps} is in the initial state. If @var{ps} is a null pointer or
> the object is in the initial state the return value is nonzero. Otherwise
> @@ -559,6 +561,14 @@ that is beyond the range @math{0} to @math{127}.
> @comment wchar.h
> @comment ISO
> @deftypefun wint_t btowc (int @var{c})
> +@safety{@prelim{}@mtsafe{}@asunsafe{@asucorrupt{} @ascuheap{} @asulock{} @ascudlopen{}}@acunsafe{@acucorrupt{} @aculock{} @acsmem{} @acsfd{}}}
> +@c Calls btowc_fct or __fct; reads from locale, and from the
> +@c get_gconv_fcts result multiple times. get_gconv_fcts calls
> +@c __wcsmbs_load_conv to initialize the ctype if it's null.
> +@c wcsmbs_load_conv takes a non-recursive wrlock before allocating
> +@c memory for the fcts structure, initializing it, and then storing it
> +@c in the locale object. The initialization involves dlopening and a
> +@c lot more.
> The @code{btowc} function (``byte to wide character'') converts a valid
> single byte character @var{c} in the initial shift state into the wide
> character equivalent using the conversion rules from the currently
> @@ -615,6 +625,7 @@ There is also a function for the conversion in the other direction.
> @comment wchar.h
> @comment ISO
> @deftypefun int wctob (wint_t @var{c})
> +@safety{@prelim{}@mtsafe{}@asunsafe{@asucorrupt{} @ascuheap{} @asulock{} @ascudlopen{}}@acunsafe{@acucorrupt{} @aculock{} @acsmem{} @acsfd{}}}
> The @code{wctob} function (``wide character to byte'') takes as the
> parameter a valid wide character. If the multibyte representation for
> this character in the initial state is exactly one byte long, the return
> @@ -634,6 +645,7 @@ and they also do not require it to be in the initial state.
> @comment wchar.h
> @comment ISO
> @deftypefun size_t mbrtowc (wchar_t *restrict @var{pwc}, const char *restrict @var{s}, size_t @var{n}, mbstate_t *restrict @var{ps})
> +@safety{@prelim{}@mtunsafe{@mtasurace{:mbrtowc/!ps}}@asunsafe{@asucorrupt{} @ascuheap{} @asulock{} @ascudlopen{}}@acunsafe{@acucorrupt{} @aculock{} @acsmem{} @acsfd{}}}
> @cindex stateful
> The @code{mbrtowc} function (``multibyte restartable to wide
> character'') converts the next multibyte character in the string pointed
> @@ -728,6 +740,7 @@ function that does part of the work.
> @comment wchar.h
> @comment ISO
> @deftypefun size_t mbrlen (const char *restrict @var{s}, size_t @var{n}, mbstate_t *@var{ps})
> +@safety{@prelim{}@mtunsafe{@mtasurace{:mbrlen/!ps}}@asunsafe{@asucorrupt{} @ascuheap{} @asulock{} @ascudlopen{}}@acunsafe{@acucorrupt{} @aculock{} @acsmem{} @acsfd{}}}
> The @code{mbrlen} function (``multibyte restartable length'') computes
> the number of at most @var{n} bytes starting at @var{s}, which form the
> next valid and complete multibyte character.
> @@ -811,6 +824,50 @@ doing the work twice.
> @comment wchar.h
> @comment ISO
> @deftypefun size_t wcrtomb (char *restrict @var{s}, wchar_t @var{wc}, mbstate_t *restrict @var{ps})
> +@safety{@prelim{}@mtunsafe{@mtasurace{:wcrtomb/!ps}}@asunsafe{@asucorrupt{} @ascuheap{} @asulock{} @ascudlopen{}}@acunsafe{@acucorrupt{} @aculock{} @acsmem{} @acsfd{}}}
> +@c wcrtomb uses a static, non-thread-local unguarded state variable when
> +@c PS is NULL. When a state is passed in, and it's not used
> +@c concurrently in other threads, this function behaves safely as long
> +@c as gconv modules don't bring MT safety issues of their own.
> +@c Attempting to load gconv modules or to build conversion chains in
> +@c signal handlers may encounter gconv databases or caches in a
> +@c partially-updated state, and asynchronous cancellation may leave them
> +@c in such states, besides leaking the lock that guards them.
> +@c get_gconv_fcts ok
> +@c wcsmbs_load_conv ok
> +@c norm_add_slashes ok
> +@c wcsmbs_getfct ok
> +@c gconv_find_transform ok
> +@c gconv_read_conf (libc_once)
> +@c gconv_lookup_cache ok
> +@c find_module_idx ok
> +@c find_module ok
> +@c gconv_find_shlib (ok)
> +@c ->init_fct (assumed ok)
> +@c gconv_get_builtin_trans ok
> +@c gconv_release_step ok
> +@c do_lookup_alias ok
> +@c find_derivation ok
> +@c derivation_lookup ok
> +@c increment_counter ok
> +@c gconv_find_shlib ok
> +@c step->init_fct (assumed ok)
> +@c gen_steps ok
> +@c gconv_find_shlib ok
> +@c dlopen (presumed ok)
> +@c dlsym (presumed ok)
> +@c step->init_fct (assumed ok)
> +@c step->end_fct (assumed ok)
> +@c gconv_get_builtin_trans ok
> +@c gconv_release_step ok
> +@c add_derivation ok
> +@c gconv_close_transform ok
> +@c gconv_release_step ok
> +@c step->end_fct (assumed ok)
> +@c gconv_release_shlib ok
> +@c dlclose (presumed ok)
> +@c gconv_release_cache ok
> +@c ->tomb->__fct (assumed ok)
> The @code{wcrtomb} function (``wide character restartable to
> multibyte'') converts a single wide character into a multibyte string
> corresponding to that wide character.
> @@ -955,6 +1012,7 @@ extensions that can help in some important situations.
> @comment wchar.h
> @comment ISO
> @deftypefun size_t mbsrtowcs (wchar_t *restrict @var{dst}, const char **restrict @var{src}, size_t @var{len}, mbstate_t *restrict @var{ps})
> +@safety{@prelim{}@mtunsafe{@mtasurace{:mbsrtowcs/!ps}}@asunsafe{@asucorrupt{} @ascuheap{} @asulock{} @ascudlopen{}}@acunsafe{@acucorrupt{} @aculock{} @acsmem{} @acsfd{}}}
> The @code{mbsrtowcs} function (``multibyte string restartable to wide
> character string'') converts a NUL-terminated multibyte character
> string at @code{*@var{src}} into an equivalent wide character string,
> @@ -1039,6 +1097,7 @@ length and passing this length to the function.
> @comment wchar.h
> @comment ISO
> @deftypefun size_t wcsrtombs (char *restrict @var{dst}, const wchar_t **restrict @var{src}, size_t @var{len}, mbstate_t *restrict @var{ps})
> +@safety{@prelim{}@mtunsafe{@mtasurace{:wcsrtombs/!ps}}@asunsafe{@asucorrupt{} @ascuheap{} @asulock{} @ascudlopen{}}@acunsafe{@acucorrupt{} @aculock{} @acsmem{} @acsfd{}}}
> The @code{wcsrtombs} function (``wide character string restartable to
> multibyte string'') converts the NUL-terminated wide character string at
> @code{*@var{src}} into an equivalent multibyte character string and
> @@ -1084,6 +1143,7 @@ array size (the @var{len} parameter).
> @comment wchar.h
> @comment GNU
> @deftypefun size_t mbsnrtowcs (wchar_t *restrict @var{dst}, const char **restrict @var{src}, size_t @var{nmc}, size_t @var{len}, mbstate_t *restrict @var{ps})
> +@safety{@prelim{}@mtunsafe{@mtasurace{:mbsnrtowcs/!ps}}@asunsafe{@asucorrupt{} @ascuheap{} @asulock{} @ascudlopen{}}@acunsafe{@acucorrupt{} @aculock{} @acsmem{} @acsfd{}}}
> The @code{mbsnrtowcs} function is very similar to the @code{mbsrtowcs}
> function. All the parameters are the same except for @var{nmc}, which is
> new. The return value is the same as for @code{mbsrtowcs}.
> @@ -1136,6 +1196,7 @@ of the given buffer, there is no problem with altering the state.
> @comment wchar.h
> @comment GNU
> @deftypefun size_t wcsnrtombs (char *restrict @var{dst}, const wchar_t **restrict @var{src}, size_t @var{nwc}, size_t @var{len}, mbstate_t *restrict @var{ps})
> +@safety{@prelim{}@mtunsafe{@mtasurace{:wcsnrtombs/!ps}}@asunsafe{@asucorrupt{} @ascuheap{} @asulock{} @ascudlopen{}}@acunsafe{@acucorrupt{} @aculock{} @acsmem{} @acsfd{}}}
> The @code{wcsnrtombs} function implements the conversion from wide
> character strings to multibyte character strings. It is similar to
> @code{wcsrtombs} but, just like @code{mbsnrtowcs}, it takes an extra
> @@ -1280,6 +1341,7 @@ conversion functions.}
> @comment stdlib.h
> @comment ISO
> @deftypefun int mbtowc (wchar_t *restrict @var{result}, const char *restrict @var{string}, size_t @var{size})
> +@safety{@prelim{}@mtunsafe{@mtasurace{}}@asunsafe{@asucorrupt{} @ascuheap{} @asulock{} @ascudlopen{}}@acunsafe{@acucorrupt{} @aculock{} @acsmem{} @acsfd{}}}
> The @code{mbtowc} (``multibyte to wide character'') function when called
> with non-null @var{string} converts the first multibyte character
> beginning at @var{string} to its corresponding wide character code. It
> @@ -1314,6 +1376,7 @@ shift state. @xref{Shift State}.
> @comment stdlib.h
> @comment ISO
> @deftypefun int wctomb (char *@var{string}, wchar_t @var{wchar})
> +@safety{@prelim{}@mtunsafe{@mtasurace{}}@asunsafe{@asucorrupt{} @ascuheap{} @asulock{} @ascudlopen{}}@acunsafe{@acucorrupt{} @aculock{} @acsmem{} @acsfd{}}}
> The @code{wctomb} (``wide character to multibyte'') function converts
> the wide character code @var{wchar} to its corresponding multibyte
> character sequence, and stores the result in bytes starting at
> @@ -1353,6 +1416,7 @@ terms of @code{mbtowc}.
> @comment stdlib.h
> @comment ISO
> @deftypefun int mblen (const char *@var{string}, size_t @var{size})
> +@safety{@prelim{}@mtunsafe{@mtasurace{}}@asunsafe{@asucorrupt{} @ascuheap{} @asulock{} @ascudlopen{}}@acunsafe{@acucorrupt{} @aculock{} @acsmem{} @acsfd{}}}
> The @code{mblen} function with a non-null @var{string} argument returns
> the number of bytes that make up the multibyte character beginning at
> @var{string}, never examining more than @var{size} bytes. (The idea is
> @@ -1391,6 +1455,9 @@ suffer from the same problems as their reentrant counterparts from
> @comment stdlib.h
> @comment ISO
> @deftypefun size_t mbstowcs (wchar_t *@var{wstring}, const char *@var{string}, size_t @var{size})
> +@safety{@prelim{}@mtsafe{}@asunsafe{@asucorrupt{} @ascuheap{} @asulock{} @ascudlopen{}}@acunsafe{@acucorrupt{} @aculock{} @acsmem{} @acsfd{}}}
> +@c Odd... Although this was supposed to be non-reentrant, the internal
> +@c state is not a static buffer, but an automatic variable.
> The @code{mbstowcs} (``multibyte string to wide character string'')
> function converts the null-terminated string of multibyte characters
> @var{string} to an array of wide character codes, storing not more than
> @@ -1431,6 +1498,7 @@ mbstowcs_alloc (const char *string)
> @comment stdlib.h
> @comment ISO
> @deftypefun size_t wcstombs (char *@var{string}, const wchar_t *@var{wstring}, size_t @var{size})
> +@safety{@prelim{}@mtsafe{}@asunsafe{@asucorrupt{} @ascuheap{} @asulock{} @ascudlopen{}}@acunsafe{@acucorrupt{} @aculock{} @acsmem{} @acsfd{}}}
> The @code{wcstombs} (``wide character string to multibyte string'')
> function converts the null-terminated wide character array @var{wstring}
> into a string containing multibyte characters, storing not more than
> @@ -1618,6 +1686,16 @@ The first step is the function to create a handle.
> @comment iconv.h
> @comment XPG2
> @deftypefun iconv_t iconv_open (const char *@var{tocode}, const char *@var{fromcode})
> +@safety{@prelim{}@mtsafe{@mtslocale{}}@asunsafe{@asucorrupt{} @ascuheap{} @asulock{} @ascudlopen{}}@acunsafe{@acucorrupt{} @aculock{} @acsmem{} @acsfd{}}}
> +@c Calls malloc if tocode and/or fromcode are too big for alloca. Calls
> +@c strip and upstr on both, then gconv_open. strip and upstr call
> +@c isalnum_l and toupper_l with the C locale. gconv_open may MT-safely
> +@c tokenize toset, replace unspecified codesets with the current locale
> +@c (possibly two different accesses), and finally it calls
> +@c gconv_find_transform and initializes the gconv_t result with all the
> +@c steps in the conversion sequence, running each one's initializer,
> +@c destructing and releasing them all if anything fails.
> +
> The @code{iconv_open} function has to be used before starting a
> conversion. The two parameters this function takes determine the
> source and destination character set for the conversion, and if the
> @@ -1682,6 +1760,12 @@ conversion is not needed anymore.
> @comment iconv.h
> @comment XPG2
> @deftypefun int iconv_close (iconv_t @var{cd})
> +@safety{@prelim{}@mtsafe{}@asunsafe{@asucorrupt{} @ascuheap{} @asulock{} @ascudlopen{}}@acunsafe{@acucorrupt{} @aculock{} @acsmem{}}}
> +@c Calls gconv_close to destruct and release each of the conversion
> +@c steps, release the gconv_t object, then call gconv_close_transform.
> +@c Access to the gconv_t object is not guarded, but calling iconv_close
> +@c concurrently with any other use is undefined.
> +
> The @code{iconv_close} function frees all resources associated with the
> handle @var{cd}, which must have been returned by a successful call to
> the @code{iconv_open} function.
> @@ -1708,6 +1792,10 @@ even file to file can be implemented on top of it.
> @comment iconv.h
> @comment XPG2
> @deftypefun size_t iconv (iconv_t @var{cd}, char **@var{inbuf}, size_t *@var{inbytesleft}, char **@var{outbuf}, size_t *@var{outbytesleft})
> +@safety{@prelim{}@mtsafe{@mtsrace{:cd}}@assafe{}@acunsafe{@acucorrupt{}}}
> +@c Without guarding access to the iconv_t object pointed to by cd, call
> +@c the conversion function to convert inbuf or flush the internal
> +@c conversion state.
> @cindex stateful
> The @code{iconv} function converts the text in the input buffer
> according to the rules associated with the descriptor @var{cd} and
>
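
For the iconv annotations, the @mtsrace{:cd} mark says that access to the
descriptor is not guarded. A minimal sketch of the pattern that stays
clear of that race is below: each call (and so each thread) opens, uses
and closes its own descriptor. The name convert_private is made up for
illustration, and the sketch assumes the requested conversion is
available on the system.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert the NUL-terminated string IN from
   encoding FROM to encoding TO, writing at most OUTSIZE bytes to OUT.
   The descriptor is private to this call, so no other thread can touch
   it.  Returns the number of bytes written, or (size_t) -1 on error.  */
size_t
convert_private (const char *to, const char *from,
                 const char *in, char *out, size_t outsize)
{
  assert (in != NULL && out != NULL);

  iconv_t cd = iconv_open (to, from);
  if (cd == (iconv_t) -1)
    return (size_t) -1;

  char *inp = (char *) in;      /* iconv takes char **; input is only read.  */
  size_t inleft = strlen (in);
  char *outp = out;
  size_t outleft = outsize;

  size_t rc = iconv (cd, &inp, &inleft, &outp, &outleft);
  if (rc != (size_t) -1)
    /* A null input buffer asks iconv to emit any pending shift sequence
       and return to the initial state.  */
    rc = iconv (cd, NULL, NULL, &outp, &outleft);

  iconv_close (cd);
  return rc == (size_t) -1 ? (size_t) -1 : outsize - outleft;
}
```

Opening a descriptor per call is the simplest way to show the idea, not
the cheapest; a thread that converts a lot of text would more likely keep
one descriptor for its own exclusive use.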
Cheers,
Carlos.