This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] Remove unnecessary locking when reading iconv configuration [BZ #22062]
- From: Carlos O'Donell <carlos at redhat dot com>
- To: Arjun Shankar <arjun dot is at lostca dot se>, GNU C Library <libc-alpha at sourceware dot org>
- Cc: Florian Weimer <fweimer at redhat dot com>
- Date: Tue, 31 Oct 2017 22:01:13 -0700
- Subject: Re: [PATCH] Remove unnecessary locking when reading iconv configuration [BZ #22062]
- Authentication-results: sourceware.org; auth=none
- References: <20171026111535.GA79519@aloka.lostca.se> <80556783-cee1-2317-0e43-d90e4fce88ed@redhat.com>
On 10/26/2017 04:15 AM, Arjun Shankar wrote:> In iconv/gconv_conf.c, __gconv_get_path unnecessarily obtains a lock when
> populating the array pointed to by __gconv_path_elem. The locking is not
> necessary because all calls to __gconv_read_conf (which in turn calls
> __gconv_get_path) are serialized using __libc_once.
This version is looking much better! Surprising how a small change can turn
into a larger refactoring.
I think we are almost done, a v3 and we'll be ready to commit. Please see
the comments below.
> This patch:
> - removes all locking in __gconv_get_path;
> - replaces all explicitly serialized __gconv_read_conf calls with calls to
> __gconv_load_conf, a new wrapper that is serialized internally;
> - adds a new test, iconv/tst-iconv_mt.c, to exercise iconv initialization,
> usage, and cleanup in a multi-threaded program;
> - indents __gconv_get_path correctly, removing tab characters (which makes
> the patch look a little bigger than it really is).
The above looks like a good commit message. So I'll assume that's what you're
going to use.
> 2017-10-26 Arjun Shankar <arjun@redhat.com>
>
> [BZ #22062]
> * iconv/gconv_conf.c (__gconv_get_path): Remove locking and fix
> indentation.
> * (__gconv_read_conf): Mark function static.
> * (once): New static variable.
> * (__gconv_load_conf): New function.
> * iconv/gconv_int.h (__gconv_load_conf): Likewise.
> * iconv/gconv_db.c (once): Remove static variable.
> * (__gconv_compare_alias): Use __gconv_load_conf instead of
> __gconv_read_conf.
> * (__gconv_find_transform): Likewise.
> * iconv/tst-iconv_mt.c: New test.
> * iconv/Makefile: Add tst-iconv_mt.
> ---
> This patch takes the direction suggested during review of a previous submission:
> https://sourceware.org/ml/libc-alpha/2017-10/msg00428.html
> Given that the previous patch was based on an incorrect view, this patch isn't a
> v3 but a re-work. After removing the unnecessary locking, I have confirmed
> that the test case fails if the relevant __libc_once is removed. I have also
> confirmed using gdb's all-stop mode that in this case, the assert I have added
> to __gconv_get_path fails. This gives me high confidence that the test will
> guard against regressions relating to iconv and multi-threading.
This should also go in the commit message.
I assume also that x86_64 built without regressions.
> iconv/Makefile | 4 +-
> iconv/gconv_conf.c | 207 ++++++++++++++++++++++++++-------------------------
> iconv/gconv_db.c | 8 +-
> iconv/gconv_int.h | 4 +-
> iconv/tst-iconv_mt.c | 146 ++++++++++++++++++++++++++++++++++++
> 5 files changed, 260 insertions(+), 109 deletions(-)
> create mode 100644 iconv/tst-iconv_mt.c
>
> diff --git a/iconv/Makefile b/iconv/Makefile
> index d340565..9d84c1d 100644
> --- a/iconv/Makefile
> +++ b/iconv/Makefile
> @@ -43,7 +43,7 @@ CFLAGS-charmap.c = -DCHARMAP_PATH='"$(i18ndir)/charmaps"' \
> CFLAGS-linereader.c = -DNO_TRANSLITERATION
> CFLAGS-simple-hash.c = -I../locale
>
> -tests = tst-iconv1 tst-iconv2 tst-iconv3 tst-iconv4 tst-iconv5 tst-iconv6
> +tests = tst-iconv1 tst-iconv2 tst-iconv3 tst-iconv4 tst-iconv5 tst-iconv6 tst-iconv_mt
Long line. Please wrap.
>
> others = iconv_prog iconvconfig
> install-others-programs = $(inst_bindir)/iconv
> @@ -67,6 +67,8 @@ endif
> $(objpfx)gconv-modules: test-gconv-modules
> cp $< $@
>
> +$(objpfx)tst-iconv_mt: $(shared-thread-library)
OK.
> +
> ifeq (yes,$(build-shared))
> tests += tst-gconv-init-failure
> modules-names += tst-gconv-init-failure-mod
> diff --git a/iconv/gconv_conf.c b/iconv/gconv_conf.c
> index f1c28ce..d5cf86e 100644
> --- a/iconv/gconv_conf.c
> +++ b/iconv/gconv_conf.c
> @@ -423,115 +423,107 @@ read_conf_file (const char *filename, const char *directory, size_t dir_len,
> void
> __gconv_get_path (void)
Needs comment explaining that this should only be called by
__gconv_read_conf.
Can this be made static? And removed from gconv_int.h?
> {
> - struct path_elem *result;
Florian made comments about compiler usage of result.
If anything keeping result just makes the code slightly
easier to read because __gconv_path_elem is *such* a mouthful
of a variable to read and write e.g. 6 chars vs 17.
So I suggest keeping result just because it's less typing
and the compiler will optimize it away. It also means
less code rewriting here, making it easier to audit.
> - __libc_lock_define_initialized (static, lock);
> -
> - __libc_lock_lock (lock);
> -
> - /* Make sure there wasn't a second thread doing it already. */
> - result = (struct path_elem *) __gconv_path_elem;
> - if (result == NULL)
> + /* Make sure we haven't done it already. */
> + assert (__gconv_path_elem == NULL);
> +
> + /* Determine the complete path first. */
> + char *gconv_path;
> + size_t gconv_path_len;
> + char *elem;
> + char *oldp;
> + char *cp;
> + int nelems;
> + char *cwd;
> + size_t cwdlen;
> +
> + if (__gconv_path_envvar == NULL)
> {
> - /* Determine the complete path first. */
> - char *gconv_path;
> - size_t gconv_path_len;
> - char *elem;
> - char *oldp;
> - char *cp;
> - int nelems;
> - char *cwd;
> - size_t cwdlen;
> -
> - if (__gconv_path_envvar == NULL)
> - {
> - /* No user-defined path. Make a modifiable copy of the
> - default path. */
> - gconv_path = strdupa (default_gconv_path);
> - gconv_path_len = sizeof (default_gconv_path);
> - cwd = NULL;
> - cwdlen = 0;
> - }
> - else
> - {
> - /* Append the default path to the user-defined path. */
> - size_t user_len = strlen (__gconv_path_envvar);
> -
> - gconv_path_len = user_len + 1 + sizeof (default_gconv_path);
> - gconv_path = alloca (gconv_path_len);
> - __mempcpy (__mempcpy (__mempcpy (gconv_path, __gconv_path_envvar,
> - user_len),
> - ":", 1),
> - default_gconv_path, sizeof (default_gconv_path));
> - cwd = __getcwd (NULL, 0);
> - cwdlen = __glibc_unlikely (cwd == NULL) ? 0 : strlen (cwd);
> - }
> - assert (default_gconv_path[0] == '/');
> -
> - /* In a first pass we calculate the number of elements. */
> - oldp = NULL;
> - cp = strchr (gconv_path, ':');
> - nelems = 1;
> - while (cp != NULL)
> - {
> - if (cp != oldp + 1)
> - ++nelems;
> - oldp = cp;
> - cp = strchr (cp + 1, ':');
> - }
> -
> - /* Allocate the memory for the result. */
> - result = (struct path_elem *) malloc ((nelems + 1)
> - * sizeof (struct path_elem)
> - + gconv_path_len + nelems
> - + (nelems - 1) * (cwdlen + 1));
> - if (result != NULL)
> - {
> - char *strspace = (char *) &result[nelems + 1];
> - int n = 0;
> -
> - /* Separate the individual parts. */
> - __gconv_max_path_elem_len = 0;
> - elem = __strtok_r (gconv_path, ":", &gconv_path);
> - assert (elem != NULL);
> - do
> - {
> - result[n].name = strspace;
> - if (elem[0] != '/')
> - {
> - assert (cwd != NULL);
> - strspace = __mempcpy (strspace, cwd, cwdlen);
> - *strspace++ = '/';
> - }
> - strspace = __stpcpy (strspace, elem);
> - if (strspace[-1] != '/')
> - *strspace++ = '/';
> -
> - result[n].len = strspace - result[n].name;
> - if (result[n].len > __gconv_max_path_elem_len)
> - __gconv_max_path_elem_len = result[n].len;
> -
> - *strspace++ = '\0';
> - ++n;
> - }
> - while ((elem = __strtok_r (NULL, ":", &gconv_path)) != NULL);
> -
> - result[n].name = NULL;
> - result[n].len = 0;
> - }
> + /* No user-defined path. Make a modifiable copy of the
> + default path. */
> + gconv_path = strdupa (default_gconv_path);
> + gconv_path_len = sizeof (default_gconv_path);
> + cwd = NULL;
> + cwdlen = 0;
> + }
> + else
> + {
> + /* Append the default path to the user-defined path. */
> + size_t user_len = strlen (__gconv_path_envvar);
> +
> + gconv_path_len = user_len + 1 + sizeof (default_gconv_path);
> + gconv_path = alloca (gconv_path_len);
> + __mempcpy (__mempcpy (__mempcpy (gconv_path, __gconv_path_envvar,
> + user_len),
> + ":", 1),
> + default_gconv_path, sizeof (default_gconv_path));
> + cwd = __getcwd (NULL, 0);
> + cwdlen = __glibc_unlikely (cwd == NULL) ? 0 : strlen (cwd);
> + }
> + assert (default_gconv_path[0] == '/');
>
> - __gconv_path_elem = result ?: (struct path_elem *) &empty_path_elem;
> + /* In a first pass we calculate the number of elements. */
> + oldp = NULL;
> + cp = strchr (gconv_path, ':');
> + nelems = 1;
> + while (cp != NULL)
> + {
> + if (cp != oldp + 1)
> + ++nelems;
> + oldp = cp;
> + cp = strchr (cp + 1, ':');
> + }
>
> - free (cwd);
> + /* Allocate the memory for the result. */
> + __gconv_path_elem = malloc ((nelems + 1)
> + * sizeof (struct path_elem)
> + + gconv_path_len + nelems
> + + (nelems - 1) * (cwdlen + 1));
> + if (__gconv_path_elem != NULL)
> + {
> + char *strspace = (char *) &__gconv_path_elem[nelems + 1];
> + int n = 0;
> +
> + /* Separate the individual parts. */
> + __gconv_max_path_elem_len = 0;
> + elem = __strtok_r (gconv_path, ":", &gconv_path);
> + assert (elem != NULL);
> + do
> + {
> + __gconv_path_elem[n].name = strspace;
> + if (elem[0] != '/')
> + {
> + assert (cwd != NULL);
> + strspace = __mempcpy (strspace, cwd, cwdlen);
> + *strspace++ = '/';
> + }
> + strspace = __stpcpy (strspace, elem);
> + if (strspace[-1] != '/')
> + *strspace++ = '/';
> +
> + __gconv_path_elem[n].len = strspace - __gconv_path_elem[n].name;
> + if (__gconv_path_elem[n].len > __gconv_max_path_elem_len)
> + __gconv_max_path_elem_len = __gconv_path_elem[n].len;
> +
> + *strspace++ = '\0';
> + ++n;
> + }
> + while ((elem = __strtok_r (NULL, ":", &gconv_path)) != NULL);
> +
> + __gconv_path_elem[n].name = NULL;
> + __gconv_path_elem[n].len = 0;
> }
>
> - __libc_lock_unlock (lock);
> + if (__gconv_path_elem == NULL)
> + __gconv_path_elem = (struct path_elem *) &empty_path_elem;
> +
> + free (cwd);
OK.
> }
>
>
> /* Read all configuration files found in the user-specified and the default
> - path. */
> -void
> -attribute_hidden
> + path. This function is called only once during the program's lifetime so
> + we may disregard locking and synchronization. */
> +static void
OK. Like it when things are made static.
> __gconv_read_conf (void)
Likewise needs a comment that this is only safe to be called via __gconv_load_conf.
> {
> void *modules = NULL;
> @@ -602,6 +594,21 @@ __gconv_read_conf (void)
> }
>
>
> +/* This "once" variable is used to do a one-time load of the configuration. */
> +__libc_once_define (static, once);
> +
> +
> +/* Read all configuration files found in the user-specified and the default
> + path, but do it only "once" using __gconv_read_conf to do the actual
> + work. This is the function that must be called when reading iconv
> + configuration. */
> +void
> +attribute_hidden
Move attribute_hidden to the declaration in the header file.
> +__gconv_load_conf (void)
> +{
> + __libc_once (once, __gconv_read_conf);
> +}
> +
>
> /* Free all resources if necessary. */
> libc_freeres_fn (free_mem)
> diff --git a/iconv/gconv_db.c b/iconv/gconv_db.c
> index 8fcb3cd..2291510 100644
> --- a/iconv/gconv_db.c
> +++ b/iconv/gconv_db.c
> @@ -687,10 +687,6 @@ find_derivation (const char *toset, const char *toset_expand,
> }
>
>
> -/* Control of initialization. */
> -__libc_once_define (static, once);
> -
> -
OK.
> static const char *
> do_lookup_alias (const char *name)
> {
> @@ -709,7 +705,7 @@ __gconv_compare_alias (const char *name1, const char *name2)
> int result;
>
> /* Ensure that the configuration data is read. */
> - __libc_once (once, __gconv_read_conf);
> + __gconv_load_conf ();
OK.
>
> if (__gconv_compare_alias_cache (name1, name2, &result) != 0)
> result = strcmp (do_lookup_alias (name1) ?: name1,
> @@ -729,7 +725,7 @@ __gconv_find_transform (const char *toset, const char *fromset,
> int result;
>
> /* Ensure that the configuration data is read. */
> - __libc_once (once, __gconv_read_conf);
> + __gconv_load_conf ();
OK.
>
> /* Acquire the lock. */
> __libc_lock_lock (__gconv_lock);
> diff --git a/iconv/gconv_int.h b/iconv/gconv_int.h
> index 2afd12a..e56757a 100644
> --- a/iconv/gconv_int.h
> +++ b/iconv/gconv_int.h
> @@ -196,8 +196,8 @@ extern int __gconv_compare_alias_cache (const char *name1, const char *name2,
> extern void __gconv_release_step (struct __gconv_step *step)
> attribute_hidden;
>
Also remove __gconv_get_path?
> -/* Read all the configuration data and cache it. */
> -extern void __gconv_read_conf (void) attribute_hidden;
> +/* Read all the configuration data and cache it if not done so already. */
> +extern void __gconv_load_conf (void) attribute_hidden;
OK.
>
> /* Try to read module cache file. */
> extern int __gconv_load_cache (void) attribute_hidden;
> diff --git a/iconv/tst-iconv_mt.c b/iconv/tst-iconv_mt.c
> new file mode 100644
> index 0000000..5fc1de4
> --- /dev/null
> +++ b/iconv/tst-iconv_mt.c
> @@ -0,0 +1,146 @@
> +/* Test that iconv works in a multi-threaded program.
> + Copyright (C) 2017 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, see
> + <http://www.gnu.org/licenses/>. */
> +
> +/* This test runs several worker threads that perform the following three
> + steps in staggered synchronization with each other:
> + 1. initialization (iconv_open)
> + 2. conversion (iconv)
> + 3. cleanup (iconv_close)
> + Having several threads synchronously (using pthread_barrier_*) perform
> + these routines exercises iconv code that handles concurrent attempts to
> + initialize and/or read global iconv state (namely configuration). */
> +
> +#include <iconv.h>
> +#include <stdio.h>
> +#include <string.h>
> +
> +#include <support/support.h>
> +#include <support/xthread.h>
> +
> +#define TCOUNT 16
Need:
_Static_assert (TCOUNT % 2 == 0,
"thread count must be even, since we need two groups.");
> +#define CONV_INPUT "Hello, iconv!"
> +
> +
> +#define WORKER_FAIL(fmt) \
> + do { printf ("FAIL: thread %lx: " fmt ": %m\n", tidx); \
> + pthread_exit ((void *) (long int) 1); } while (0)
Florian already recommended TEST_VERIFY/TEST_VERIFY_EXIT here.
> +
> +
> +pthread_barrier_t sync;
> +pthread_barrier_t sync_half_pool;
> +
> +
> +/* Execute iconv_open, iconv and iconv_close in a synchronized way in
> + conjunction with other sibling worker threads. If any step fails, print
> + an error to stdout and return NULL to the main thread to indicate the
> + error. */
> +static void *
> +worker (void * arg)
> +{
> + long int tidx = (long int) arg;
OK.
> +
> + iconv_t cd;
> +
> + char ascii[] = CONV_INPUT;
> + char *inbufpos = ascii;
> + size_t inbytesleft = sizeof (CONV_INPUT);
> +
> + char *utf8 = xcalloc (sizeof (CONV_INPUT), 1);
> + char *outbufpos = utf8;
> + size_t outbytesleft = sizeof (CONV_INPUT);
> +
> + if (tidx < TCOUNT/2)
> + /* The first half of the worker thread pool synchronize together here,
> + then call iconv_open immediately after. */
> + xpthread_barrier_wait (&sync_half_pool);
OK.
> + else
> + /* The second half wait for the first half to finish iconv_open and
> + continue to the next barrier (before the call to iconv below). */
> + xpthread_barrier_wait (&sync);
OK.
> +
> + /* The above block of code staggers all subsequent pthread_barrier_wait
> + calls in a way that ensures a high chance of encountering these
> + combinations of concurrent iconv usage:
> + 1) concurrent calls to iconv_open,
> + 2) concurrent calls to iconv_open *and* iconv,
> + 3) concurrent calls to iconv,
> + 4) concurrent calls to iconv *and* iconv_close,
> + 5) concurrent calls to iconv_close. */
> +
> + cd = iconv_open ("UTF8", "ASCII");
> + if (cd == (iconv_t) -1)
> + WORKER_FAIL ("iconv_open");
> +
> + xpthread_barrier_wait (&sync);
> +
> + if (iconv (cd, &inbufpos, &inbytesleft, &outbufpos, &outbytesleft)
> + == (size_t) -1)
> + WORKER_FAIL ("iconv");
> + else
> + *outbufpos = '\0';
> +
> + xpthread_barrier_wait (&sync);
> +
> + if (iconv_close (cd))
> + WORKER_FAIL ("iconv_close");
> +
> + /* The next conditional barrier wait is needed because we staggered the
> + threads into two groups in the beginning and at this point, the second
> + half of worker threads are waiting for the first half to finish
> + iconv_close before they can executing the same: */
> + if (tidx < TCOUNT/2)
> + xpthread_barrier_wait (&sync);
> +
> + if (strncmp (utf8, CONV_INPUT, sizeof CONV_INPUT))
> + {
> + printf ("FAIL: thread %lx: invalid conversion output from iconv\n", tidx);
> + pthread_exit ((void *) (long int) 1);
> + }
> +
> + pthread_exit (NULL);
> +}
> +
> +
> +static int
> +do_test (void)
> +{
> + pthread_t thread[TCOUNT];
> + void * worker_output;
> + int retval = 0;
> + int i;
> +> + xpthread_barrier_init (&sync, NULL, TCOUNT);
> + xpthread_barrier_init (&sync_half_pool, NULL, TCOUNT/2);
> +
> + for (i = 0; i < TCOUNT; i++)
> + thread[i] = xpthread_create (NULL, worker, (void *) (long int) i);
> +
> + for (i = 0; i < TCOUNT; i++)
> + {
> + worker_output = xpthread_join (thread[i]);
> + if (worker_output != NULL)
> + retval = 1;
Florian already covered this, that we should just abort the test if any
thread has issues.
> + }
> +
> + xpthread_barrier_destroy (&sync);
> + xpthread_barrier_destroy (&sync_half_pool);
> +
> + return retval;
> +}
> +
> +#include <support/test-driver.c>
>
--
Cheers,
Carlos.