This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH v2] Add cancellation regression test for getpwuid_r.


On 12/05/2016 10:52 AM, Florian Weimer wrote:
> On 12/05/2016 04:42 PM, Carlos O'Donell wrote:
>> Add a regression test for the fix in
>> 312be3f9f5eab1643d7dcc7728c76d413d4f2640 which corrects
>> cancellation problems around getpwuid_r.
>> 
>> No regressions on x86_64. Test hangs without the aformentioned fix,
>> but passes after the fix.
>> 
>> OK to checkin?
> 
> The test is invalid.  getpwuid_r is not safe for cancellation because
> the NSS service modules might not be.

You are absolutely right, I was only considering the use of our plugins
which should be safe, but a developer system might have an entirely different
configuration.

I will posit that for glibc most functions that can block for unspecified
amounts of time should allow for deferred cancellation, but this might not,
see my v2 test which adds some explicit pthread_testcancel() after a bounded
number of iterations trying to cancel.

> I assume that your test targets nss_files, so you could call
> __nss_configure_lookup to restrict lookups to this service module.
> Even then, it is debatable whether this is expected to work at
> present.

Yes, nss_files is exactly what I want to test. I have adjusted the patch
to use __nss_configure_lookup.

> The test misuses _SC_GETPW_R_SIZE_MAX because it is not a maximum,
> only a hint.  It probably does not matter here, but we should not
> publish any code which uses the NSS *_r functions incorrectly.

v2
- Switched to always using nss_files via __nss_configure_lookup.
  to test cancellation changes to nss_files.
- Mention that _SC_GETPW_R_SIZE_MAX is a hint and add ERANGE loops
  to all lookups in order to ensure we get at least one non-ERANGE
  failing call into getpwuid_r.
- Assume that getpwuid_r might _not_ be a cancellation point and
  so run it for an arbitrary number of iterations in a tight
  cancellation loop before calling pthread_testcancel to make sure
  the loop exits in a bounded number of iterations.

Verified no regressions on x86_64.

OK to checkin?

2016-12-08  Carlos O'Donell  <carlos@redhat.com>

	* nss/Makefile [ifeq (yes,$(have-thread-library))]
	(tests): Add tst-cancel-getpwuid_r.
	* nss/tst-cancel-getpwuid_r.c: New file.

diff --git a/nss/Makefile b/nss/Makefile
index 1f016d9..9132e17 100644
--- a/nss/Makefile
+++ b/nss/Makefile
@@ -54,6 +54,12 @@ tests			= test-netdb tst-nss-test1 test-digits-dots \
 			  $(tests-static)
 xtests			= bug-erange
 
+# If we have a thread library then we can test cancellation against
+# some routines like getpwuid_r.
+ifeq (yes,$(have-thread-library))
+tests += tst-cancel-getpwuid_r
+endif
+
 # Specify rules for the nss_* modules.  We have some services.
 services		:= files db
 
@@ -125,3 +131,7 @@ $(objpfx)/libnss_test1.so$(libnss_test1.so-version): $(objpfx)/libnss_test1.so
 	$(make-link)
 endif
 $(objpfx)tst-nss-test1.out: $(objpfx)/libnss_test1.so$(libnss_test1.so-version)
+
+ifeq (yes,$(have-thread-library))
+$(objpfx)tst-cancel-getpwuid_r: $(shared-thread-library)
+endif
diff --git a/nss/tst-cancel-getpwuid_r.c b/nss/tst-cancel-getpwuid_r.c
new file mode 100644
index 0000000..43cf1fc
--- /dev/null
+++ b/nss/tst-cancel-getpwuid_r.c
@@ -0,0 +1,165 @@
+/* Test cancellation of getpwuid_r.
+   Copyright (C) 2016 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* Test if cancellation of getpwuid_r incorrectly leaves internal
+   function state locked resulting in hang of subsequent calls to
+   getpwuid_r.  The main thread creates a second thread which will do
+   the calls to getpwuid_r.  A semaphore is used by the second thread to
+   signal to the main thread that it is as close as it can be to the
+   call site of getpwuid_r.  The goal of the semaphore is to avoid any
+   cancellable function calls between the sem_post and the call to
+   getpwuid_r.  The main thread then attempts to cancel the second
+   thread.  Without the fixes the cancellation happens at any number of
+   calls to cancellable functions in getpuid_r, but with the fix the
+   cancellation either does not happen or happens only at expected
+   points where the internal state is consistent.  We use an explicit
+   pthread_testcancel call to terminate the loop in a timely fashion
+   if the implementation does not have a cancellation point.  */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <pthread.h>
+#include <pwd.h>
+#include <sys/types.h>
+#include <unistd.h>
+#include <semaphore.h>
+#include <errno.h>
+
+static int do_test (void);
+#define TEST_FUNCTION do_test ()
+
+#include "../test-skeleton.c"
+
+sem_t started;
+char *wbuf;
+long wbufsz;
+
+void
+worker_free (void *arg)
+{
+  free (arg);
+}
+
+static void *
+worker( void *arg )
+{
+  int ret;
+  unsigned int iter = 0;
+  struct passwd pwbuf, *pw;
+  uid_t uid;
+  uid = geteuid();
+  /* Use a reasonable sized buffer.  Note that _SC_GETPW_R_SIZE_MAX is
+     just a hint and not any kind of maximum value.  */
+  wbufsz = sysconf(_SC_GETPW_R_SIZE_MAX);
+  if (wbufsz == -1)
+    wbufsz = 1024;
+  wbuf = xmalloc (wbufsz);
+  pthread_cleanup_push (worker_free, wbuf);
+  sem_post (&started);
+  while (1)
+    {
+      ret = getpwuid_r(uid, &pwbuf, wbuf, wbufsz, &pw);
+      iter++;
+      /* The call to getpwuid_r may not cancel so we need to test
+	 for cancellation after some number of iterations of the
+	 function.  Choose an arbitrary 100,000 iterations of running
+	 getpwuid_r in a tight cancellation loop before testing for
+	 cancellation.  */
+      if (iter > 100000)
+	pthread_testcancel ();
+      if (ret == ERANGE)
+	{
+	  pthread_setcancelstate (PTHREAD_CANCEL_DISABLE, NULL);
+	  /* Increase the buffer size.  */
+	  free (wbuf);
+	  wbufsz = wbufsz * 2;
+	  wbuf = xmalloc (wbufsz);
+	  pthread_setcancelstate (PTHREAD_CANCEL_ENABLE, NULL);
+	}
+    }
+  pthread_cleanup_pop (1);
+  return NULL;
+}
+
+static int
+do_test (void)
+{
+  int ret;
+  char *buf;
+  long bufsz;
+  void *retval;
+  struct passwd pwbuf, *pw;
+  pthread_t thread;
+  /* Configure the test to only use files. We control the files plugin
+     as part of glibc so we assert that it should be deferred
+     cancellation safe.  */
+  __nss_configure_lookup ("passwd", "files");
+  /* Use a reasonable sized buffer.  Note that  _SC_GETPW_R_SIZE_MAX is
+     just a hint and not any kind of maximum value.  */
+  bufsz = sysconf(_SC_GETPW_R_SIZE_MAX);
+  if (bufsz == -1)
+    bufsz = 1024;
+  buf = xmalloc (bufsz);
+  sem_init (&started, 0, 0);
+  pthread_create(&thread, NULL, worker, NULL);
+  do
+  {
+    ret = sem_wait (&started);
+    if (ret == -1 && errno != EINTR)
+      {
+        printf ("FAIL: Failed to wait for second thread to start.\n");
+	exit (EXIT_FAILURE);
+      }
+  }
+  while (ret != 0);
+  printf( "INFO: Cancelling thread\n" );
+  if ((ret = pthread_cancel(thread)) != 0)
+    {
+      printf ("FAIL: Failed to cancel thread. Returned %d\n", ret);
+      exit (EXIT_FAILURE);
+    }
+  printf( "INFO: Joining...\n");
+  pthread_join(thread, &retval);
+  if (retval != PTHREAD_CANCELED)
+    {
+      printf ("FAIL: Thread was not cancelled.\n");
+      exit (EXIT_FAILURE);
+    }
+  printf( "INFO: Joined, trying getpwuid_r call\n" );
+  /* Before the fix in 312be3f9f5eab1643d7dcc7728c76d413d4f2640 for this
+     issue the cancellation point could happen in any number of internal
+     calls, and therefore locks would be left held and the following
+     call to getpwuid_r would block and the test would time out.  */
+  do
+    {
+      ret = getpwuid_r (geteuid(), &pwbuf, buf, bufsz, &pw);
+      if (ret == ERANGE)
+	{
+	  /* Increase the buffer size.  */
+	  free (buf);
+	  bufsz = bufsz * 2;
+	  buf = xmalloc (bufsz);
+	}
+    }
+  while (ret == ERANGE);
+  free (buf);
+  printf ("INFO: Previoulsy we would never get here\n");
+  printf ("PASS: Cancelled getpwuid_r successfully"
+	  " and called it again without blocking.\n");
+  return 0;
+}
---


-- 
Cheers,
Carlos.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]