This is the mail archive of the libc-alpha@sourceware.cygnus.com mailing list for the glibc project.



Re: [jbuhler@cs.washington.edu] libc/1273: system() blocks forever when called from a threaded program


On 27 Aug 1999, Andreas Jaeger wrote:
> We've received the appended bug report.  Anybody any ideas what's
> wrong here?
> [problem with hang in system() if I create a thread first]

I've traced at least part of the bad behavior in system().  The
culprit is the new Linux execve code introduced in glibc 2.1.2pre3,
which kills all threads and destroys the thread manager state before
starting a new program.

system() spawns a child process which execve's the shell, but it uses
vfork rather than fork to do so.  vfork (which *is* implemented by the
Linux 2.2 kernels, despite what the libc info docs say) leaves the
child process in the parent's address space, so when execve calls
pthread_kill_other_threads_np(), all the threads spawned from the
parent die, and the thread manager state is lost.  I strongly doubt
that this was the intended behavior.  You should review all uses
of vfork + exec in libc for similar problems with threads, and/or
consider whether you really want to break vfork semantics for all
threaded programs.

That bug aside, I still don't understand why the child process appears
to hang.  As best I can tell, the situation that leads to the hang is
the following:

 1. Parent calls vfork, which leaves it suspended in D state until child
    finishes.

 2. Child calls execve, which calls pthread_kill_other_threads_np,
    which calls pthread_exit_process, which puts the child to sleep
    while the thread manager is cleaning up.

 3. All spawned threads except the thread manager die and are reaped.

 4. The thread manager dies and is left as a zombie.  It doesn't seem
    to wake up the child of the vfork, so the actual execve syscall
    never happens.

By pressing ^C, I send a SIGINT to the child, which wakes up from its
sleep and terminates (without executing the execve system call!)
because it has no SIGINT handler.  The parent then wakes up, does
waitpid for the child, and finds that the status indicates an exit on
signal 2 (SIGINT) -- hence my earlier observation that system()
returns '2'.

What mystifies me is why the thread manager doesn't wake up the child
thread, and why the bad behavior occurs with vfork but not with
ordinary fork.  I guess this is some subtlety in the pthread library.

I've appended a small test program that illustrates the hang directly
without reference to either system() or the shell.

                                                    Jeremy Buhler
                                                    jbuhler@cs.washington.edu


/* Illustrates the fact that vfork + exec does really bad
 * things to threaded code.
 */
#include <stdio.h>
#include <stdlib.h> /* for exit() */
#include <unistd.h>
#include <signal.h>
#include <pthread.h>

#include <sys/types.h>
#include <sys/wait.h>

extern void __pthread_kill_other_threads_np(void);

void *athread(void *arg)
{
  while (1) {} /* loop forever */
  pthread_exit((void *) 0);
}

int main(void)
{
  pthread_t tid;
  pid_t pid;
  
  struct sigaction sa, intr;
  
  pthread_create(&tid, NULL, athread, NULL); /* remove call -> system() OK */
  
  sa.sa_handler = SIG_IGN;
  sa.sa_flags = 0;
  sigemptyset (&sa.sa_mask);

  if (sigaction (SIGINT, &sa, &intr) < 0)
    {
      fprintf(stderr, "Ignoring SIGINT failed!\n");
      exit(1);
    }

  pid = vfork(); /* suspends parent until child exits or execs */
  
  if (pid == 0) /* child process */
    {
      const char *new_argv[2];
      new_argv[0] = "true";
      new_argv[1] = NULL; /* argv must end with a null pointer */
      
      if (sigaction (SIGINT, &intr, NULL) < 0)
	_exit(42);
      
      /* ALERT:
       * execve in child kills all threads and destroys thread mgr state
       * in parent when vfork is used! This makes system() do bad things to
       * threaded programs, as it uses vfork followed by execve in child.
       * This behavior is new in glibc 2.1.2pre3.
       *
       * See Ulrich Drepper's libc ChangeLog entry of 1999/08/19:
       *
       * sysdeps/unix/sysv/linux/execve.c: New file.  This version terminates
       * all threads [PR libc/1223].
       *
       * Solution: switch to regular fork in system()?
       */
      
      /* NOTE: observed hang occurs here
       *   vfork child is asleep (suspended inside pthread_exit_process?)
       *   vfork parent is in D state, suspended inside vfork syscall
       *   athread is dead and reaped
       *   thread manager is a zombie
       *
       * The child can be killed with ^C but doesn't actually invoke
       * the execve system call.
       *
       * Does this behavior indicate a bug independent of the vfork
       * problem described above?
       */
      execv("/bin/true", (char * const *) new_argv);
      _exit (127);
    }
  else if (pid < 0)
    {
      fprintf(stderr, "vfork failed!\n");
      exit(1);
    }
  else /* parent process */
    {
      int status;
      
      if (waitpid(pid, &status, 0) != pid)
	fprintf(stderr, "waitpid failed\n");
      else if (WIFEXITED(status))
	printf("child exited with status %d\n", WEXITSTATUS(status));
      else if (WIFSIGNALED(status))
	printf("child exited on uncaught signal %d\n", WTERMSIG(status));
      else
	printf("child exited abnormally\n");
    }
  
  return 0;
}

