This is the mail archive of the mailing list for the glibc project.


Re: Unwarranted assumption in tst-waitid, or a kernel bug?

On Tue, Sep 21, 2010 at 8:43 PM, Oleg Nesterov <> wrote:
> On 09/21, Roland McGrath wrote:
>> As far as I can tell, Linux has never had a guarantee like this. From a
>> cursory look at the code in a few versions, I think the differences
>> you've seen between kernel versions are due to scheduling changes, not
>> that the actual local constraints in the exit/SIGCHLD/wait code paths
>> have changed at all.
> Agreed.
> Paul, I guess that this test-case "fails" after kill(pid, SIGSTOP),
> right?

Yes, the failure is:

  missing SIGCHLD on stopped

And is coming from line 358 in posix/tst-waitid.c:

   334    expecting_sigchld = 1;
   335    if (kill (pid, SIGSTOP) != 0)
   336      {
   337        printf ("kill (%d, SIGSTOP): %m\n", pid);
   338        RETURN (EXIT_FAILURE);
   339      }
   340    pid_t wpid = waitpid (pid, &fail, WUNTRACED);
   341    if (wpid < 0)
   342      {
   343        printf ("waitpid WUNTRACED on stopped: %m\n");
   344        RETURN (EXIT_FAILURE);
   345      }
   346    else if (wpid != pid)
   347      {
   348        printf ("waitpid WUNTRACED on stopped returned %d != %d (status %x)\n",
   349                wpid, pid, fail);
   350        RETURN (EXIT_FAILURE);
   351      }
   352    else if (!WIFSTOPPED (fail) || WIFSIGNALED (fail) || WIFEXITED (fail)
   353             || WIFCONTINUED (fail) || WSTOPSIG (fail) != SIGSTOP)
   354      {
   355        printf ("waitpid WUNTRACED on stopped: status %x\n", fail);
   356        RETURN (EXIT_FAILURE);
   357      }
   358    CHECK_SIGCHLD ("stopped", CLD_STOPPED, SIGSTOP);

> I am a bit surprised it never fails on 2.6.18. I think you can add
> a small delay into finish_stop() (before it takes tasklist_lock),
> then I believe it should fail the same way.

You are probably in a better position to confirm this -- I don't usually
build kernels :-)

Anyway, assuming we all agree the assumption is unwarranted, what is
the correct way to fix tst-waitid.c ?
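
One possible direction, purely as a sketch and not a proposed patch: keep
SIGCHLD blocked and, after waitpid returns, wait for the signal with a
bounded timeout instead of assuming it has already been delivered. The
helper names wait_for_sigchld and demo_stop_notification, the no-op
handler, and the 5 second timeout are all made up for illustration;
nothing here is taken from tst-waitid.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: wait up to TIMEOUT_SEC seconds for a SIGCHLD that
   describes PID with the given si_code and si_status.  The caller must
   have SIGCHLD blocked.  Returns 0 on a match, -1 otherwise.  */
static int
wait_for_sigchld (pid_t pid, int code, int status, int timeout_sec)
{
  sigset_t set;
  sigemptyset (&set);
  sigaddset (&set, SIGCHLD);

  struct timespec ts = { .tv_sec = timeout_sec, .tv_nsec = 0 };
  siginfo_t info;
  int sig;
  do
    sig = sigtimedwait (&set, &info, &ts);
  while (sig < 0 && errno == EINTR);

  if (sig != SIGCHLD)
    return -1;			/* Timed out or failed.  */
  return (info.si_pid == pid && info.si_code == code
	  && info.si_status == status) ? 0 : -1;
}

static void
noop_handler (int sig)
{
  (void) sig;
}

/* Hypothetical demo: stop a child, collect the stop via waitpid, then
   wait (with a timeout) for the matching SIGCHLD rather than assuming
   it is already pending.  Returns 0 on success.  */
static int
demo_stop_notification (void)
{
  /* Install a handler so SIGCHLD is not at its default (ignore)
     disposition, and block the signal so it stays pending.  */
  struct sigaction sa, old_sa;
  sa.sa_handler = noop_handler;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = 0;
  if (sigaction (SIGCHLD, &sa, &old_sa) != 0)
    return -1;

  sigset_t set, old_set;
  sigemptyset (&set);
  sigaddset (&set, SIGCHLD);
  sigprocmask (SIG_BLOCK, &set, &old_set);

  int result = -1;
  pid_t pid = fork ();
  if (pid == 0)
    for (;;)
      pause ();			/* Child: idle until killed.  */

  if (pid > 0)
    {
      int status;
      if (kill (pid, SIGSTOP) == 0
	  && waitpid (pid, &status, WUNTRACED) == pid
	  && WIFSTOPPED (status))
	result = wait_for_sigchld (pid, CLD_STOPPED, SIGSTOP, 5);

      /* Clean up the child.  */
      kill (pid, SIGKILL);
      waitpid (pid, NULL, 0);
    }

  /* Unblocking lets the no-op handler absorb any SIGCHLD still pending
     from the child's exit before the old disposition is restored.  */
  sigprocmask (SIG_SETMASK, &old_set, NULL);
  sigaction (SIGCHLD, &old_sa, NULL);
  return result;
}
```

Calling demo_stop_notification () from a test's main and treating a
nonzero return as failure would exercise the idea. The timeout only
bounds how long the test waits; it does not make the ordering between
waitpid returning and SIGCHLD becoming pending any more guaranteed.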

And while I have your attention, is it possible for the same problem
to manifest itself in rt/tst-mqueue5.c ?

Here the failure is "missing SIGRTMIN" at line 120:

   114    /* Parent calls mqsend (q), which should trigger notification.  */
   116    (void) pthread_barrier_wait (b3);
   118    if (rtmin_cnt != 2)
   119      {
   120        puts ("SIGRTMIN signal in child did not arrive");
   121        result = 1;
   122      }

(I have not yet tried to produce a small test case for this, but the
fact that signal delivery also appears to be delayed here makes me
think that it might be the same issue.)

Paul Pluzhnikov
