This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] posix: Fix improper assert in Linux posix_spawn (BZ#22273)



> Il giorno 20 ott 2017, alle ore 21:55, Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com> ha scritto:
> 
> Adhemerval Zanella <adhemerval.zanella@linaro.org> writes:
> 
>> As noted by Florian Weimer, current Linux posix_spawn implementation
>> can trigger an assert if the auxiliary process is terminated before
>> actually setting the err member:
>> 
>>    340   /* Child must set args.err to something non-negative - we rely on
>>    341      the parent and child sharing VM.  */
>>    342   args.err = -1;
>>    [...]
>>    362   new_pid = CLONE (__spawni_child, STACK (stack, stack_size), stack_size,
>>    363                    CLONE_VM | CLONE_VFORK | SIGCHLD, &args);
>>    364
>>    365   if (new_pid > 0)
>>    366     {
>>    367       ec = args.err;
>>    368       assert (ec >= 0);
>> 
>> Another possible issue is killing the child between setting the err and
>> actually calling execve.  In this case the process will not ran, but
>> posix_spawn also will not report any error:
>> 
>>    269
>>    270   args->err = 0;
>>    271   args->exec (args->file, args->argv, args->envp);
>> 
>> As suggested by Andreas Schwab, this patch removes the faulty assert
>> and also handles any signal that happens before fork and execve as the
>> spawn was successful (and thus relaying the handling to the caller to
>> figure this out).  Different than Florian, I can not see why using
>> atomics to set err would help here, essentially the code runs
>> sequentially (due CLONE_VFORK) and I think it would not be legal the
>> compiler evaluate ec without checking for new_pid result (thus there
>> is no need to compiler barrier).
>> 
>> Summarizing the possible scenarios on posix_spawn execution, we
>> have:
>> 
>>  1. For default case with a success execution, args.err will be 0, pid
>>     will not be collected and it will be reported to caller.
>> 
>>  2. For default failure case, args.err will be positive and the it will
>>     be collected by the waitpid.  An error will be reported to the
>>     caller.
>> 
>>  3. For the unlikely case where the process was terminated and not
>>     collected by a caller signal handler, it will be reported as succeful
>>     execution and not be collected by posix_spawn (since args.err will
>>     be 0). The caller will need to actually handle this case.
>> 
>>  4. For the unlikely case where the process was terminated and collected
>>     by caller we have 3 other possible scenarios:
>> 
>>     4.1. The auxiliary process was terminated with args.err equal to 0:
>>      it will handled as 1. (so it does not matter if we hit the pid
>>          reuse race since we won't possible collect an unexpected
>>          process).
>> 
>>     4.2. The auxiliary process was terminated after execve (due a failure
>>          in calling it) and before setting args.err to -1: it will also
>>          be handle as 1. but with the issue of not be able to report the
>>          caller a possible execve failures.
>> 
>>     4.3. The auxiliary process was terminated after args.err is set to -1:
>>          this is the case where it will be possible to hit the pid reuse
>>          case where we will need to collected the auxiliary pid but we
>>          can not be sure if it will be expected one.  I think for this
>>          case we need to actually change waitpid to use WNOHANG to avoid
>>          hanging indefinitely on the call and report an error to caller
>>          since we can't differentiate between a default failure as 2.
>>          and a possible pid reuse race issue.
>> 
>> Checked on x86_64-linux-gnu.
> 
> posix/tst-spawn and posix/tst-spawn2 started to fail after this patch.
> Reproduced on aarch64 [1], ppc [2], ppc64 [3] and ppc64le.
> 
> posix/tst-spawn.out:
> error: tst-spawn.c:257: not true: WEXITSTATUS (status) == 0
> error: 1 test failures
> 
> posix/tst-spawn2.out:
> error: tst-spawn2.c:56: waitpid: No such file or directory)
> error: 1 test failures
> 
> [1] http://glibc-buildbot.reserved-bit.com/builders/glibc-aarch64-linux/builds/760
> [2] http://glibc-buildbot.reserved-bit.com/builders/glibc-ppc-linux/builds/152
> [3] http://glibc-buildbot.reserved-bit.com/builders/glibc-power8-linux/builds/728
> 
> -- 
> Tulio Magno

Right, I only indeed checked on x86_64. I will check this out.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]