Trouble with waitpid and Rust

Michael Neumann mneumann at ntecs.de
Tue Jul 11 14:29:58 PDT 2017



On 07/11/17 18:44, Matthew Dillon wrote:
> If test_wait() and test_fork_and_waitpid() are running simultaneously
> from different threads, then test_wait() can reap the pid that
> test_fork_and_waitpid() is explicitly waiting on.  Because test_wait()
> is using a generic non-specific wait().
>
> Is that possible?

Yes! Well, test_fork_and_waitpid() is running in one thread, test_wait()
in another:

test_fork_and_waitpid():
  fork()
  if (child) do nothing
  if (parent) waitpid(child);

test_wait():

  fork()
  if (child) do nothing
  if (parent) wait();

So yes, the wait() from test_wait() could reap the child process created
by test_fork_and_waitpid(). Then waitpid(child) in
test_fork_and_waitpid() would wait for a child that has already been
reaped and fail with ECHILD.
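
The ECHILD outcome can be reproduced deterministically even without
threads. The following is only a minimal sketch, not the nix test code:
it declares the libc functions by hand instead of using the nix crate,
and the names wait_steals_child and ECHILD are hypothetical helpers
introduced for illustration. ECHILD is assumed to be 10, which holds on
Linux and the BSDs. It forks one child, reaps it with the generic
wait(), and then calls waitpid() on the same pid.

```rust
use std::io;

// Hand-written declarations of the libc calls used below. On Unix targets the
// Rust standard library already links against libc, so no crate is required.
// (`unsafe extern` needs Rust 1.82+; on older compilers drop the `unsafe`.)
unsafe extern "C" {
    fn fork() -> i32;
    fn wait(status: *mut i32) -> i32;
    fn waitpid(pid: i32, status: *mut i32, options: i32) -> i32;
    fn _exit(code: i32) -> !;
}

// Assumption: ECHILD is 10 (true on Linux, macOS and the BSDs).
const ECHILD: i32 = 10;

/// Forks one child, reaps it with the generic wait(), then calls waitpid()
/// on the same pid. Returns (pid from fork, pid returned by wait, errno of
/// the waitpid call if it failed).
fn wait_steals_child() -> (i32, i32, Option<i32>) {
    unsafe {
        let child = fork();
        assert!(child >= 0, "fork failed");
        if child == 0 {
            // Child: do nothing and exit immediately.
            _exit(0);
        }
        let mut status: i32 = 0;
        // Generic wait(): reaps whichever child exits first. Here there is
        // only one child, so it reaps `child`.
        let reaped = wait(&mut status);
        // The child is gone now, so a specific waitpid() has nothing to wait for.
        let rc = waitpid(child, &mut status, 0);
        let err = if rc == -1 {
            io::Error::last_os_error().raw_os_error()
        } else {
            None
        };
        (child, reaped, err)
    }
}
```

In the threaded test run the generic wait() sits in a different thread,
but the effect on the waitpid() caller is the same as in this
single-threaded sketch.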

And actually I am getting this:

---- test_unistd::test_wait stdout ----
	thread 'test_unistd::test_wait' panicked at 'assertion failed: `(left
== right)` (left: `Ok(Exited(Pid(1156), 0))`, right:
`Ok(Exited(Pid(1157), 0))`)', test/test_unistd.rs:53
note: Run with `RUST_BACKTRACE=1` for a backtrace.

---- test_unistd::test_fork_and_waitpid stdout ----
	thread 'test_unistd::test_fork_and_waitpid' panicked at 'Error: waitpid
Failed', test/test_unistd.rs:35

test_wait() catches process 1157 instead of 1156, while the other
reports that waitpid failed.

The test cases definitely need to be run sequentially! No DragonFly
bug! Thanks!
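
One way to get sequential behaviour is to limit the test harness to a
single thread (for example with --test-threads=1). Another is to
serialize only the fork/wait sections with a process-wide lock. The
sketch below illustrates the second idea under stated assumptions: the
names FORK_WAIT_LOCK, fork_and_wait_serialized and run_two_threads are
hypothetical, the libc calls are declared by hand, and this is not how
the nix test suite is actually written.

```rust
use std::sync::Mutex;
use std::thread;

// Hand-written libc declarations (see the note on `unsafe extern` above:
// Rust 1.82+; drop the `unsafe` keyword on older compilers).
unsafe extern "C" {
    fn fork() -> i32;
    fn wait(status: *mut i32) -> i32;
    fn _exit(code: i32) -> !;
}

// Hypothetical process-wide lock: only one thread at a time may have an
// outstanding child, so a generic wait() can only reap its own child.
static FORK_WAIT_LOCK: Mutex<()> = Mutex::new(());

/// Forks a child and reaps it with the generic wait() while holding the lock.
/// Returns (pid from fork, pid returned by wait).
fn fork_and_wait_serialized() -> (i32, i32) {
    // Recover the guard even if another thread panicked while holding it.
    let _guard = FORK_WAIT_LOCK.lock().unwrap_or_else(|e| e.into_inner());
    unsafe {
        let child = fork();
        assert!(child >= 0, "fork failed");
        if child == 0 {
            // Child: exit at once; _exit() is safe to call after fork().
            _exit(0);
        }
        let mut status: i32 = 0;
        let reaped = wait(&mut status);
        (child, reaped)
    }
}

/// Runs the serialized fork/wait from two threads and collects the results.
fn run_two_threads() -> Vec<(i32, i32)> {
    let handles: Vec<_> = (0..2)
        .map(|_| thread::spawn(fork_and_wait_serialized))
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

With the lock held across both fork() and wait(), each thread should
see wait() return the pid of its own child. Without it, the two threads
can reap each other's children, as in the failure output above.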

Regards,

  Michael




> 
> -Matt
> 
> On Tue, Jul 11, 2017 at 9:26 AM, Matthew Dillon
> <dillon at backplane.com> wrote:
> 
>     Er, what I meant to say, the thread doing the wait4(-1, ...) is
>     returning the pid that the second thread doing the wait4(pid, ...)
>     was explicitly waiting for.  So that second thread properly returns
>     ECHILD.
> 
>     So the question is who is doing the wait4(-1, ...)
> 
>     -Matt
> 
>     On Tue, Jul 11, 2017 at 9:25 AM, Matthew Dillon
>     <dillon at backplane.com> wrote:
> 
>         Ok, I'm not sure this is a bug in DragonFly.  When I ktrace -i
>         the test program threaded, another thread is doing a wait4(-1,
>         ...) at the same time as the thread doing wait4(specific_pid, ...).
>         The specific pid is being reaped by the thread doing the
>         wait4(-1, ...), so the thread doing the wait4(specific_pid, ...)
>         does properly return ECHILD in that situation.
> 
>         -Matt
> 
>         On Tue, Jul 11, 2017 at 9:13 AM, Matthew Dillon
>         <dillon at backplane.com> wrote:
> 
>             I assume you meant ECHILD?  That would definitely be a bug
>             if there are still children present after one exits.
> 
>             -Matt
> 
>             On Mon, Jul 10, 2017 at 8:29 AM, Imre Vadász
>             <imrevdsz at gmail.com> wrote:
> 
>                 Ok, not related to SIGCHLD or anything like that. What I
>                 see when running that binary, is that 2 threads are
>                 blocking in wait() and when a child exit()s, one wait
>                 successfully returns. But the other one erroneously
>                 reports SIGCHLD instead of continuing to block as would
>                 be expected. This definitely looks like a bug in the
>                 wait4 syscall, judging from the trace I got with "ktrace
>                 -i ./test-nix --test-threads 4" here.
> 
> 
>                 On Monday, July 10, 2017, Imre Vadász
>                 <imrevdsz at gmail.com> wrote:
> 
>                     Hi,
>                     Did you verify that the child isn't just exiting
>                     before the parent calls wait(), and getting reaped
>                     by a signal handler for SIGCHLD? This should be easy
>                     to verify by running the binary with ktrace. In that
>                     case getting ECHILD from waitpid() would be
>                     perfectly fine and could be treated as success.
>                     Regards, Imre
> 
>                     On Monday, July 10, 2017, Michael Neumann
>                     <mneumann at ntecs.de> wrote:
> 
> 
> 
>                         On 07/10/17 15:40, Michael Neumann wrote:
>                         > Hi,
>                         >
>                         > I am trying to port some Rust libraries to
>                         > DragonFly, specifically the [nix] crate to
>                         > access UN*X APIs from Rust.
> 
>                         I think it's related to the problems that can
>                         occur when fork()ing a
>                         multi-threaded program, as described here:
> 
>                         http://www.linuxprogrammingblog.com/threads-and-fork-think-twice-before-using-them
> 
>                         Furthermore, I think this is similar to the
>                         "cargo" issue I am seeing when running
>                         multi-threaded (it forks and execve()s "rustc",
>                         probably doing something in between, and
>                         sometimes it hangs).
> 
> 
>                         Regards,
> 
>                           Michael
> 
> 
> 
> 
> 


