Re: froggy/archer -- 2009-02-24

Daniel Jacobowitz wrote:
> I assume we're talking primarily about linux-nat.c.
> The places which call waitpid (or my_waitpid) are:

First, a bit more on how froggy works.  Froggy has two components: a
user-space library and a kernel module.

A big part of utrace is what are called report_* callbacks: hooks that,
if they're enabled, get called when various things occur.  A lot of the
froggy module deals with those hooks.

The library communicates with the module via ioctl()s and blocking
read()s on /sys/kernel/debug/froggy.  The ioctl()s are the mechanism
used for ptrace()-like operations; the read()s are how events are
reported.  When froggy is initialised, it spawns a thread that loops on
a read().  In the module, that read() is blocked on a wait.  When an
event of interest occurs, the module queues a packet and wakes up the
thread, letting the read() return.  In the froggy lib, the returned
packet is parsed and any appropriate user-space callbacks, more or less
corresponding to the kernel/utrace report_* callbacks, are called.
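
To make the shape of that reader loop concrete, here is a minimal
sketch.  It is not the froggy code: the packet layout, the event names
and every function name below are hypothetical, and the fd is just
"something readable" (a pipe will do for trying it out) standing in for
the open /sys/kernel/debug/froggy file.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical event types and packet layout; the real froggy wire
   format is not described in this message. */
enum ev_type { EV_EXEC = 0, EV_EXIT = 1, EV_QUIESCE = 2, EV_MAX };

struct ev_packet {
    uint32_t type;    /* one of enum ev_type */
    int32_t  pid;     /* thread the event refers to */
    int32_t  status;  /* event-specific payload */
};

typedef void (*ev_callback)(const struct ev_packet *pkt, void *user);

/* One user-space callback slot per event type, loosely mirroring the
   report_* hooks. */
struct ev_table {
    ev_callback cb[EV_MAX];
    void *user;
};

/* Read exactly len bytes.  Returns 1 on success, 0 on EOF before any
   byte was read, -1 on error or a truncated packet. */
static int read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t got = 0;

    while (got < len) {
        ssize_t n = read(fd, p + got, len - got);  /* blocks here */
        if (n == 0)
            return got == 0 ? 0 : -1;
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        got += (size_t)n;
    }
    return 1;
}

/* The single waitpid()-like place: block in read(), parse the packet,
   call the matching callback if one is registered.  Returns the number
   of packets read, or -1 on error. */
int ev_loop(int fd, const struct ev_table *t)
{
    struct ev_packet pkt;
    int count = 0, r;

    assert(t != NULL);
    while ((r = read_full(fd, &pkt, sizeof pkt)) == 1) {
        if (pkt.type < (uint32_t)EV_MAX && t->cb[pkt.type] != NULL)
            t->cb[pkt.type](&pkt, t->user);
        count++;
    }
    return r < 0 ? -1 : count;
}

/* Example callback: counts events in the int that user points to. */
void count_event(const struct ev_packet *pkt, void *user)
{
    (void)pkt;
    ++*(int *)user;
}
```

In the library this loop would run in the spawned thread, so every
callback runs in that thread's context rather than the caller's.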

The key thing here is that in froggy there's exactly one waitpid()-like
thing--that blocking read().  In gdb, there are multiple waitpid()s and
different things happen after each of them--in effect, each waitpid()
occurs in its own context.  Due to the centralised nature of the froggy
event reporting, that context is lost, so there's not likely to be just
one appropriate user-space callback that will work.  Further, as noted
below, the whole point of some waitpid()s is to block for various
reasons.
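
For what it's worth, one generic way to get a blocking, per-call-site
wait back on top of a single event thread is a "waiter" object: the
caller says which pid and event type it cares about and sleeps on a
condition variable, and the event thread offers each incoming event to
it.  This is only a sketch of that general pattern with made-up names;
it is not something froggy or gdb does today, and I'm not claiming it
settles the context question.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical: one pending "I need to block until pid P reports event
   type T" request.  The pid/type pair is the context that a dedicated
   waitpid() call site would otherwise carry implicitly. */
struct waiter {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int  want_pid;
    int  want_type;
    bool done;
    int  status;
};

void waiter_init(struct waiter *w, int pid, int type)
{
    assert(w != NULL);
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->cond, NULL);
    w->want_pid  = pid;
    w->want_type = type;
    w->done      = false;
    w->status    = 0;
}

/* Called from the event thread for each event.  Returns true if this
   waiter consumed the event, false if the event should be handled
   some other way. */
bool waiter_offer(struct waiter *w, int pid, int type, int status)
{
    bool hit = false;

    pthread_mutex_lock(&w->lock);
    if (!w->done && pid == w->want_pid && type == w->want_type) {
        w->status = status;
        w->done   = true;
        hit       = true;
        pthread_cond_signal(&w->cond);
    }
    pthread_mutex_unlock(&w->lock);
    return hit;
}

/* Called from the thread that wants to block.  Returns the status of
   the matching event; returns at once if it has already arrived. */
int waiter_block(struct waiter *w)
{
    int status;

    pthread_mutex_lock(&w->lock);
    while (!w->done)                      /* guards against spurious wakeups */
        pthread_cond_wait(&w->cond, &w->lock);
    status = w->status;
    pthread_mutex_unlock(&w->lock);
    return status;
}
```

The waiter has to be registered before the operation that triggers the
event is issued; otherwise the event can arrive and be dispatched
elsewhere before anyone is waiting for it.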

Regarding the specific instances:

>   * get_pending_events, which is using it to collect all events that
>     have happened asynchronously - using WNOHANG.

This one actually maps to froggy fairly well, just forwarding events to
linux_nat_event_pipe_push().   The problem--and I don't know for sure if
it really is one--is that since in froggy the events are reported
through the froggy response thread, linux_nat_event_pipe_push() will be
called asynchronously, which I don't have any idea if that does what's

>   * linux_test_for_tracefork, which is just used at startup to
>     investigate capabilities of the host kernel.

This is irrelevant--by definition, if froggy is being used, the
capabilities of the kernel are known.

>   * linux_child_follow_fork.  This one does have to block, it's waiting
>     for the parent to stop as vfork returns.

The whole purpose of this use is to block the thread--see above.

>   * linux_nat_post_attach_wait, which is just trying to quiesce after
>     attach.

Attaching processes works very differently in froggy/utrace, so I'm not
sure this is relevant.

>   * linux_handle_extended_wait.  This is another two-processes case;
>     we are waiting for the child to quiesce because we can not handle
>     the fork event reported by the parent until this happens.

Again, this kind of thing looks like it's handled internally in froggy.

>   * kill_wait_callback.  Another ptrace wart; we're just waiting for
>     killed processes to go away.  If we got async notification of
>     that, we could easily sleep here; the order doesn't matter.

Again, killing in froggy will probably have an option flag to block
until the killed process is really, truly, dead, or partially dead, or
whatever the user wants.

>   * wait_lwp is also only used for quiescing, after stopping a thread.

froggy_quiesce_pid() optionally blocks until the thread quiesces.

> And of course linux_nat_wait.  This is the only really interesting
> one; notice that in async mode, it never calls waitpid, just checks
> the asynchronous queue.

This looks like another one of those context-dependent things.

> Of course, I don't know what you're trying to achieve with froggy
> here.  But it sounds like it's doing basically the same thing as
> the queued_waitpid / linux_nat_event_pipe_* mechanism; that is a
> layer which transforms waitpid results into an async stream.

That's a big part of it, but the block-for-whatever-reason stuff has to
work too and that's mostly the part I can't figure out how to do--that
and the context thing.

Chris Moller

  I know that you believe you understand what you think I said, but
  I'm not sure you realize that what you heard is not what I meant.
      -- Robert McCloskey

