This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Re: [PATCH v8 6/7] Remote fork catch


On 4/23/2015 7:19 AM, Pedro Alves wrote:
> On 04/17/2015 09:58 PM, Don Breazeal wrote:
> 
>>>> I think there's a race here, in non-stop mode.  Consider:
>>>>
>>>>  #1 - process forks just before gdb starts fetching the remote thread
>>>>       list.
>>>>  #2 - gdbserver adds the fork child to its thread list.
>>>>  #3 - gdbserver queues the fork event, sends vStopped notification
>>>>  #4 - gdb/remote_update_thread_list pulls the thread list
>>>>  #5 - we're now in remove_new_fork_child, but we don't know
>>>>       about the fork event yet.  It's still pending in the vStopped
>>>>       queue.
>>>>
>>>> So I think that we need to make remote_update_thread_list do,
>>>> in this order:
>>>>
>>>>  #1 - fetch the remote thread list
>>>>  #2 - fetch the pending vStopped notifications
>>>>         (remote_notif_get_pending_events)
>>>>  #3 - call remove_new_fork_child
>>>>  #4 - add threads we don't know about yet to our list.
>>>>
>>>> and make remove_new_fork_child also peek at the
>>>> pending vStopped events queue (and in the future at
>>>> any other layers of pending events in the core side.)
>> I think all that was necessary here was to insert a call to
>> remote_notif_get_pending_events before the call to remove_new_fork_child,
>> since remote_notif_get_pending_events ultimately calls
>> remote_parse_stop_reply on the responses from the target, handling the
>> pending events like normal events.  Or am I missing something?  I 
>> inferred from your comment above that there might be more to it.
> 
> It fetches the vStopped notifications out of the remote,
> and parses them, but the events are left in the stop reply
> queue, they're not handled.  They're only handled once the
> core next calls target_wait -> remote_wait_ns -> queued_stop_reply.
> 
> So we need to peek into the stop reply queue, and check if we have
> pending fork events.  We have peek_stop_reply for a similar use,
> but it's not the same.
> 

Hi Pedro,

I've reimplemented the mechanisms that (a) ignore new threads
that are as-yet-unreported fork children, and (b) kill as-yet-
unreported fork children when killing the fork parent.  These both
now check the stop reply queue for unreported fork events.  I changed
the names of the new functions to use 'ignore' instead of 'remove'
to more accurately reflect what they are doing, and moved
remote_update_thread_list to avoid having to add prototypes and
forward declarations to satisfy all the dependencies.
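
For reference, both mechanisms boil down to the same two-step check;
this is condensed from ignore_new_fork_children in the patch below
(the kill variant has the same shape, calling remote_vkill instead of
removing the thread from the list):

  /* Threads GDB already knows about that are stopped at a fork.  */
  ALL_NON_EXITED_THREADS (thread)
    {
      struct target_waitstatus *ws = &thread->pending_follow;

      if (is_pending_fork_parent (ws, pid, thread->ptid))
	ignore_new_fork_child (ws, context);
    }

  /* Fork events already fetched from the target but still sitting in
     the stop reply queue, not yet seen by the core.  */
  remote_notif_get_pending_events (&notif_client_stop);
  param.input = context;
  param.output = NULL;
  QUEUE_iterate (stop_reply_p, stop_reply_queue,
		 ignore_child_of_pending_fork, &param);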

I tested killing the unreported fork child manually on x86_64 Ubuntu.
I relied on the usual regression testing for ignoring the new
threads (knowing that things go bad if they are not ignored).

I have some concerns about my test results, though.  Over the past
few weeks I've been seeing more intermittent failures than I expect,
on both the mainline and my branch.  Sometimes I get clean test runs,
other times not.  Is this something others have been seeing, or is it
just me?

I know that some of these tests are problematic (random-signal.exp),
but I wasn't aware of problems with the others.  The failures I've seen
include (but aren't limited to):

gdb.base/random-signal.exp
gdb.mi/mi-nsmoribund.exp
gdb.threads/attach-many-short-lived-threads.exp
gdb.threads/interrupted-hand-call.exp
gdb.threads/thread-unwindonsignal.exp
gdb.trace/tspeed.exp

I've also seen failures in 'random' tests due to timeouts on the
qSupported packet, like this:

target remote localhost:3238^M
Remote debugging using localhost:3238^M
Ignoring packet error, continuing...^M
warning: unrecognized item "timeout" in "qSupported" response^M

I was particularly concerned about this since I had made changes to
qSupported, but I found that it often times out for me on the
mainline as well.  I tried bumping the timeout up to 5 seconds for
the qSupported packet and still saw the error.  I haven't identified
a root cause.

Most of these issues I've been able to reproduce on the mainline by
just running the test over and over in a loop until it fails.

So, all that said, I believe that my changes aren't introducing any
obvious regressions, but I'd feel a lot better about it if I could
get clean test runs more reliably.

Thanks
--Don

This patch implements catchpoints for fork events on extended-remote
Linux targets.

The implementation appeared to be straightforward, requiring four new functions
in remote.c to implement insert/remove of fork/vfork catchpoints.  These
functions are essentially stubs that just return 0 ('success') if the
required features are enabled.  If the fork events are being reported, then
catchpoints are set and hit.
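
Concretely, each 'insert' routine is just a check that the corresponding
event feature was negotiated; for fork (from the patch below):

  static int
  remote_insert_fork_catchpoint (struct target_ops *ops, int pid)
  {
    struct remote_state *rs = get_remote_state ();

    /* Return 0 (success) if fork events are enabled, nonzero if the
       feature is off and the catchpoint can't be supported.  */
    return !remote_fork_event_p (rs);
  }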

However, there are some extra issues that arise with catchpoints.

1) Thread creation reporting -- fork catchpoints are hit before
   follow_fork has completed.  When stopped at a fork catchpoint
   in the native implementation, the new process is not 'reported'
   until after the follow is done.  It doesn't show up in the inferiors
   list or the threads list.  However, in the gdbserver case, an
   'info threads' while stopped at a fork catchpoint will retrieve the
   new thread info from the target and add it to GDB's data structures,
   prior to the follow operations.  Because of this premature report,
   GDB's inferior and thread state eventually becomes inconsistent.

   So in remote.c:remote_update_thread_list, we call a new function,
   ignore_new_fork_children, to see if there are any pending fork
   parent thread(s).  If there are, we remove the related fork child
   thread(s) from the thread list sent by the target (see the first
   sketch after this list).

2) Kill process before fork is followed -- on the native side in
   linux-nat.c:linux_nat_kill, there is some code to handle the case where
   a fork has occurred but follow_fork hasn't been called yet.  It does
   this by using the last status to determine if a follow is pending, and
   if it is, to kill the child task.  The use of last_status is fragile
   in situations like non-stop mode where other events may have occurred
   after the fork event.  This patch identifies fork parents
   in remote.c:extended_remote_kill in a way similar to that used for
   thread creation reporting above.  Any new fork children that are
   found are killed first, and then the fork parent is killed (see the
   second sketch after this list).
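
For (1), the sketch below shows the removal itself, condensed from
ignore_new_fork_child in the patch: the as-yet-unreported fork child is
identified by the related_pid of the pending fork event and dropped
from the thread list fetched from the target.

  ptid_t child_ptid = ws->value.related_pid;

  for (i = 0; VEC_iterate (thread_item_t, context->items, i, item); ++i)
    if (ptid_equal (item->ptid, child_ptid))
      {
        /* Drop the unreported fork child; follow_fork is where the
           creation of the new thread/inferior is handled.  */
        VEC_ordered_remove (thread_item_t, context->items, i);
        break;
      }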
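
For (2), the relevant part of extended_remote_kill (from the patch) is
the ordering: unfollowed fork children are killed first, then the
parent, because a vfork parent sleeps until the child is gone.

  /* If we're stopped while forking and we haven't followed yet, kill the
     child task.  We need to do this before killing the parent task
     because if this is a vfork then the parent will be sleeping.  */
  kill_new_fork_children (pid, rs);

  res = remote_vkill (pid, rs);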

Tested on x64 Ubuntu Lucid: native, remote, and extended-remote.
Manually tested the case of killing the forking process before the
fork has been followed.

Thanks
--Don

gdb/
2015-05-04  Don Breazeal  <donb@codesourcery.com>

	* remote.c (remote_insert_fork_catchpoint): New function.
	(remote_remove_fork_catchpoint): New function.
	(remote_insert_vfork_catchpoint): New function.
	(remote_remove_vfork_catchpoint): New function.
	(remote_update_thread_list): Call ignore_new_fork_children.
	Move below ignore_new_fork_children in file.
	(is_pending_fork_parent): New function.
	(ignore_new_fork_child): New function.
	(ignore_child_of_pending_fork): New function.
	(ignore_new_fork_children): New function.
	(kill_child_of_pending_fork): New function.
	(kill_new_fork_children): New function.
	(extended_remote_kill): Call kill_new_fork_children.
	(init_extended_remote_ops): Initialize target vector with
	new fork catchpoint functions.

---
 gdb/remote.c |  413 ++++++++++++++++++++++++++++++++++++++++++++--------------
 1 files changed, 317 insertions(+), 96 deletions(-)

diff --git a/gdb/remote.c b/gdb/remote.c
index 8a9eb9b..bc55d65 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -86,6 +86,7 @@ static long target_buf_size;
 enum { REMOTE_ALIGN_WRITES = 16 };
 
 /* Prototypes for local functions.  */
+static void remote_update_thread_list (struct target_ops *ops);
 static void async_cleanup_sigint_signal_handler (void *dummy);
 static int getpkt_sane (char **buf, long *sizeof_buf, int forever);
 static int getpkt_or_notif_sane (char **buf, long *sizeof_buf,
@@ -104,6 +105,10 @@ static void remote_open_1 (const char *, int, struct target_ops *,
 
 static void remote_close (struct target_ops *self);
 
+struct remote_state;
+
+static int remote_vkill (int pid, struct remote_state *rs);
+
 static void remote_mourn (struct target_ops *ops);
 
 static void extended_remote_restart (void);
@@ -181,7 +186,6 @@ static ptid_t read_ptid (char *buf, char **obuf);
 
 static void remote_set_permissions (struct target_ops *self);
 
-struct remote_state;
 static int remote_get_trace_status (struct target_ops *self,
 				    struct trace_status *ts);
 
@@ -1478,6 +1482,46 @@ remote_vfork_event_p (struct remote_state *rs)
   return packet_support (PACKET_vfork_event_feature) == PACKET_ENABLE;
 }
 
+/* Insert fork catchpoint target routine.  If fork events are enabled
+   then return success, nothing more to do.  */
+
+static int
+remote_insert_fork_catchpoint (struct target_ops *ops, int pid)
+{
+  struct remote_state *rs = get_remote_state ();
+
+  return !remote_fork_event_p (rs);
+}
+
+/* Remove fork catchpoint target routine.  Nothing to do, just
+   return success.  */
+
+static int
+remote_remove_fork_catchpoint (struct target_ops *ops, int pid)
+{
+  return 0;
+}
+
+/* Insert vfork catchpoint target routine.  If vfork events are enabled
+   then return success, nothing more to do.  */
+
+static int
+remote_insert_vfork_catchpoint (struct target_ops *ops, int pid)
+{
+  struct remote_state *rs = get_remote_state ();
+
+  return !remote_vfork_event_p (rs);
+}
+
+/* Remove vfork catchpoint target routine.  Nothing to do, just
+   return success.  */
+
+static int
+remote_remove_vfork_catchpoint (struct target_ops *ops, int pid)
+{
+  return 0;
+}
+
 /* Tokens for use by the asynchronous signal handlers for SIGINT.  */
 static struct async_signal_handler *async_sigint_remote_twice_token;
 static struct async_signal_handler *async_sigint_remote_token;
@@ -2824,101 +2868,6 @@ remote_get_threads_with_qthreadinfo (struct target_ops *ops,
   return 0;
 }
 
-/* Implement the to_update_thread_list function for the remote
-   targets.  */
-
-static void
-remote_update_thread_list (struct target_ops *ops)
-{
-  struct remote_state *rs = get_remote_state ();
-  struct threads_listing_context context;
-  struct cleanup *old_chain;
-  int got_list = 0;
-
-  context.items = NULL;
-  old_chain = make_cleanup (clear_threads_listing_context, &context);
-
-  /* We have a few different mechanisms to fetch the thread list.  Try
-     them all, starting with the most preferred one first, falling
-     back to older methods.  */
-  if (remote_get_threads_with_qxfer (ops, &context)
-      || remote_get_threads_with_qthreadinfo (ops, &context)
-      || remote_get_threads_with_ql (ops, &context))
-    {
-      int i;
-      struct thread_item *item;
-      struct thread_info *tp, *tmp;
-
-      got_list = 1;
-
-      if (VEC_empty (thread_item_t, context.items)
-	  && remote_thread_always_alive (ops, inferior_ptid))
-	{
-	  /* Some targets don't really support threads, but still
-	     reply an (empty) thread list in response to the thread
-	     listing packets, instead of replying "packet not
-	     supported".  Exit early so we don't delete the main
-	     thread.  */
-	  do_cleanups (old_chain);
-	  return;
-	}
-
-      /* CONTEXT now holds the current thread list on the remote
-	 target end.  Delete GDB-side threads no longer found on the
-	 target.  */
-      ALL_THREADS_SAFE (tp, tmp)
-        {
-	  for (i = 0;
-	       VEC_iterate (thread_item_t, context.items, i, item);
-	       ++i)
-	    {
-	      if (ptid_equal (item->ptid, tp->ptid))
-		break;
-	    }
-
-	  if (i == VEC_length (thread_item_t, context.items))
-	    {
-	      /* Not found.  */
-	      delete_thread (tp->ptid);
-	    }
-        }
-
-      /* And now add threads we don't know about yet to our list.  */
-      for (i = 0;
-	   VEC_iterate (thread_item_t, context.items, i, item);
-	   ++i)
-	{
-	  if (!ptid_equal (item->ptid, null_ptid))
-	    {
-	      struct private_thread_info *info;
-	      /* In non-stop mode, we assume new found threads are
-		 running until proven otherwise with a stop reply.  In
-		 all-stop, we can only get here if all threads are
-		 stopped.  */
-	      int running = non_stop ? 1 : 0;
-
-	      remote_notice_new_inferior (item->ptid, running);
-
-	      info = demand_private_info (item->ptid);
-	      info->core = item->core;
-	      info->extra = item->extra;
-	      item->extra = NULL;
-	    }
-	}
-    }
-
-  if (!got_list)
-    {
-      /* If no thread listing method is supported, then query whether
-	 each known thread is alive, one by one, with the T packet.
-	 If the target doesn't support threads at all, then this is a
-	 no-op.  See remote_thread_alive.  */
-      prune_threads ();
-    }
-
-  do_cleanups (old_chain);
-}
-
 /*
  * Collect a descriptive string about the given thread.
  * The target may say anything it wants to about the thread
@@ -5427,6 +5376,202 @@ struct queue_iter_param
   struct stop_reply *output;
 };
 
+/* Return non-zero if WS, the wait status associated with the thread
+   whose ptid is THREAD_PTID, is a pending fork or vfork event.
+   EVENT_PID is the pid of the process to check, or -1 for all.  */
+
+static int
+is_pending_fork_parent (struct target_waitstatus *ws, int event_pid,
+			ptid_t thread_ptid)
+{
+  if (ws->kind == TARGET_WAITKIND_FORKED
+      || ws->kind == TARGET_WAITKIND_VFORKED)
+    {
+      if (event_pid == -1 || event_pid == ptid_get_pid (thread_ptid))
+	return 1;
+    }
+
+  return 0;
+}
+
+/* Remove the thread specified as the related_pid field of WS
+   from the CONTEXT list.  */
+
+static void
+ignore_new_fork_child (struct target_waitstatus *ws,
+		       struct threads_listing_context *context)
+{
+  struct thread_item *item;
+  int i;
+  ptid_t child_ptid = ws->value.related_pid;
+
+  for (i = 0; VEC_iterate (thread_item_t, context->items, i, item); ++i)
+    {
+      if (ptid_equal (item->ptid, child_ptid))
+	{
+	  VEC_ordered_remove (thread_item_t, context->items, i);
+	  break;
+	}
+    }
+}
+
+/* Check whether EVENT is a fork event, and if it is, remove the
+   fork child from the context list passed in DATA.  */
+
+static int
+ignore_child_of_pending_fork (QUEUE (stop_reply_p) *q,
+			      QUEUE_ITER (stop_reply_p) *iter,
+			      stop_reply_p event,
+			      void *data)
+{
+  struct queue_iter_param *param = data;
+  struct threads_listing_context *context = param->input;
+
+  if (event->ws.kind == TARGET_WAITKIND_FORKED
+      || event->ws.kind == TARGET_WAITKIND_VFORKED)
+    {
+      ignore_new_fork_child (&event->ws, context);
+    }
+
+  return 1;
+}
+
+/* If CONTEXT contains any fork child threads that have not been
+   reported yet, remove them from the CONTEXT list.  If such a
+   thread exists it is because we are stopped at a fork catchpoint
+   and have not yet called follow_fork, which will set up the
+   host-side data structures for the new process.  */
+
+static void
+ignore_new_fork_children (struct threads_listing_context *context)
+{
+  struct thread_info *thread;
+  int pid = -1;
+  struct notif_client *notif = &notif_client_stop;
+  struct queue_iter_param param;
+
+  /* For any threads stopped at a fork event, delete the corresponding
+     fork child threads from the CONTEXT list.  */
+  ALL_NON_EXITED_THREADS (thread)
+    {
+      struct target_waitstatus *ws = &thread->pending_follow;
+
+      if (is_pending_fork_parent (ws, pid, thread->ptid))
+	{
+	  ignore_new_fork_child (ws, context);
+	}
+    }
+
+  /* Check for any pending fork events (not reported or processed yet)
+     in process PID and remove those fork child threads from the
+     CONTEXT list as well.  */
+  remote_notif_get_pending_events (notif);
+  param.input = context;
+  param.output = NULL;
+  QUEUE_iterate (stop_reply_p, stop_reply_queue,
+		 ignore_child_of_pending_fork, &param);
+}
+
+/* Implement the to_update_thread_list function for the remote
+   targets.  */
+
+static void
+remote_update_thread_list (struct target_ops *ops)
+{
+  struct remote_state *rs = get_remote_state ();
+  struct threads_listing_context context;
+  struct cleanup *old_chain;
+  int got_list = 0;
+
+  context.items = NULL;
+  old_chain = make_cleanup (clear_threads_listing_context, &context);
+
+  /* We have a few different mechanisms to fetch the thread list.  Try
+     them all, starting with the most preferred one first, falling
+     back to older methods.  */
+  if (remote_get_threads_with_qxfer (ops, &context)
+      || remote_get_threads_with_qthreadinfo (ops, &context)
+      || remote_get_threads_with_ql (ops, &context))
+    {
+      int i;
+      struct thread_item *item;
+      struct thread_info *tp, *tmp;
+
+      got_list = 1;
+
+      if (VEC_empty (thread_item_t, context.items)
+	  && remote_thread_always_alive (ops, inferior_ptid))
+	{
+	  /* Some targets don't really support threads, but still
+	     reply an (empty) thread list in response to the thread
+	     listing packets, instead of replying "packet not
+	     supported".  Exit early so we don't delete the main
+	     thread.  */
+	  do_cleanups (old_chain);
+	  return;
+	}
+
+      /* CONTEXT now holds the current thread list on the remote
+	 target end.  Delete GDB-side threads no longer found on the
+	 target.  */
+      ALL_THREADS_SAFE (tp, tmp)
+	{
+	  for (i = 0;
+	       VEC_iterate (thread_item_t, context.items, i, item);
+	       ++i)
+	    {
+	      if (ptid_equal (item->ptid, tp->ptid))
+		break;
+	    }
+
+	  if (i == VEC_length (thread_item_t, context.items))
+	    {
+	      /* Not found.  */
+	      delete_thread (tp->ptid);
+	    }
+	}
+
+      /* Remove any unreported fork child threads from CONTEXT so
+	 that we don't interfere with follow fork, which is where
+	 creation of such threads is handled.  */
+      ignore_new_fork_children (&context);
+
+      /* And now add threads we don't know about yet to our list.  */
+      for (i = 0;
+	   VEC_iterate (thread_item_t, context.items, i, item);
+	   ++i)
+	{
+	  if (!ptid_equal (item->ptid, null_ptid))
+	    {
+	      struct private_thread_info *info;
+	      /* In non-stop mode, we assume new found threads are
+		 running until proven otherwise with a stop reply.  In
+		 all-stop, we can only get here if all threads are
+		 stopped.  */
+	      int running = non_stop ? 1 : 0;
+
+	      remote_notice_new_inferior (item->ptid, running);
+
+	      info = demand_private_info (item->ptid);
+	      info->core = item->core;
+	      info->extra = item->extra;
+	      item->extra = NULL;
+	    }
+	}
+    }
+
+  if (!got_list)
+    {
+      /* If no thread listing method is supported, then query whether
+	 each known thread is alive, one by one, with the T packet.
+	 If the target doesn't support threads at all, then this is a
+	 no-op.  See remote_thread_alive.  */
+      prune_threads ();
+    }
+
+  do_cleanups (old_chain);
+}
+
 /* Remove stop replies in the queue if its pid is equal to the given
    inferior's pid.  */
 
@@ -7945,6 +8090,69 @@ getpkt_or_notif_sane (char **buf, long *sizeof_buf, int forever,
 				 is_notif);
 }
 
+/* Check whether EVENT is a fork event for the process specified
+   by the pid passed in DATA, and if it is, kill the fork child.  */
+
+static int
+kill_child_of_pending_fork (QUEUE (stop_reply_p) *q,
+			    QUEUE_ITER (stop_reply_p) *iter,
+			    stop_reply_p event,
+			    void *data)
+{
+  struct queue_iter_param *param = data;
+  int parent_pid = *(int *) param->input;
+
+  if (is_pending_fork_parent (&event->ws, parent_pid, event->ptid))
+    {
+      struct remote_state *rs = get_remote_state ();
+      int child_pid = ptid_get_pid (event->ws.value.related_pid);
+      int res;
+
+      res = remote_vkill (child_pid, rs);
+      if (res != 0)
+	error (_("Can't kill fork child process %d"), child_pid);
+    }
+
+  return 1;
+}
+
+/* Kill any new fork children of process PID that haven't been
+   processed by follow_fork.  */
+
+static void
+kill_new_fork_children (int pid, struct remote_state *rs)
+{
+  struct thread_info *thread;
+  struct notif_client *notif = &notif_client_stop;
+  struct queue_iter_param param;
+
+  /* Kill the fork child threads of any threads in process PID
+     that are stopped at a fork event.  */
+  ALL_NON_EXITED_THREADS (thread)
+    {
+      struct target_waitstatus *ws = &thread->pending_follow;
+
+      if (is_pending_fork_parent (ws, pid, thread->ptid))
+	{
+	  struct remote_state *rs = get_remote_state ();
+	  int child_pid = ptid_get_pid (ws->value.related_pid);
+	  int res;
+
+	  res = remote_vkill (child_pid, rs);
+	  if (res != 0)
+	    error (_("Can't kill fork child process %d"), child_pid);
+	}
+    }
+
+  /* Check for any pending fork events (not reported or processed yet)
+     in process PID and kill those fork child threads as well.  */
+  remote_notif_get_pending_events (notif);
+  param.input = &pid;
+  param.output = NULL;
+  QUEUE_iterate (stop_reply_p, stop_reply_queue,
+		 kill_child_of_pending_fork, &param);
+}
+
 
 static void
 remote_kill (struct target_ops *ops)
@@ -8014,6 +8222,11 @@ extended_remote_kill (struct target_ops *ops)
   int pid = ptid_get_pid (inferior_ptid);
   struct remote_state *rs = get_remote_state ();
 
+  /* If we're stopped while forking and we haven't followed yet, kill the
+     child task.  We need to do this before killing the parent task
+     because if this is a vfork then the parent will be sleeping.  */
+  kill_new_fork_children (pid, rs);
+
   res = remote_vkill (pid, rs);
   if (res == -1 && !(rs->extended && remote_multi_process_p (rs)))
     {
@@ -11957,6 +12170,14 @@ Specify the serial device it is connected to (e.g. /dev/ttya).";
   extended_remote_ops.to_supports_disable_randomization
     = extended_remote_supports_disable_randomization;
   extended_remote_ops.to_follow_fork = remote_follow_fork;
+  extended_remote_ops.to_insert_fork_catchpoint
+    = remote_insert_fork_catchpoint;
+  extended_remote_ops.to_remove_fork_catchpoint
+    = remote_remove_fork_catchpoint;
+  extended_remote_ops.to_insert_vfork_catchpoint
+    = remote_insert_vfork_catchpoint;
+  extended_remote_ops.to_remove_vfork_catchpoint
+    = remote_remove_vfork_catchpoint;
 }
 
 static int
-- 
1.7.0.4

