This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

RFA: auto-retry TCP connections for "target remote"


When cross-debugging on a target that uses a debugging stub or external simulator that speaks GDB remote protocol, we'd like to have Eclipse automatically launch the stub, as well as GDB, when the user clicks on the "Debug" button. Unfortunately, there's a race condition here: GDB may try to open the TCP port to the stub before the stub has finished its own initialization and started listening for a connection, which leaves GDB with a "connection refused" error and the stub stuck waiting for a connection that never arrives.

The attached patch addresses this by adding an auto-retry feature to "target remote" TCP connections, and extending the existing timeout logic to cover polling for retry attempts as well as waiting for connections that are merely slow. I've done a fair amount of manual testing with this in both Linux and Mingw32 host builds and it seems to DTRT reliably.

Does this look OK? I'm not really wedded to the details here, if others want to argue for different command names, different timeout handling, etc.

-Sandra

2008-12-26  Sandra Loosemore  <sandra@codesourcery.com>

	gdb/
	* serial.c (serial_auto_retry, serial_retry_limit): Define.
	* serial.h (serial_auto_retry, serial_retry_limit): Declare.
	* remote.c (_initialize_remote): Add auto-retry and connect-timeout
	commands.
	* ser-tcp.c (TIMEOUT): Remove, in favor of serial_retry_limit.
	(POLL_INTERVAL): Increase to 5, in favor of backoff logic.
	(wait_for_connect): New function.
	(net_open): Use it.  Add auto-retry logic.
	* NEWS: Document new commands.

	gdb/doc/
	* gdb.texinfo (Remote Configuration): Document new
	"set/show remote auto-retry" and "set/show remote connect-timeout"
	commands.
Index: gdb/NEWS
===================================================================
RCS file: /cvs/src/src/gdb/NEWS,v
retrieving revision 1.296
diff -u -r1.296 NEWS
--- gdb/NEWS	2 Dec 2008 07:57:36 -0000	1.296
+++ gdb/NEWS	26 Dec 2008 19:40:25 -0000
@@ -169,6 +169,14 @@
   with GDB while the target is running.  "show target-async" displays the
   current state of asynchronous execution of the target.
 
+set remote auto-retry (on|off)
+show remote auto-retry
+set remote connect-timeout
+show remote connect-timeout
+  These commands allow GDB to retry failed TCP connections to a remote stub
+  with a specified timeout period; this is useful if the stub is launched
+  in parallel with GDB but may not be ready to accept connections immediately.
+
 macro define
 macro list
 macro undef
Index: gdb/remote.c
===================================================================
RCS file: /cvs/src/src/gdb/remote.c,v
retrieving revision 1.335
diff -u -r1.335 remote.c
--- gdb/remote.c	12 Dec 2008 13:45:43 -0000	1.335
+++ gdb/remote.c	26 Dec 2008 19:40:25 -0000
@@ -9132,6 +9132,21 @@
 Show the remote pathname for \"run\""), NULL, NULL, NULL,
 				   &remote_set_cmdlist, &remote_show_cmdlist);
 
+  add_setshow_boolean_cmd ("auto-retry", class_obscure,
+			   &serial_auto_retry, _("\
+Set auto-retry on socket connect"), _("\
+Show auto-retry on socket connect"), 
+			   NULL, NULL, NULL,
+			   &remote_set_cmdlist, &remote_show_cmdlist);
+
+  add_setshow_uinteger_cmd ("connect-timeout", class_obscure,
+			    &serial_retry_limit, _("\
+Set timeout limit for socket connection"), _("\
+Show timeout limit for socket connection"),
+			   NULL, NULL, NULL,
+			   &remote_set_cmdlist, &remote_show_cmdlist);
+
+
   /* Eventually initialize fileio.  See fileio.c */
   initialize_remote_fileio (remote_set_cmdlist, remote_show_cmdlist);
 
Index: gdb/ser-tcp.c
===================================================================
RCS file: /cvs/src/src/gdb/ser-tcp.c,v
retrieving revision 1.30
diff -u -r1.30 ser-tcp.c
--- gdb/ser-tcp.c	3 Sep 2008 23:54:19 -0000	1.30
+++ gdb/ser-tcp.c	26 Dec 2008 19:40:25 -0000
@@ -49,6 +49,7 @@
 
 #include <signal.h>
 #include "gdb_string.h"
+#include "gdb_select.h"
 
 #ifndef HAVE_SOCKLEN_T
 typedef int socklen_t;
@@ -56,10 +57,79 @@
 
 void _initialize_ser_tcp (void);
 
-/* seconds to wait for connect */
-#define TIMEOUT 15
 /* how many times per second to poll deprecated_ui_loop_hook */
-#define POLL_INTERVAL 2
+#define POLL_INTERVAL 5
+
+/* Helper function to wait a while.  If SCB is non-null, wait on its
+   file descriptor.  Otherwise just wait on a timeout, updating *POLLS.
+   Returns -1 on timeout or interrupt, otherwise the value of select.  */
+
+static int
+wait_for_connect (struct serial *scb, int *polls)
+{
+  struct timeval t;
+  int n;
+
+  /* While we wait for the connect to complete, 
+     poll the UI so it can update or the user can 
+     interrupt.  */
+  if (deprecated_ui_loop_hook && deprecated_ui_loop_hook (0))
+    {
+      errno = EINTR;
+      return -1;
+    }
+
+  /* Check for timeout.  */
+  if (*polls > serial_retry_limit * POLL_INTERVAL)
+    {
+      errno = ETIMEDOUT;
+      return -1;
+    }
+
+  /* Back off to polling once per second after the first POLL_INTERVAL
+     polls.  */
+  if (*polls < POLL_INTERVAL)
+    {
+      t.tv_sec = 0;
+      t.tv_usec = 1000000 / POLL_INTERVAL;
+    }
+  else
+    {
+      t.tv_sec = 1;
+      t.tv_usec = 0;
+    }
+
+  if (scb)
+    {
+      fd_set rset, wset, eset;
+      FD_ZERO (&rset);
+      FD_SET (scb->fd, &rset);
+      wset = rset;
+      eset = rset;
+	  
+      /* POSIX systems return connection success or failure by signalling
+	 wset.  Windows systems return success in wset and failure in
+	 eset.
+     
+	 We must call select here, rather than gdb_select, because
+	 the serial structure has not yet been initialized - the
+	 MinGW select wrapper will not know that this FD refers
+	 to a socket.  */
+      n = select (scb->fd + 1, &rset, &wset, &eset, &t);
+    }
+  else
+    /* Use gdb_select here, since we have no file descriptors, and on
+       Windows, plain select doesn't work in that case.  */
+    n = gdb_select (0, NULL, NULL, NULL, &t);
+
+  /* If we didn't time out, only count it as one poll.  */
+  if (n > 0 || *polls < POLL_INTERVAL)
+    (*polls)++;
+  else
+    (*polls) += POLL_INTERVAL;
+
+  return n;
+}
 
 /* Open a tcp socket */
 
@@ -76,6 +146,7 @@
 #else
   int ioarg;
 #endif
+  int polls = 0;
 
   use_udp = 0;
   if (strncmp (name, "udp:", 4) == 0)
@@ -108,6 +179,13 @@
       return -1;
     }
 
+  sockaddr.sin_family = PF_INET;
+  sockaddr.sin_port = htons (port);
+  memcpy (&sockaddr.sin_addr.s_addr, hostent->h_addr,
+	  sizeof (struct in_addr));
+
+ retry:
+
   if (use_udp)
     scb->fd = socket (PF_INET, SOCK_DGRAM, 0);
   else
@@ -116,11 +194,6 @@
   if (scb->fd < 0)
     return -1;
   
-  sockaddr.sin_family = PF_INET;
-  sockaddr.sin_port = htons (port);
-  memcpy (&sockaddr.sin_addr.s_addr, hostent->h_addr,
-	  sizeof (struct in_addr));
-
   /* set socket nonblocking */
   ioarg = 1;
   ioctl (scb->fd, FIONBIO, &ioarg);
@@ -128,68 +201,51 @@
   /* Use Non-blocking connect.  connect() will return 0 if connected already. */
   n = connect (scb->fd, (struct sockaddr *) &sockaddr, sizeof (sockaddr));
 
-  if (n < 0
+  if (n < 0)
+    {
 #ifdef USE_WIN32API
-      /* Under Windows, calling "connect" with a non-blocking socket
-	 results in WSAEWOULDBLOCK, not WSAEINPROGRESS.  */
-      && WSAGetLastError() != WSAEWOULDBLOCK
+      int err = WSAGetLastError();
 #else
-      && errno != EINPROGRESS
+      int err = errno;
 #endif
-      )
-    {
+
+      /* Maybe we're waiting for the remote target to become ready to
+	 accept connections.  */
+      if (serial_auto_retry
 #ifdef USE_WIN32API
-      errno = WSAGetLastError();
+	  && err == WSAECONNREFUSED
+#else
+	  && err == ECONNREFUSED
 #endif
-      net_close (scb);
-      return -1;
-    }
+	  && wait_for_connect (NULL, &polls) >= 0)
+	{
+	  close (scb->fd);
+	  goto retry;
+	}
 
-  if (n)
-    {
-      /* looks like we need to wait for the connect */
-      struct timeval t;
-      fd_set rset, wset, eset;
-      int polls = 0;
-      FD_ZERO (&rset);
+      if (
+#ifdef USE_WIN32API
+	  /* Under Windows, calling "connect" with a non-blocking socket
+	     results in WSAEWOULDBLOCK, not WSAEINPROGRESS.  */
+	  err != WSAEWOULDBLOCK
+#else
+	  err != EINPROGRESS
+#endif
+	  )
+	{
+	  errno = err;
+	  net_close (scb);
+	  return -1;
+	}
 
+      /* looks like we need to wait for the connect */
       do 
 	{
-	  /* While we wait for the connect to complete, 
-	     poll the UI so it can update or the user can 
-	     interrupt.  */
-	  if (deprecated_ui_loop_hook)
-	    {
-	      if (deprecated_ui_loop_hook (0))
-		{
-		  errno = EINTR;
-		  net_close (scb);
-		  return -1;
-		}
-	    }
-	  
-	  FD_SET (scb->fd, &rset);
-	  wset = rset;
-	  eset = rset;
-	  t.tv_sec = 0;
-	  t.tv_usec = 1000000 / POLL_INTERVAL;
-	  
-	  /* POSIX systems return connection success or failure by signalling
-	     wset.  Windows systems return success in wset and failure in
-	     eset.
-
-	     We must call select here, rather than gdb_select, because
-	     the serial structure has not yet been initialized - the
-	     MinGW select wrapper will not know that this FD refers
-	     to a socket.  */
-	  n = select (scb->fd + 1, &rset, &wset, &eset, &t);
-	  polls++;
+	  n = wait_for_connect (scb, &polls);
 	} 
-      while (n == 0 && polls <= TIMEOUT * POLL_INTERVAL);
-      if (n < 0 || polls > TIMEOUT * POLL_INTERVAL)
+      while (n == 0);
+      if (n < 0)
 	{
-	  if (polls > TIMEOUT * POLL_INTERVAL)
-	    errno = ETIMEDOUT;
 	  net_close (scb);
 	  return -1;
 	}
@@ -207,6 +263,18 @@
     res = getsockopt (scb->fd, SOL_SOCKET, SO_ERROR, (void *) &err, &len);
     if (res < 0 || err)
       {
+	/* Maybe the target still isn't ready to accept the connection.  */
+	if (serial_auto_retry
+#ifdef USE_WIN32API
+	    && err == WSAECONNREFUSED
+#else
+	    && err == ECONNREFUSED
+#endif
+	    && wait_for_connect (NULL, &polls) >= 0)
+	  {
+	    close (scb->fd);
+	    goto retry;
+	  }
 	if (err)
 	  errno = err;
 	net_close (scb);
Index: gdb/serial.c
===================================================================
RCS file: /cvs/src/src/gdb/serial.c,v
retrieving revision 1.33
diff -u -r1.33 serial.c
--- gdb/serial.c	23 Feb 2008 20:04:20 -0000	1.33
+++ gdb/serial.c	26 Dec 2008 19:40:25 -0000
@@ -695,6 +695,9 @@
 static struct cmd_list_element *serial_set_cmdlist;
 static struct cmd_list_element *serial_show_cmdlist;
 
+int serial_auto_retry = 1;
+int serial_retry_limit = 15;
+
 static void
 serial_set_cmd (char *args, int from_tty)
 {
Index: gdb/serial.h
===================================================================
RCS file: /cvs/src/src/gdb/serial.h,v
retrieving revision 1.19
diff -u -r1.19 serial.h
--- gdb/serial.h	1 Jan 2008 22:53:12 -0000	1.19
+++ gdb/serial.h	26 Dec 2008 19:40:25 -0000
@@ -283,4 +283,13 @@
 
 #endif /* USE_WIN32API */
 
+/* Whether to auto-retry refused connections.  */
+
+extern int serial_auto_retry;
+
+/* Timeout period for connections, in seconds.  */
+
+extern int serial_retry_limit;
+
+
 #endif /* SERIAL_H */
Index: gdb/doc/gdb.texinfo
===================================================================
RCS file: /cvs/src/src/gdb/doc/gdb.texinfo,v
retrieving revision 1.541
diff -u -r1.541 gdb.texinfo
--- gdb/doc/gdb.texinfo	16 Dec 2008 06:14:00 -0000	1.541
+++ gdb/doc/gdb.texinfo	26 Dec 2008 19:40:27 -0000
@@ -14206,6 +14206,34 @@
 extended-remote}.  This should be set to a filename valid on the
 target system.  If it is not set, the target will use a default
 filename (e.g.@: the last program run).
+
+@item set remote auto-retry on
+@cindex auto-retry, for remote target
+Enable auto-retry for remote TCP connections.  This is useful if the remote
+debugging agent is launched in parallel with @value{GDBN}; there is a race
+condition because the agent may not become ready to accept the connection
+before @value{GDBN} attempts to connect.  When auto-retry is
+enabled, if the initial attempt to connect fails, @value{GDBN} reattempts
+to establish the connection using the timeout specified by 
+@code{set remote connect-timeout}.
+
+@item set remote auto-retry off
+Do not auto-retry failed TCP connections.
+
+@item show remote auto-retry
+Show the current auto-retry setting.
+
+@item set remote connect-timeout @var{seconds}
+@cindex connection timeout, for remote target
+@cindex timeout, for remote target connection
+Set the timeout for establishing a TCP connection to the remote target to
+@var{seconds}.  The timeout affects both polling to retry failed connections 
+(enabled by @code{set remote auto-retry on}) and waiting for connections
+that are merely slow to complete, and represents an approximate cumulative
+value.
+
+@item show remote connect-timeout
+Show the current connection timeout setting.
 @end table
 
 @cindex remote packets, enabling and disabling

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]