This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [committed mig] Do not generate code dereferencing type-punned pointers


Hi!

David and Justus, thanks for initially looking into this!

On Mon, 2 Mar 2015 12:51:41 -0500, David Michael <fedora.dm0@gmail.com> wrote:
> On Mon, Mar 2, 2015 at 11:03 AM, Justus Winter
> <4winter@informatik.uni-hamburg.de> wrote:
> > Hi David :)
> >
> > thanks for cleaning up after me ;)  (again and again...)
> >
> > Quoting David Michael (2015-02-18 05:39:46)
> >> On Sun, Feb 15, 2015 at 11:09 AM, Justus Winter
> >> <4winter@informatik.uni-hamburg.de> wrote:
> >> > * utils.c (WriteFieldDeclPrim): Generate a union with an additional
> >> > pointer field for variable-length arrays.
> >>
> >> This makes GDB's awk script go haywire because it doesn't know how to
> >> deal with unions.  The following is a workaround to get it building
> >> again, but I'm not sure of its correctness.  Can someone more
> >> knowledgeable than me check on this?
> >
> > Not that I'm claiming to know why gdb does what it does, but Samuel
> > asked me to look into this.
> 
> Thanks for looking into it.
> 
> > For some reason (proxying?), gdb uses mig to create server-side stubs
> > for the *reply*-half of some protocols.  It then uses an awk-script to
> > insert the following code into the server-stubs:
> >
> > if (In0P->Head.msgh_size == sizeof (Reply)
> >          && ! (In0P->Head.msgh_bits & MACH_MSGH_BITS_COMPLEX)
> >          && ! BAD_TYPECHECK(&In0P->return_codeType, &return_codeCheck)
> >          && In0P->return_code != 0)
> > /* Error return, only the error code argument is passed.  */
> > {
> >   kern_return_t (*sfun)(mach_port_t, kern_return_t, int, data_t, mach_msg_type_number_t, data_t, mach_msg_type_number_t) = S_proc_getprocinfo_reply;
> >   OutP->RetCode = (*(kern_return_t (*)(mach_port_t, kern_return_t))sfun) (In0P->Head.msgh_request_port, In0P->return_code);
> >   return;
> > }
> >
> > This is inserted *before* any type checks.  The awk script has this to
> > say:
> >
> > # The first args type checking statement; we need to insert our chunk of
> > # code that bypasses all the type checks if this is an error return, after
> > # which we're done until we get to the next function.  Handily, the size
> > # of mig's Reply structure is also the size of the alternate Request
> > # structure that we want to check for.
> > [...]
> > # Force the function into a type that only takes the first two args, via
> > # the temp variable SFUN (is there another way to correctly do this cast?).
> > # This is possibly bogus, but easier than supplying bogus values for all
> > # the other args (we can't just pass 0 for them, as they might not be scalar).
> >
> > I don't see why it bothers with SFUN in the first place.  It could just do
> >
> >   OutP->RetCode = (*(kern_return_t (*)(mach_port_t, kern_return_t)) S_proc_getprocinfo_reply) (In0P->Head.msgh_request_port, In0P->return_code);
> 
> That was my first thought as well, but the comment "is there another
> way to correctly do this cast?" made me think I was missing some
> subtlety.  Thomas took care of updating this last time mig was
> modified; maybe he is familiar with its design choices?

Nope, sorry.  :-|

> > right?  If so, the main reason for parsing the struct is gone, and we
> > could simplify the script a lot and thus making it less likely to
> > break in the future.  Getting rid of it entirely would be nice though.
> >
> > A word about the patch.  It seems to cheat by assuming `data_t' as the
> > type as soon at it sees a union.  That shouldn't matter since the type
> > is only used for the stupid cast.  But the better way would be to just
> > get rid of the parsing in the first place, or the whole script when
> > we're at it.
> 
> Yes, I'd be in favor of anything that makes the build process less
> dependent on the exact mig output.  I don't know enough about what gdb
> is doing to remove this script entirely, but I'll see if it still
> works when using a direct cast instead of all the argument parsing.

A good suggestions, thanks.  I now pushed the following two patches to
GDB master:

commit d214e5e79e38b18bc3786b3e8ba0e55fdbba294b
Author: Thomas Schwinge <thomas@codesourcery.com>
Date:   Mon Apr 20 00:22:32 2015 +0200

    [GDB] Hurd: Simplify the reply_mig_hack.awk script.
    
    	gdb/
    	* reply_mig_hack.awk: Don't bother to declare an intermediate
    	function pointer variable.
    
    ... allowing us to simplify the parsing a little bit.  And, instead of
    "warning: initialization from incompatible pointer type", we now get "warning:
    function called through a non-compatible type".  Oh well.
---
 gdb/ChangeLog          |  5 +++++
 gdb/reply_mig_hack.awk | 25 ++++++++++++++++---------
 2 files changed, 21 insertions(+), 9 deletions(-)

diff --git gdb/ChangeLog gdb/ChangeLog
index 69e5c48..093686f 100644
--- gdb/ChangeLog
+++ gdb/ChangeLog
@@ -1,3 +1,8 @@
+2015-04-20  Thomas Schwinge  <thomas@codesourcery.com>
+
+	* reply_mig_hack.awk: Don't bother to declare an intermediate
+	function pointer variable.
+
 2015-04-17  Doug Evans  <dje@google.com>
 
 	* solib-svr4.c (svr4_exec_displacement): Rename outer "displacement"
diff --git gdb/reply_mig_hack.awk gdb/reply_mig_hack.awk
index 2c2a30a..7eab504 100644
--- gdb/reply_mig_hack.awk
+++ gdb/reply_mig_hack.awk
@@ -60,7 +60,7 @@ parse_phase == 3 && /}/ {
   print; next;
 }
 
-parse_phase == 3 {
+parse_phase == 3 && num_args == 0 {
   # The type field for an argument.
   arg_type_code_name[num_args] = $2;
   sub (/;$/, "", arg_type_code_name[num_args]) # Get rid of the semi-colon
@@ -68,11 +68,22 @@ parse_phase == 3 {
   print; next;
 }
 
+parse_phase == 3 && num_args == 1 {
+  # We've got more than one argument (but we don't care what it is).
+  num_args++;
+  print; next;
+}
+
+parse_phase == 3 {
+  # We've know everything we need; now just wait for the end of the Request
+  # struct.
+  print; next;
+}
+
 parse_phase == 4 {
   # The value field for an argument.
   arg_name[num_args] = $2;
   sub (/;$/, "", arg_name[num_args]) # Get rid of the semi-colon
-  arg_type[num_args] = $1;
   num_args++;
   parse_phase = 3;
   print; next;
@@ -109,15 +120,11 @@ parse_phase == 5 && /^#if[ \t]TypeCheck/ {
   print "\t    && In0P->" arg_name[0] " != 0)";
   print "\t  /* Error return, only the error code argument is passed.  */";
   print "\t  {";
-  # Force the function into a type that only takes the first two args, via
-  # the temp variable SFUN (is there another way to correctly do this cast?).
+  # Force the function user_function_name into a type that only takes the first
+  # two arguments.
   # This is possibly bogus, but easier than supplying bogus values for all
   # the other args (we can't just pass 0 for them, as they might not be scalar).
-  printf ("\t    kern_return_t (*sfun)(mach_port_t");
-  for (i = 0; i < num_args; i++)
-    printf (", %s", arg_type[i]);
-  printf (") = %s;\n", user_function_name);
-  print "\t    OutP->RetCode = (*(kern_return_t (*)(mach_port_t, kern_return_t))sfun) (In0P->Head.msgh_request_port, In0P->" arg_name[0] ");";
+  print "\t    OutP->RetCode = (*(kern_return_t (*)(mach_port_t, kern_return_t)) " user_function_name ") (In0P->Head.msgh_request_port, In0P->" arg_name[0] ");";
   print "\t    return;";
   print "\t  }";
   print "";

commit 110f91128cf3e047eb1e04d346c27d71cc33fb9c
Author: Thomas Schwinge <thomas@codesourcery.com>
Date:   Mon Apr 20 00:31:54 2015 +0200

    [GDB] Hurd: Robustify the reply_mig_hack.awk script.
    
    ..., so that it also works with the GNU MIG 1.5 just released.
    
    	gdb/
    	* reply_mig_hack.awk: Robustify parsing.
---
 gdb/ChangeLog          | 2 ++
 gdb/reply_mig_hack.awk | 8 +++++++-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git gdb/ChangeLog gdb/ChangeLog
index 093686f..4bca7d6 100644
--- gdb/ChangeLog
+++ gdb/ChangeLog
@@ -1,5 +1,7 @@
 2015-04-20  Thomas Schwinge  <thomas@codesourcery.com>
 
+	* reply_mig_hack.awk: Robustify parsing.
+
 	* reply_mig_hack.awk: Don't bother to declare an intermediate
 	function pointer variable.
 
diff --git gdb/reply_mig_hack.awk gdb/reply_mig_hack.awk
index 7eab504..b731115 100644
--- gdb/reply_mig_hack.awk
+++ gdb/reply_mig_hack.awk
@@ -49,7 +49,7 @@ parse_phase == 2 {
   print; next;
 }
 
-parse_phase == 3 && /}/ {
+parse_phase == 3 && /} Request/ {
   # The args structure is over.
   if (num_args > 1)
     parse_phase = 5;
@@ -62,6 +62,9 @@ parse_phase == 3 && /}/ {
 
 parse_phase == 3 && num_args == 0 {
   # The type field for an argument.
+  # This won't be accurate in case of unions being used in the Request struct,
+  # but that doesn't matter, as we'll only be looking at arg_type_code_name[0],
+  # which will not be a union type.
   arg_type_code_name[num_args] = $2;
   sub (/;$/, "", arg_type_code_name[num_args]) # Get rid of the semi-colon
   parse_phase = 4;
@@ -82,6 +85,9 @@ parse_phase == 3 {
 
 parse_phase == 4 {
   # The value field for an argument.
+  # This won't be accurate in case of unions being used in the Request struct,
+  # but that doesn't matter, as we'll only be looking at arg_name[0], which
+  # will not be a union type.
   arg_name[num_args] = $2;
   sub (/;$/, "", arg_name[num_args]) # Get rid of the semi-colon
   num_args++;


GrÃÃe,
 Thomas

Attachment: signature.asc
Description: PGP signature


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]