This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] c++/8218: Destructors w/arguments.


On 03/09/2017 01:58 AM, Keith Seitz wrote:

>> Can we trust that FIELD_ARTIFICIAL is always set on the implicit this
>> argument on all debug formats?  I.e., does it work with stabs?
> 
> No, only DWARF, I think. I've tested gdb.cp/*.exp for -gdwarf-2 and
> -gstabs+ for both 64-bit and -m32. In all cases, with or without my v1
> proposed patch, logs show no differences whatsoever, except in the
> -gdwarf-2 cases where the modified templates.exp tests now pass.

That's quite surprising.  There's nothing AFAICS in the stabs
reader setting the artificial flag, so a quick manual test
of "ptype SomeStructWithMethods" against a program built with stabs,
using your patch, "should" incorrectly show the "this" parameter
of all method prototypes.  I tried it myself now, and indeed
GDB still manages to NOT print the "this" parameter.  Using the
gdb.cp/method.cc testcase:

(gdb) ptype A
type = struct A {
    int x;
    int y;
  public:
    int foo(int);
    int bar(int) const;
    int baz(int, char) volatile;
    int qux(int, float) const volatile;
}

So I debugged a little.

Turns out that cp_type_print_method_args is _never_ called for stabs.
But, we ptype still prints the methods just fine.  So... huh?

I stepped through two instances of GDB side by side, one against
the method.cc binary built with stabs, and another with dwarf.

The difference is here, in c-typeprint.c:c_type_print_base :

1243                      demangled_name =
1244                        gdb_demangle (mangled_name,
1245                                      DMGL_ANSI | DMGL_PARAMS);
1246                      if (demangled_name == NULL)
1247                        {
1248                          /* In some cases (for instance with the HP
1249                             demangling), if a function has more than 10
1250                             arguments, the demangling will fail.
1251                             Let's try to reconstruct the function
1252                             signature from the symbol information.  */
1253                          if (!TYPE_FN_FIELD_STUB (f, j))
1254                            {
1255                              int staticp = TYPE_FN_FIELD_STATIC_P (f, j);
1256                              struct type *mtype = TYPE_FN_FIELD_TYPE (f, j);
1257
1258                              cp_type_print_method_args (mtype,
1259                                                         "",
1260                                                         method_name,
1261                                                         staticp,
1262                                                         stream, &local_flags);
1263                            }
1264                          else
1265                            fprintf_filtered (stream,
1266                                              _("<badly mangled name '%s'>"),
1267                                              mangled_name);
1268                        }
1269                      else
1270                        {
1271                          char *p;
1272                          char *demangled_no_class
1273                            = remove_qualifiers (demangled_name);

Note how the "In some cases ..." comment above makes it sounds like that
is a fallback path that we shouldn't really be hitting on modern toolchains.

But, in reality, it's the opposite.  The difference between stabs and dwarf,
is that stabs goes through the (demangled_name != NULL) path while DWARF
goes through the "fallback" (demangled_name == NULL) path.  Because:

Stabs:

 (top-gdb) p mangled_name
 $1 = 0x2fb3940 "_ZN1A3fooEi"
 (top-gdb) p demangled_name
 $2 = 0x2fc80b0 "A::foo(int)"
 (top-gdb) 

DWARF:

 (top-gdb) p mangled_name
 $1 = 0x2fcdfe0 "A::foo(int)"
 (top-gdb) p demangled_name
 $2 = 0x0

Eh, "mangled_name" is fetched from "TYPE_FN_FIELD_PHYSNAME (f, j)"
just above.  And PHYSNAME is the demangled name in DWARF/C++...

I never quite got why that is (why PHYSNAME is the demangled
name in C++), and there are still comments all over the place that
suggest that PHYSNAME is the mangled name.  Plus the variables names
suggest that too.  It's ... confusing.

So just for entertainment, I hacked GDB like this to always force the
cp_type_print_method_args path in stabs too:

diff --git c/gdb/c-typeprint.c w/gdb/c-typeprint.c
index e504618..e750c84 100644
--- c/gdb/c-typeprint.c
+++ w/gdb/c-typeprint.c
@@ -1243,7 +1243,7 @@ c_type_print_base (struct type *type, struct ui_file *stream,
                  demangled_name =
                    gdb_demangle (mangled_name,
                                  DMGL_ANSI | DMGL_PARAMS);
-                 if (demangled_name == NULL)
+                 if (1 || demangled_name == NULL)
                    {
                      /* In some cases (for instance with the HP
                         demangling), if a function has more than 10

And now with stabs, plus your original patch, we get the "expected"
artificial "this" parameters incorrectly printed:

(gdb) ptype A
type = struct A {
    int x;
    int y;
  public:
    int foo(A *, int);
    int bar(const A *, int) const;
    int baz(volatile A *, int, char) volatile;
    int qux(const volatile A *, int, float) const volatile;
}

>> Can you comment on the treatment c_type_print_args gives to
>> artificial arguments that Tom pointed out in the PR?
> 
> As Tom mentions, c_type_print_args will output the types of function
> arguments /unless/ it is being called from the dwarf reader (to
> construct physnames).  This function is called when someone does "ptype
> A::~A". It is /not/ used when one types "ptype A". The dtor in that
> instance is handled by cp_type_print_method_args.
> 
> Detailed explanation of why we don't see the artificial "int" parameter
> in both cases (before my patch):
> 
> Because of how the compiler is outputting dtor debug info, we see
> multiple DIEs describing the destructor.

This got to be the multiple destructors.  (not-in-charge / charge)

See the ABI at <http://mentorembedded.github.io/cxx-abi/abi.html>:

		   ::= D0	# deleting destructor
		   ::= D1	# complete object destructor
		   ::= D2	# base object destructor

Notice, same demangled name:

 $ nm -A testsuite/outputs/gdb.cp/templates/templates | c++filt | grep "T5<int>" | grep "~T5"
 testsuite/outputs/gdb.cp/templates/templates:000000000040184a W T5<int>::~T5()
 testsuite/outputs/gdb.cp/templates/templates:000000000040184a W T5<int>::~T5()

But different mangled names:

 $ nm -A testsuite/outputs/gdb.cp/templates/templates | grep 000000000040184a
 testsuite/outputs/gdb.cp/templates/templates:000000000040184a W _ZN2T5IiED1Ev
 testsuite/outputs/gdb.cp/templates/templates:000000000040184a W _ZN2T5IiED2Ev

In this case, they're aliased, but they'll have different addresses when
virtual inheritance is involved.  Vis:

$ cat virtual.cc
struct A { ~A() {} };
struct VA { virtual ~VA() {} };

struct VA2 : public virtual VA { virtual ~VA2() {} };
struct VA3 : public virtual VA { virtual ~VA3() {} };
struct VA2_3 : public VA2, public VA3 { virtual ~VA2_3() {} };

A a;
VA va;
VA2 va2;
VA3 va3;
VA2_3 va2_3;

int main () {}

$ g++ virtual.cc -o virtual -g3 -O0

$ nm -A virtual | c++filt | grep "~" | grep -v thunk
virtual:000000000040081a W A::~A()
virtual:000000000040081a W A::~A()
virtual:0000000000400856 W VA::~VA()
virtual:0000000000400826 W VA::~VA()
virtual:0000000000400826 W VA::~VA()
virtual:000000000040094c W VA2::~VA2()
virtual:00000000004008ea W VA2::~VA2()
virtual:000000000040087c W VA2::~VA2()
virtual:0000000000400a4c W VA3::~VA3()
virtual:00000000004009ea W VA3::~VA3()
virtual:000000000040097c W VA3::~VA3()
virtual:0000000000400b22 W VA2_3::~VA2_3()
virtual:0000000000400a7c W VA2_3::~VA2_3()

$ nm -A virtual | grep "_ZN.*D[0-2]"
virtual:000000000040081a W _ZN1AD1Ev
virtual:000000000040081a W _ZN1AD2Ev
virtual:0000000000400856 W _ZN2VAD0Ev
virtual:0000000000400826 W _ZN2VAD1Ev
virtual:0000000000400826 W _ZN2VAD2Ev
virtual:000000000040094c W _ZN3VA2D0Ev
virtual:00000000004008ea W _ZN3VA2D1Ev
virtual:000000000040087c W _ZN3VA2D2Ev
virtual:0000000000400a4c W _ZN3VA3D0Ev
virtual:00000000004009ea W _ZN3VA3D1Ev
virtual:000000000040097c W _ZN3VA3D2Ev
virtual:0000000000400b22 W _ZN5VA2_3D0Ev
virtual:0000000000400a7c W _ZN5VA2_3D1Ev

> When one types "ptype A", this uses one of the DIEs. This particular one
> mentions the two artificial parameters. That's why we see (before this
> patch) the "int" parameter.
> 
> When one types "ptype A::~A", this uses a different DIE, which has valid
> PC bounds, but only mentions the first formal parameter. This is why
> when we see only the "const A *" in the description of the parameters.

Yeah.  If there are multiple versions of the "same" symbol, guess ptype
just runs into the first.

(gdb) ptype A::~A
type = void (A * const)
(gdb) ptype VA2_3::~VA2_3
type = void (VA2_3 * const)
(gdb) ptype A::~A
type = void (A * const)
(gdb) ptype VA::~VA
type = void (VA * const)
(gdb) ptype VA2::~VA2
type = void (VA2 * const, const void ** const)
(gdb) ptype VA3::~VA3
type = void (VA3 * const, const void ** const)
(gdb) ptype VA2_3::~VA2_3
type = void (VA2_3 * const)

> 
> I've no explanation as to why we don't hide these implementation details
> other than to offer a truly pedantic view of the code.

I can see value in being able to refer to to the artificial arguments
and distinguish the different overloads.  E.g., if one is trying to debug
one of the multiple ctor/dtors:

(gdb) b VA3::~VA3
Breakpoint 2 at 0x40098c: VA3::~VA3. (3 locations)
(gdb) info breakpoints 
Num     Type           Disp Enb Address            What
2       breakpoint     keep y   <MULTIPLE>         
2.1                         y     0x000000000040098c in VA3::~VA3() at virtual.cc:5
2.2                         y     0x00000000004009f6 in VA3::~VA3() at virtual.cc:5
2.3                         y     0x0000000000400a58 in VA3::~VA3() at virtual.cc:5
(gdb) r
Starting program: /home/pedro/tmp/virtual 

Breakpoint 2, VA3::~VA3 (this=0x602128 <va2_3+8>, __vtt_parm=0x400de8 <VTT for VA2_3+24>, __in_chrg=<optimized out>) at virtual.cc:5
5       struct VA3 : public virtual VA { virtual ~VA3() {} };
(gdb) c
Continuing.

Breakpoint 2, VA3::~VA3 (this=0x602118 <va3>, __in_chrg=<optimized out>, __vtt_parm=<optimized out>) at virtual.cc:5
5       struct VA3 : public virtual VA { virtual ~VA3() {} };
(gdb) c
Continuing.
[Inferior 1 (process 30836) exited normally]

But that's more of a run-time engineer thing.  And such users can
always use the mangled names directly, for example:

 (gdb) ptype _ZN1AD0Ev
 type = void (A * const)

A regular user wouldn't ever want to see these artificial things,
I suppose.

Maybe it'd be preferable to print the type of non-static methods
In a more C++-ish way, even, like, given:

 struct A { void foo(); };

instead of the current:

 (gdb) ptype A::foo
 type = void (A * const)

print:

 (gdb) ptype A::foo
 type = void (A::)();

Likewise for method pointers, while at it.  I.e., print this:

 (gdb) ptype &A::foo
 type = void (A::*)()

instead of this:

 (gdb) ptype &A::foo
 type = void (A::*)(A * const)

which just looks incorrect, since it no longer has the excuse
of looking like a function pointer "as seen" through a C lens.

> 
> As best I can describe it, then, "ptype A::~A" (aka c_type_print_args)
> attempts to be as tools-developer-oriented as possible while "ptype
> CLASS" attempts to be more user-centric, hiding internal details like
> artificial constructs.

Yeah...  I suspect it's more an "historic accident" than by design
though.

> 
> Did any of that actually answer your question?

It sure helped, but I for me it was important to understand
why _didn't_ stabs get broken.

This new version is OK.

Thanks,
Pedro Alves


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]