Bug 32939 - C++ canonicalizer doesn't handle "partial" operator names
Status: RESOLVED FIXED
Alias: None
Product: gdb
Classification: Unclassified
Component: c++
Version: HEAD
Importance: P2 normal
Target Milestone: 17.1
Assignee: Tom Tromey
Blocks: 32936 16994 16998
Reported: 2025-05-03 23:13 UTC by Tom Tromey
Modified: 2025-05-23 14:53 UTC (History)

Description Tom Tromey 2025-05-03 23:13:18 UTC
This bug came up while debugging the "search-in-psyms" series
(aka bug#16994 and bug#16998).

gdb.cp/cplusfuncs.exp will produce this symbol:

    <55f>   DW_AT_name        : (indirect string, offset: 0x504): operator new []

However, this name is not in canonical form according to the demangler:

0000000000401540 T foo::operator new[](unsigned long)

That is, g++ emits an extra space in the name.

The problem is that gdb cannot canonicalize an operator name that has no parameter list:

(gdb) maint canonicalize operator new []
No change.
(gdb) maint canonicalize operator new [](int)
canonical = operator new[](int)

Comment 1 Tom Tromey 2025-05-07 01:38:08 UTC
One idea I had was to hack the parser to add a production
for "EOF" here, then have it automatically add "()".
Then, after canonicalization is done, if "()" was added,
it would be removed again.
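
The idea above can be sketched outside the parser as a wrapper that appends "()", canonicalizes, and strips the "()" again. This is only an illustration under stated assumptions: `toy_canonicalize` is a hypothetical stand-in that mimics "needs a parameter list" and knows nothing beyond `new []` and `delete []`, and `canonicalize_allowing_bare_operator` is likewise a made-up name. gdb's actual change was made in the grammar, not as a string wrapper.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a grammar-based canonicalizer.  Like the
// parser described in this bug, it gives up ("No change") unless the
// name ends with a parameter list.  It only knows how to drop the
// space in "new []" and "delete []"; it is NOT gdb's implementation.
std::string
toy_canonicalize (const std::string &name)
{
  if (name.empty () || name.back () != ')')
    return name;

  std::string result = name;
  for (const char *spaced : { "new []", "delete []" })
    {
      std::string s = spaced;
      std::string::size_type pos = result.find (s);
      // The space sits three characters before the end of the pattern.
      if (pos != std::string::npos)
        result.erase (pos + s.size () - 3, 1);
    }
  return result;
}

// Assumed wrapper: supply "()" when an operator name has no parameter
// list, canonicalize, then remove the "()" again if we added it.
std::string
canonicalize_allowing_bare_operator (const std::string &name)
{
  std::string input = name;
  bool added = false;
  if (input.find ("operator") != std::string::npos
      && input.back () != ')')
    {
      input += "()";
      added = true;
    }

  std::string result = toy_canonicalize (input);
  if (added && result.size () >= 2
      && result.compare (result.size () - 2, 2, "()") == 0)
    result.erase (result.size () - 2);
  return result;
}

// Usage example mirroring the session in the description.
void
check_examples ()
{
  assert (toy_canonicalize ("operator new []") == "operator new []");
  assert (canonicalize_allowing_bare_operator ("operator new []")
          == "operator new[]");
}
```

The `added` flag matters: a name that already carries its own "()" must keep it, so only a parameter list supplied by the wrapper is removed.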

Comment 2 Tom Tromey 2025-05-08 20:58:55 UTC
This turned out to be not too difficult to implement,
and my efforts were rewarded by turning one regression
on the search-in-psyms branch into 111 regressions.
It seems there's more to figure out.

Comment 3 Tom Tromey 2025-05-09 01:12:34 UTC
Figured it out.

Comment 5 Sourceware Commits 2025-05-23 14:52:43 UTC
The master branch has been updated by Tom Tromey <tromey@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=8d13d83aba4a103959b127cbb5666e28667ac338

commit 8d13d83aba4a103959b127cbb5666e28667ac338
Author: Tom Tromey <tom@tromey.com>
Date:   Thu May 8 14:04:05 2025 -0600

    Handle an argument-less operator in the C++ name parser
    
    While debugging a new failure in my long-suffering "search-in-psyms"
    series, I found that the C++ name canonicalizer did not handle a case
    like "some_name::operator new []".  This should remove the space,
    resulting in "some_name::operator new[]" -- but does not.
    
    This happens because the parser requires an operator name to be
    followed by argument types.  That is, the current behavior is
    expected given the grammar.
    
    However, it seems to me that we do want to be able to canonicalize a
    name like this.  It will appear in the DWARF as a DW_AT_name, and
    furthermore it could be entered by the user.
    
    This patch fixes this problem by changing the grammar to supply the
    "()" itself, then removing the trailing "()" when changing to string
    form (in the functions that matter).
    
    This isn't ideal -- it might miss a very obscure case involving the
    gdb extension of providing fully-qualified names for function-local
    statics -- but it improves the situation at least.
    
    It's possible a better solution might be to rewrite the name
    canonicalizer.  I was wondering if this could perhaps be done without
    reference to the grammar -- just by examining the tokens.  However,
    that's much more involved.
    
    Let me know what you think.
    
    Regression tested on x86-64 Fedora 40.
    
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32939
    Reviewed-By: Keith Seitz <keiths@redhat.com>
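
The commit message mentions a token-only canonicalizer as a possible alternative. A very rough sketch of that idea follows; it is not what the commit implements. It assumes a single spacing rule (a space only between two adjacent word tokens) and ignores the many cases a real canonicalizer must handle, such as symbolic operator names, template arguments, and pointer declarators. The names `split_tokens` and `respace_tokens` are invented for this illustration.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// True for characters that can appear in an identifier or keyword.
static bool
is_word_char (char c)
{
  return std::isalnum (static_cast<unsigned char> (c)) || c == '_';
}

// Split a name into word tokens and single punctuation characters,
// discarding all whitespace.
std::vector<std::string>
split_tokens (const std::string &name)
{
  std::vector<std::string> tokens;
  std::string::size_type i = 0;
  while (i < name.size ())
    {
      if (std::isspace (static_cast<unsigned char> (name[i])))
        ++i;
      else if (is_word_char (name[i]))
        {
          std::string::size_type start = i;
          while (i < name.size () && is_word_char (name[i]))
            ++i;
          tokens.push_back (name.substr (start, i - start));
        }
      else
        tokens.push_back (std::string (1, name[i++]));
    }
  return tokens;
}

// Re-join the tokens, emitting a space only between two word tokens
// (assumed rule; real canonical forms need more than this).
std::string
respace_tokens (const std::string &name)
{
  std::string result;
  bool prev_word = false;
  for (const std::string &tok : split_tokens (name))
    {
      bool word = is_word_char (tok[0]);
      if (word && prev_word)
        result += ' ';
      result += tok;
      prev_word = word;
    }
  return result;
}

// Usage example: the name from the commit message needs no "()".
void
check_respace ()
{
  assert (respace_tokens ("some_name::operator new []")
          == "some_name::operator new[]");
}
```

Because no grammar is involved, the parameter list is optional here, which is the attraction of the approach; the cost is that every spacing rule has to be encoded by hand.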

Comment 6 Tom Tromey 2025-05-23 14:53:45 UTC
Fixed.