Bug 31010 - Evaluating "mixed-cased" fundamental types causes hangs/OOM due to full symtab expansion
Status: RESOLVED FIXED
Alias: None
Product: gdb
Classification: Unclassified
Component: symtab
Version: 13.1
Importance: P1 normal
Target Milestone: 17.1
Assignee: Tom Tromey
URL:
Keywords:
Depends on:
Blocks: 29366 32733
Reported: 2023-10-30 16:38 UTC by Josh Cottingham
Modified: 2025-03-08 18:18 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed: 2025-02-20 00:00:00


Description Josh Cottingham 2023-10-30 16:38:43 UTC
Attempting to evaluate mixed-case or upper-case C/C++ fundamental types (such as "output LONG" or "output Int") causes GDB to fully expand the whole symbol table. Depending on the size of the binary being debugged, this can cause GDB to hang for several minutes, and it has been observed to cause out-of-memory failures in some circumstances.

Via a git bisect, it was observed that this regression first occurs on the commit which enables the new DWARF indexer: https://sourceware.org/pipermail/gdb-patches/2022-April/187417.html

The easiest way to reproduce this issue is to run GDB on a debug build of GDB:

$ gdb ./gdb/gdb
(gdb) start
Temporary breakpoint 1 at 0x410675: file gdb.c, line 28.
Starting program: /path/to/binutils-gdb/gdb/gdb 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Temporary breakpoint 1, main (argc=1, argv=0x7fffffffd578) at gdb.c:28
28	  memset (&args, 0, sizeof args);
(gdb) output LONG
No symbol "LONG" in current context.

This will take ~20-30 seconds, though as stated above, with larger binaries it has been observed to take much longer or to run out of memory. It is worth noting that the hang does not occur when outputting other non-existent symbols, such as "output NOTREAL".
Comment 1 Tom Tromey 2023-10-31 17:30:28 UTC
cooked_index_functions::expand_symtabs_matching defers to the
index, which uses case-insensitive lookup to handle things like
Ada and Fortran.

This probably needs some refinement.
Maybe a lookup_name ought to carry a language along with it.

Anyway more filtering could be done in the inner loop of that
function.
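
The interaction described in this comment can be illustrated with a small model. This is only a sketch under stated assumptions, not GDB's code: `Entry`, `candidates`, `filtered`, and the toy `INDEX` are hypothetical names invented for illustration. It shows how a case-insensitive index lookup for "INT" picks up C and C++ entries named "int", and how a per-entry language check in the inner loop could discard them.

```python
from dataclasses import dataclass

# Assumption for this sketch: languages whose identifiers fold case.
CASE_INSENSITIVE_LANGS = {"ada", "fortran"}


@dataclass(frozen=True)
class Entry:
    name: str  # symbol name as recorded in the (toy) index
    lang: str  # language of the CU that defines the symbol
    cu: int    # identifier of the owning CU


def candidates(index, lookup):
    """Case-insensitive scan, standing in for the index lookup."""
    return [e for e in index if e.name.lower() == lookup.lower()]


def filtered(index, lookup):
    """Keep a case-only match only when the entry's language folds case."""
    return [
        e for e in candidates(index, lookup)
        if e.name == lookup or e.lang in CASE_INSENSITIVE_LANGS
    ]


def cus_to_expand(entries):
    """CUs that would have to be expanded to examine the given entries."""
    return sorted({e.cu for e in entries})


# Toy index: "int" appears in two C-family CUs and one Fortran CU.
INDEX = [
    Entry("int", "c", 1),
    Entry("int", "c++", 2),
    Entry("long", "c", 3),
    Entry("int", "fortran", 4),
]

if __name__ == "__main__":
    print(cus_to_expand(candidates(INDEX, "INT")))
    print(cus_to_expand(filtered(INDEX, "INT")))
```

In a real program nearly every C or C++ CU refers to "int", so the unfiltered candidate set covers almost all CUs, which is consistent with the hang reported above.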
Comment 2 Tom Tromey 2024-02-15 17:02:57 UTC
Closing this as a dup, since it seems to be basically the
same problem as the other bug, and that bug has a patch.
Worth noting that this problem is reported as being fixed
by that patch as well.

*** This bug has been marked as a duplicate of bug 30520 ***
Comment 3 Josh Cottingham 2025-02-04 15:38:45 UTC
After trying with both GDB 15.1 and GDB 16.1, I am led to believe that this issue was not actually solved by the fix for bug 30520.

When I tested with the patch at https://sourceware.org/pipermail/gdb-patches/2024-January/205924.html the issue did appear to be solved and no longer occurred.

However, since then it looks like it was proposed to move from matching on the full name to matching on the canonical name. By the time the patch was reworked with this change in https://sourceware.org/pipermail/gdb-patches/2024-May/209010.html it appears that this issue was still occurring.
Comment 4 Josh Cottingham 2025-02-05 10:15:25 UTC
For simplicity, I have created a GDB test which you can use for validation if you like:


# Copyright 1998-2025 Free Software Foundation, Inc.

# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.

# This file was written by Joshua Cottingham (josh.cottingham@linaro.org)

# Used to catch regression where GDB would expand CU table when printing
# fundamental types that are in "mixed-case".

standard_testfile break.c break1.c

if {[prepare_for_testing "failed to prepare" $testfile \
	 [list $srcfile $srcfile2] {debug nowarnings}]} {
    return -1
}

set readnow_p [readnow]

# The commands we test here produce many lines of output; disable "press
# <return> to continue" prompts.
gdb_test_no_output "set height 0"

gdb_file_cmd ${binfile}

# Check that no CUs have been read before starting the program
gdb_test "maint print statistics" \
    ".*Number of read CUs: 0.*" \
    "No CU are read at the start"

gdb_test "print INT" "" "printing uppercase fundamental type"

# Now check CUs have still not been read after printing
gdb_test "maint print statistics" \
    ".*Number of read CUs: 0.*" \
    "No CU are read after printing fundamental type"
Comment 5 Tom Tromey 2025-02-20 22:23:58 UTC
Thanks, can confirm.
Comment 6 Tom Tromey 2025-02-21 16:27:13 UTC
I'm going to fix this in the easiest way; but there is
a better way and I'll file a follow-up bug for that.
Comment 7 Tom Tromey 2025-02-22 00:40:12 UTC
I have a patch that I'll send reasonably soon.
It depends on another patch I sent this past week.
Comment 9 Sourceware Commits 2025-03-08 00:15:39 UTC
The master branch has been updated by Tom Tromey <tromey@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=aab2ac34d7f78f0b7a42cef0187dc6e4d7ec4f02

commit aab2ac34d7f78f0b7a42cef0187dc6e4d7ec4f02
Author: Tom Tromey <tom@tromey.com>
Date:   Fri Feb 21 09:18:28 2025 -0700

    Avoid excessive CU expansion on failed matches
    
    PR symtab/31010 points out that something like "ptype INT" will expand
    all CUs in a typical program.  The OP further points out that the
    original patch for PR symtab/30520:
    
        https://sourceware.org/pipermail/gdb-patches/2024-January/205924.html
    
    ... did solve the problem, but the patch changed after (my) review and
    reintroduced the bug.
    
    In cooked_index_functions::expand_symtabs_matching, the final
    component of a split name is compared with the entry's name using the
    usual method of calling get_symbol_name_matcher.
    
    This code iterates over languages and tries to split the original name
    according to each style.  But, the Ada splitter uses the decoded name
    -- "int".  This causes every C or C++ CU to be expanded.
    
    Clearly this is wrong.  And, it seems to me that looping over
    languages and trying to guess the splitting style for the input text
    is probably bad.  However, fixing the problem is not so easy (again
    due to Ada).  I've filed a follow-up bug, PR symtab/32733, for this.
    
    Meanwhile, this patch changes the code to be closer to the
    originally-submitted patch.  This works because the comparison is now
    done between the full name and the "lookup_name_without_params"
    object, which is a less adulterated variant of the original input.
    
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31010
    Tested-By: Simon Marchi <simon.marchi@efficios.com>
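
The failure mode in the commit message above can be modelled in a few lines. This is a deliberately simplified sketch, not the GDB implementation: the splitter functions and both matchers are hypothetical, plain string equality stands in for the language-aware matcher obtained from get_symbol_name_matcher, and lower-casing stands in for Ada name decoding.

```python
def split_c(name):
    # C/C++-style split on "::"; case is preserved.
    return name.split("::")


def split_ada(name):
    # Ada-style split on "."; lower-casing models the decoded Ada name.
    return name.lower().split(".")


# The lookup text is split once per language style, as the commit describes.
SPLITTERS = [split_c, split_ada]


def matches_by_last_component(entry_name, lookup):
    """Buggy model: the final component from any splitter may match."""
    return any(split(lookup)[-1] == entry_name for split in SPLITTERS)


def matches_by_full_name(entry_full_name, lookup):
    """Fixed model: compare the full name with the unmodified input."""
    return entry_full_name == lookup


if __name__ == "__main__":
    # "INT" decodes to "int" under the Ada splitter, so a C entry matches.
    print(matches_by_last_component("int", "INT"))
    # Comparing against the original input rejects the same entry.
    print(matches_by_full_name("int", "INT"))
```

In this model the Ada-style split turns "INT" into "int", so an entry for the C type matches and its CU would be expanded. Comparing against the less adulterated input, as the commit describes for "lookup_name_without_params", avoids that false match.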
Comment 10 Tom Tromey 2025-03-08 00:19:47 UTC
Fixed.