This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: [PATCH] Fix problem handling colon in linespec, PR breakpoints/18303
- From: Doug Evans <dje at google dot com>
- To: Keith Seitz <keiths at redhat dot com>
- Cc: donb at codesourcery dot com, "gdb-patches at sourceware dot org ml" <gdb-patches at sourceware dot org>
- Date: Mon, 11 Jan 2016 22:34:49 +0000
- Subject: Re: [PATCH] Fix problem handling colon in linespec, PR breakpoints/18303
- Authentication-results: sourceware.org; auth=none
Keith Seitz writes:
> Hi,
>
> Thank you for pointing this out and supplying a patch (and *tests*!).
>
> On 01/08/2016 10:44 AM, Don Breazeal wrote:
>
> > During GDB's parsing of the linespec, when the filename was not found
in
> > the program's symbols, GDB tried to determine if the filename string
> > could be some other valid identifier. Eventually it would get to a
point
> > where it concluded that the Windows logical volume, or whatever was to
the
> > left of the colon, was a C++ namespace. When it found only one colon,
it
> > would assert.
>
> I have to admit, when I first read this, my reaction was that this is a
> symbol lookup error. After investigating it a bit, I think it is. [More
> rationale below.]
>
> > GDB's linespec grammar allows for several different linespecs that
contain
> > ':'. The only one that has a number after the colon is
filename:linenum.
>
> True, but for how long? I don't know. The parser actually does/could (or
> for some brief time *did*) support offsets just about anywhere, e.g.,
> foo.c:function:label:3. That support was removed and legacy (buggy)
> behavior of ignoring the offset was maintained in the parser/sal
> converter. There is no reason we couldn't support it, though.
>
> > There is another possible solution. After no symbols are found for the
> > file and we determine that it is a filename:linenum style linespec, we
> > could just consume the filename token and parse the rest of the
> > linespec. That works as well, but there is no advantage to it.
>
> And, of course, I came up with an entirely different solution. :-)
>
> Essentially what is happening is that the linespec parser is calling
> lookup_symbol on the user input (e.g., "spaces: and :colons.cc"). That
> is causing several problems, all which assume the input is well-formed.
> As you've discovered, that is a pretty poor assumption. Malformed (user)
> input should not cause an assertion failure IMO.
>
> > I created two new tests for this. One is just gdb.linespec/ls-errs.exp
> > copied and converted to use C++ instead of C, and to add the Windows
> > logical drive case. The other is an MI test to provide a regression
> > test for the specific case reported in PR 18303.
>
> EXCELLENT! Thank you so much for doing this. These tests were
> outrageously useful while attempting to understand the problem.
>
> With that said, I offer *for discussion* this alternative solution,
> which removes the assumption that input to lookup_symbol is/must be
> well-formed.
>
> [There is one additional related/necessary fixlet snuck in here... The
> C++ name parser always assumed that ':' was followed by another ':'. Bad
> assumption. So it would return an incorrect result in the malformed
case.]
>
> WDYT?
>
> Keith
>
> [apologies on mailer wrapping]
>
> Author: Keith Seitz <keiths@redhat.com>
> Date: Fri Jan 8 14:00:22 2016 -0800
>
> Tolerate malformed input for lookup_symbol-called functions.
>
> lookup_symbol is often called with user input. Consequently, any
> function called from lookup_symbol{,_in_language} should attempt to deal
> with malformed input gracefully. After all, malformed user input is
> not a programming/API error.
>
> This patch does not attempt to find/correct all instances of this.
> It only fixes locations in the code that trigger test suite failures.
>
> gdb/ChangeLog
>
> * cp-namespace.c (cp_lookup_bare_symbol): Do not assert on
> user input.
> (cp_search_static_and_baseclasses): Return null_block_symbol for
> malformed input.
> Remove assertions.
> * cp-support.c (cp_find_first_component_aux): Do not return
> a prefix length for ':' unless the next character is also ':'.
>
> diff --git a/gdb/cp-namespace.c b/gdb/cp-namespace.c
> index 72002d6..aa225fe 100644
> --- a/gdb/cp-namespace.c
> +++ b/gdb/cp-namespace.c
> @@ -166,12 +166,6 @@ cp_lookup_bare_symbol (const struct language_defn
> *langdef,
> {
> struct block_symbol sym;
>
> - /* Note: We can't do a simple assert for ':' not being in NAME because
> - ':' may be in the args of a template spec. This isn't intended to
be
> - a complete test, just cheap and documentary. */
> - if (strchr (name, '<') == NULL && strchr (name, '(') == NULL)
> - gdb_assert (strchr (name, ':') == NULL);
> -
Heya.
The assert is intended to catch (some) violations of this
(from the function comment):
NAME is guaranteed to not have any scope (no "::") in its name, though
if for example NAME is a template spec then "::" may appear in the
argument list.
This is an invariant on what name can (and cannot) be.
IOW, it is wrong to call this function with name == "foo::bar",
handling that is the caller's responsibility.
Thus I think having an assert here is ok and good
(as long as it tests for the correct thing!).
Whether it is ok to call this with name == "c:mumble" is the issue.
[or even "c:::mumble" or whatever else a user could type that
this function's caller isn't expected to handle]
On that I'm kinda ambivalent, but I like having the assert
watch for the stated invariant.
Thoughts?
> sym = lookup_symbol_in_static_block (name, block, domain);
> if (sym.symbol != NULL)
> return sym;
> @@ -246,10 +240,9 @@ cp_search_static_and_baseclasses (const char *name,
> struct block_symbol klass_sym;
> struct type *klass_type;
>
> - /* The test here uses <= instead of < because Fortran also uses this,
> - and the module.exp testcase will pass "modmany::" for NAME here.
*/
> - gdb_assert (prefix_len + 2 <= strlen (name));
> - gdb_assert (name[prefix_len + 1] == ':');
> + /* Check for malformed input. */
> + if (prefix_len + 2 > strlen (name) || name[prefix_len + 1] != ':')
> + return null_block_symbol;
>
> /* Find the name of the class and the name of the method, variable,
> etc. */
>
> diff --git a/gdb/cp-support.c b/gdb/cp-support.c
> index df127c4..a71c6ad 100644
> --- a/gdb/cp-support.c
> +++ b/gdb/cp-support.c
> @@ -1037,8 +1037,13 @@ cp_find_first_component_aux (const char *name,
> int permissive)
> return strlen (name);
> }
> case '\0':
> - case ':':
> return index;
> + case ':':
> + /* ':' marks a component iff the next character is also a ':'.
> + Otherwise it is probably malformed input. */
> + if (name[index + 1] == ':')
> + return index;
> + break;
What if name[index+2] is also ':'? :-)
[btw, I think "a::::b" is illegal (note four colons),
but I don't know that for sure]
> case 'o':
> /* Operator names can screw up the recursion. */
> if (operator_possible
>
--
/dje