This is the mail archive of the gdb@sources.redhat.com mailing list for the GDB project.



Re: GDB/MI Output Syntax


Bob Rossi <bob@brasko.net> wrote:
> so far, it seems to parse everything I throw at it. However, I haven't
> tested it too much because I am building an intermediate representation.
> This is what I'll use from the front end.

How can we hook this up with the gdb test suite?

I've got a corpus of gdb.log files.  Someone could write a Perl
script to pick out pieces and invoke your parser as an external program.
It might help to add a few more rules at the top:

  session                 -> input_output_pair_list
  input_output_pair_list  -> epsilon | input_output_pair_list input output
  input                   -> ...

The sticky part is that dejagnu mixes its own output into this.
Ick.

Getting into the grammar itself:

Comma separators and lists are kludgy.  In these rules:

  result_record      -> opt_token "^" result_class result_list_prime
  result_list_prime  -> result_list | epsilon
  result_list        -> result_list "," result | "," result

The actual gdb output for a result_record could be either:

  105^done
  103^done,BreakPointTable={...}

It looks a little weird to me to parse the first comma as part
of result_list_prime.  How about:

  result_record  -> opt_token "^" result_class
  result_record  -> opt_token "^" result_class "," result_list
  result_list    -> result | result_list "," result

That simplifies tuple and list as well:

  tuple  -> "{}" | "{" result_list "}"
  list   -> "[]" | "[" value_list "]" | "[" result_list "]"

It also simplifies the rules themselves, because they won't need any
special code to construct a list for: "[" result result_list "]" .
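
To make the proposed rules concrete, here is a minimal recursive-descent
sketch in Python.  It is not the parser under discussion, only an
illustration.  The token regex, the Parser class, and the choice to return
tuples as dicts are all assumptions of mine; c-strings are kept verbatim
with their quotes, and input the regex does not recognize is silently
skipped.

```python
import re

# Hypothetical tokenizer: punctuation, a C-style string, or a bare word.
TOKEN = re.compile(r'[\^,={}\[\]]|"(?:\\.|[^"\\])*"|[A-Za-z0-9_-]+')


class Parser:
    def __init__(self, text):
        self.toks = TOKEN.findall(text)
        self.pos = 0

    def peek(self):
        return self.toks[self.pos] if self.pos < len(self.toks) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.pos += 1
        return tok

    def result_record(self):
        # result_record -> opt_token "^" result_class
        # result_record -> opt_token "^" result_class "," result_list
        token = self.take() if self.peek() != "^" else None
        self.take("^")
        result_class = self.take()
        results = []
        if self.peek() == ",":
            self.take(",")  # the first comma belongs to result_record
            results = self.result_list()
        return token, result_class, results

    def result_list(self):
        # result_list -> result | result_list "," result
        items = [self.result()]
        while self.peek() == ",":
            self.take(",")
            items.append(self.result())
        return items

    def result(self):
        # result -> variable "=" value
        name = self.take()
        self.take("=")
        return name, self.value()

    def value(self):
        tok = self.peek()
        if tok == "{":
            return self.tuple_()
        if tok == "[":
            return self.list_()
        return self.take()  # c-string, returned with its quotes

    def value_list(self):
        # value_list -> value | value_list "," value
        items = [self.value()]
        while self.peek() == ",":
            self.take(",")
            items.append(self.value())
        return items

    def tuple_(self):
        # tuple -> "{}" | "{" result_list "}"
        self.take("{")
        items = [] if self.peek() == "}" else self.result_list()
        self.take("}")
        return dict(items)  # assumes variable names are unique in a tuple

    def list_(self):
        # list -> "[]" | "[" value_list "]" | "[" result_list "]"
        self.take("[")
        if self.peek() == "]":
            items = []
        elif self.pos + 1 < len(self.toks) and self.toks[self.pos + 1] == "=":
            items = self.result_list()  # one token of lookahead past the name
        else:
            items = self.value_list()
        self.take("]")
        return items
```

Parser("105^done").result_record() takes the first production and
Parser('103^done,a="1"').result_record() takes the second.  The same
result_list method serves result_record, tuple, and list, so there is no
special list-building code anywhere.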

This also gets rid of the foo_prime constructions, which can cause
trouble.  The oob_record_list_prime rule caused the original
shift/reduce conflict, because the parser had to decide whether to
reduce an epsilon to oob_record_list_prime or to keep shifting and
reduce later to the non-epsilon form of oob_record_list.

Style point: there are a lot of rules like:

  foo_list -> foo_list foo | epsilon
  bar_list -> bar_list bar | bar

I think this is more readable:

  foo_list -> epsilon | foo_list foo
  bar_list -> bar | bar_list bar

Another nit: how is the grammar even working with:

  nl -> CR | CR_LF

Doesn't this have to be:

  nl -> LF | CR | CR LF

Or is the lexer quietly defining CR_LF to include "\n"?

For coding purposes it would be more efficient to make NL
a single token and have the lexer recognize all three forms.

For documentation purposes it might be better to explicitly make nl
a non-terminal and show the LF, CR, CR LF terminals.

Either way is okay, but I'd like to have one or the other:
either have the lexer do all the work, or have the lexer be
stupid simple and have the grammar do the work.
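
As an illustration of the first option, a lexer that does all the work,
here is a small Python sketch.  The names NL, LINE, and tokenize_lines are
hypothetical and not taken from the parser under discussion; the only
point is that one token can cover LF, CR, and CR LF.

```python
import re

# One NL token covers all three line endings.  "\r\n" is listed first so
# that a CR LF pair is matched as one newline, not as CR followed by LF.
NL = re.compile(r"\r\n|\r|\n")


def tokenize_lines(stream):
    """Split raw output into ("LINE", text) and ("NL", ending) pairs."""
    tokens = []
    pos = 0
    for m in NL.finditer(stream):
        if m.start() > pos:
            tokens.append(("LINE", stream[pos:m.start()]))
        tokens.append(("NL", m.group()))
        pos = m.end()
    if pos < len(stream):
        # Trailing text with no line ending after it.
        tokens.append(("LINE", stream[pos:]))
    return tokens
```

With this in place the grammar only ever sees NL and never has to
mention CR or LF.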

Michael

