This is the mail archive of the
mailing list for the Archer project.
Re: Parser rewritting
- From: Chris Moller <cmoller at redhat dot com>
- To: Tom Tromey <tromey at redhat dot com>
- Cc: Sergio Durigan Junior <sergiodj at redhat dot com>, Project Archer <archer at sourceware dot org>
- Date: Thu, 08 Apr 2010 16:21:44 -0400
- Subject: Re: Parser rewritting
- References: <email@example.com> <4BB54D69.firstname.lastname@example.org> <email@example.com>
On 04/08/10 15:21, Tom Tromey wrote:
Chris> A lot of years ago I wrote a fairly elaborate parser using
Chris> antlr--definitely a cool tool and I recommend you consider it.
One thing to ensure is that the antlr output is GPL-compatible.
If not, we can't use it.
antlr.org says that ANTLR itself is under "The BSD License," which looks
to me like a small subset of GPLv2, but IANAL. I couldn't find anything
about licensing for the generated code.
Chris> Just as an example, I've attached a rudimentary antlr grammar that
Chris> parses a subset of C/C++ decls
We only need expressions.
Chris> Anyway, it's probably worth considering.
While I still think it makes the most sense to mimic g++, I am open to
other solutions that are powerful enough.
Another thing worth considering is bison's GLR mode. This has the
advantage that we wouldn't actually need to rewrite the whole parser, we
could just start by tweaking it.
Using tools that generate code is problematic in GDB, because people
complain about every new dependency. Even requiring bison will probably
generate complaints, because AFAIK some people still do their builds
with byacc. Maybe we could check in the generated code, though.
With one exception, ANTLR, including v3, is packaged under at least Fedora--I don't
know about RHEL. The exception is the v3 C target-language support,
which I had to install separately, but I expect it could be included in
the antlrv3 package.
The generated code is kinda big. The source for the antlr C/C++
expression parser I wrote totals 737 lines, about 500 of which are C
support code--the antlr grammar is only 239 lines. But those 239 lines
get turned into about 8800 lines of combined lexer and parser.