This is the mail archive of the
mailing list for the Archer project.
Re: Parser rewritting
- From: Jim Blandy <jimb at red-bean dot com>
- To: Tom Tromey <tromey at redhat dot com>
- Cc: Dodji Seketeli <dodji at redhat dot com>, Chris Moller <cmoller at redhat dot com>, Sergio Durigan Junior <sergiodj at redhat dot com>, Project Archer <archer at sourceware dot org>
- Date: Sat, 10 Apr 2010 15:04:58 -0700
- Subject: Re: Parser rewritting
- References: <email@example.com> <4BB24B69.firstname.lastname@example.org> <email@example.com> <20100404084952.GK20524@redhat.com> <firstname.lastname@example.org>
On Thu, Apr 8, 2010 at 12:28 PM, Tom Tromey <email@example.com> wrote:
> I'm not opposed to this but I don't want to slow down our progress to
> make a library.
For what it's worth, isolating a complex component like this makes it
much easier to write unit tests for it.
As an experiment, I did my recent work on Google Breakpad --- a new
symbol dumper for Linux that converts DWARF debugging info and CFI to
Breakpad's own textual format, corresponding extensions to the parser
for that data, and stack walkers for x86, x86_64, and ARM ---
following a discipline of providing full code coverage and branch
coverage (each branch has to be both taken and not taken) with unit
tests for each separable component. It slowed me down quite a bit ---
I spent more time writing tests than code. But except for cases where
I misunderstood the spec, I have also not had any bugs yet in ~5500
non-comment lines of code. Or, more precisely, I had lots of bugs ---
some days I could have stayed in bed and not lost ground --- but none
of them got committed. This full rewrite of the debugging info
dumper, along with some pretty deep surgery on the stack walker, is
running on our production crash-handling servers
(crash-stats.mozilla.com), and the transition has been painless.
What made this possible, though, was that each piece could be taken in
isolation and driven from the Google C++ Test Framework. It was easy
for me to check the results of the parser directly and in isolation,
rather than the combined results of the command-line interpreter's
dispatching, the parsing, the symbol table lookup (and thus the debug
info readers),
the evaluator, and the printer. The tests were fast to run, so I
would run them at pretty much every point during development where
the code could be expected to behave.
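To make the shape of this concrete, here is a minimal sketch of a parser tested in isolation. Everything in it is hypothetical: `ParseSum` and `RunParserTests` are illustrative names, not code from GDB, Archer, or Breakpad, and the checks use a plain failure counter so the example has no dependencies (it assumes a C++17 compiler). With the Google C++ Test Framework, each check would instead be an `EXPECT_EQ` or `EXPECT_FALSE` inside a `TEST`. The inputs are picked with the intent that each branch in the parser is both taken and not taken.

```cpp
// Hypothetical sketch: a tiny parser for sums of integers, plus tests
// that drive it directly. None of these names come from a real project.
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>

// Parses input such as "1+22+3" and returns the sum, or std::nullopt
// if the input is malformed. It depends only on its string argument,
// so a test needs no command interpreter, symbol table, evaluator,
// or printer to exercise it.
std::optional<long> ParseSum(const std::string& text) {
  std::size_t pos = 0;
  long total = 0;
  while (true) {
    // Every term must begin with at least one digit.
    if (pos >= text.size() ||
        !std::isdigit(static_cast<unsigned char>(text[pos]))) {
      return std::nullopt;
    }
    long term = 0;
    while (pos < text.size() &&
           std::isdigit(static_cast<unsigned char>(text[pos]))) {
      term = term * 10 + (text[pos] - '0');
      ++pos;
    }
    total += term;
    if (pos == text.size()) return total;       // clean end of input
    if (text[pos] != '+') return std::nullopt;  // unexpected character
    ++pos;  // consume '+', then expect another term
  }
}

// Returns the number of failed checks (0 means everything passed).
int RunParserTests() {
  int failures = 0;
  auto check = [&failures](bool ok) { if (!ok) ++failures; };
  check(ParseSum("7") == std::optional<long>(7));        // single term
  check(ParseSum("1+22+3") == std::optional<long>(26));  // several terms
  check(!ParseSum("").has_value());     // empty input
  check(!ParseSum("1+").has_value());   // trailing operator
  check(!ParseSum("+1").has_value());   // leading operator
  check(!ParseSum("1-2").has_value());  // unexpected character
  return failures;
}
```

A one-line `main` that returns `RunParserTests()` is enough to run this after every edit. The point is that a failure here can only be the parser's fault.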
As I say, it wasn't quick. But it also means that my next project can
actually have my full attention, because I'm not spreading that
debugging effort across the next year, based on ill-defined,
occasionally reproducible bug reports.
Anyway, what this message comes down to is, "But, but, unit testing! Wow!"