Adding a Source Language to GDB

This page is a high-level guide to adding support for a new source language to GDB. This is not too difficult, and the work can be done in pieces, adding functionality gradually.

Language definition

The first step is to add an entry for the new language to enum language in gdb/defs.h.

Next, make a new instance of struct language_defn (see gdb/language.h). The best approach is to create a new "lang.c" file named after your language (e.g., ada-lang.c, c-lang.c) and to start the new language definition as a copy of the C definition, replacing only the first three elements (la_name, la_natural_name, and la_language). You can then refine the definition as you write more components.

Add an initialization function to your "lang.c" file to register the new language definition with add_language.
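
As a rough sketch of the copy-then-override approach and the registration step, the following self-contained C models a tiny language registry. The struct is a cut-down stand-in, not GDB's real struct language_defn, and this add_language is only modelled on the function named above (the real signature may differ). The names language_mylang, mylang_language_defn, initialize_mylang_language, and find_language are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Cut-down stand-in for GDB's enum language; language_mylang is hypothetical.  */
enum language { language_unknown, language_c, language_mylang };

/* Cut-down stand-in for struct language_defn: the first three elements
   plus one scalar field, included to show what the copy inherits.  */
struct language_defn
{
  const char *la_name;
  const char *la_natural_name;
  enum language la_language;
  int c_style_arrays;
};

const struct language_defn c_language_defn = { "c", "C", language_c, 1 };

#define MAX_LANGUAGES 8
const struct language_defn *languages[MAX_LANGUAGES];
size_t n_languages;

/* Model of add_language: append LANG to the registry.
   Returns 0 on success, -1 if the registry is full.  */
int add_language (const struct language_defn *lang)
{
  if (n_languages == MAX_LANGUAGES)
    return -1;
  languages[n_languages++] = lang;
  return 0;
}

struct language_defn mylang_language_defn;

/* Initialization function: start as a copy of the C definition, replace
   the first three elements, then register the result.  */
void initialize_mylang_language (void)
{
  mylang_language_defn = c_language_defn;
  mylang_language_defn.la_name = "mylang";
  mylang_language_defn.la_natural_name = "MyLang";
  mylang_language_defn.la_language = language_mylang;
  add_language (&mylang_language_defn);
}

/* Hypothetical helper: what "set language NAME" would need to look up.  */
const struct language_defn *find_language (const char *name)
{
  size_t i;

  for (i = 0; i < n_languages; i++)
    if (strcmp (languages[i]->la_name, name) == 0)
      return languages[i];
  return NULL;
}
```

Everything not overridden (here, c_style_arrays) keeps the C value until you refine it.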

Edit init_filename_language_table in gdb/symfile.c to add any filename extensions that should be associated with your new language.
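
The idea behind the extension table can be sketched as follows. This is a simplified, self-contained model, not the code in gdb/symfile.c: the table layout, the signatures of add_filename_language and deduce_language_from_filename, and the ".myl" extension are all assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Cut-down stand-in for GDB's enum language; language_mylang is hypothetical.  */
enum language { language_unknown, language_c, language_mylang };

struct filename_language
{
  const char *ext;      /* Extension, including the leading dot.  */
  enum language lang;
};

#define MAX_EXTENSIONS 16
struct filename_language filename_language_table[MAX_EXTENSIONS];
size_t n_filename_languages;

/* Associate EXT with LANG.  Returns 0 on success, -1 if the table is full.  */
int add_filename_language (const char *ext, enum language lang)
{
  if (n_filename_languages == MAX_EXTENSIONS)
    return -1;
  filename_language_table[n_filename_languages].ext = ext;
  filename_language_table[n_filename_languages].lang = lang;
  n_filename_languages++;
  return 0;
}

void init_filename_language_table (void)
{
  n_filename_languages = 0;
  add_filename_language (".c", language_c);
  add_filename_language (".myl", language_mylang);  /* Hypothetical extension.  */
}

/* Pick a language from the text after the last dot in FILENAME.  */
enum language deduce_language_from_filename (const char *filename)
{
  const char *dot = strrchr (filename, '.');
  size_t i;

  if (dot == NULL)
    return language_unknown;
  for (i = 0; i < n_filename_languages; i++)
    if (strcmp (dot, filename_language_table[i].ext) == 0)
      return filename_language_table[i].lang;
  return language_unknown;
}
```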

At this point, you should be able to start gdb and use set language to change to your new language.

Update the DWARF reader

Because most GDB targets use DWARF, this task should be considered an early must-do. Change dwarf2read.c:set_cu_language to translate the DWARF code for your language to the enum value you added in the previous step. You may need to edit include/dwarf2.h (which is canonically maintained in GCC) to add the new value. If your language doesn't have a language code yet, you can add DWARF producer sniffing in read_file_scope.
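
A minimal sketch of both ideas, the language-code translation and the producer fallback, might look like this. It is not the code in dwarf2read.c. The three standard DW_LANG values are the ones assigned by the DWARF specification, but DW_LANG_Mylang (a placeholder in the user-defined range), the "mylangc " producer prefix, the language_minimal fallback, and all function names are assumptions.

```c
#include <stddef.h>
#include <string.h>

/* Cut-down stand-in for GDB's enum language; language_mylang is hypothetical.  */
enum language
{
  language_unknown, language_minimal, language_c, language_cplus,
  language_mylang
};

/* Standard DWARF language codes.  */
#define DW_LANG_C89 0x0001
#define DW_LANG_C 0x0002
#define DW_LANG_C_plus_plus 0x0004
/* Hypothetical placeholder in the user-defined range (0x8000 to 0xffff).  */
#define DW_LANG_Mylang 0x9001

/* Translate a DW_AT_language value to the GDB enum value.  */
enum language cu_language_from_dwarf (unsigned int dw_lang)
{
  switch (dw_lang)
    {
    case DW_LANG_C89:
    case DW_LANG_C:
      return language_c;
    case DW_LANG_C_plus_plus:
      return language_cplus;
    case DW_LANG_Mylang:
      return language_mylang;
    default:
      return language_minimal;  /* Fallback chosen for this sketch.  */
    }
}

/* Producer sniffing: recognize the compiler from the DW_AT_producer
   string when no language code has been assigned yet.  */
int producer_is_mylang (const char *producer)
{
  static const char prefix[] = "mylangc ";

  return (producer != NULL
          && strncmp (producer, prefix, sizeof prefix - 1) == 0);
}

enum language cu_language (unsigned int dw_lang, const char *producer)
{
  if (producer_is_mylang (producer))
    return language_mylang;
  return cu_language_from_dwarf (dw_lang);
}
```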

You may want to update the DWARF reader some more in a later step.

Next steps

There are many choices of what to do next. Many of them can be done in any order. This guide presents one possible sequence, leaving the most difficult tasks for last. Most of the remaining tasks involve implementing one or more methods from struct language_defn.

There are also some tasks that you may or may not have to do, depending on your language. These are covered in the very last section.

Correct scalar fields

struct language_defn has several scalar fields -- as opposed to function pointers, or pointers to other tables. Go through each of these and make sure that the value in your new definition is correct for the language you are implementing.

Add a character printer

Implement the la_printchar and la_emitchar methods.
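
To give a feel for the split between the two methods, here is a self-contained sketch in which the "emit" routine writes one character in source syntax and the "print" routine wraps it in quotes. The real methods write to a GDB stream and take a type argument; the buffer-based signatures, the mylang_ names, and the particular escape rules below are assumptions for illustration.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Write character C into BUF as it would appear inside a literal
   delimited by QUOTER (0 for no delimiter).  Returns the snprintf
   result.  */
int mylang_emitchar (int c, int quoter, char *buf, size_t size)
{
  c &= 0xff;
  /* The backslash and the active quote character must be escaped.  */
  if (c == '\\' || (quoter != 0 && c == quoter))
    return snprintf (buf, size, "\\%c", c);
  switch (c)
    {
    case '\n':
      return snprintf (buf, size, "\\n");
    case '\t':
      return snprintf (buf, size, "\\t");
    }
  if (isprint (c))
    return snprintf (buf, size, "%c", c);
  /* Anything else is shown as a three-digit octal escape.  */
  return snprintf (buf, size, "\\%03o", (unsigned int) c);
}

/* Write C as a complete character literal, e.g. 'a'.  */
int mylang_printchar (int c, char *buf, size_t size)
{
  char body[8];

  mylang_emitchar (c, '\'', body, sizeof body);
  return snprintf (buf, size, "'%s'", body);
}
```

Passing the quote character to the emit routine lets the same code serve both character and string literals, since a double quote needs no escape inside single quotes and vice versa.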

Add a typedef printer

Implement the la_print_typedef method.

Add a type printer

Implement the la_print_type method. Ideally (for your users), this should be able to display any type that would be used in programs written in the new language. Because programs can be written in multiple languages, and because GDB doesn't record the language of a type, if your printer sees a type it doesn't recognize, it is usually best to delegate it to c_print_type.
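
The delegation pattern can be sketched like this. The struct type, the buffer-based signatures, and the "record" and "Boolean" spellings are invented for the example, and the c_print_type here is a toy stand-in for the real function of that name.

```c
#include <stddef.h>
#include <stdio.h>

/* A few type codes, named after GDB's, in a cut-down enum.  */
enum type_code
{
  TYPE_CODE_INT, TYPE_CODE_BOOL, TYPE_CODE_STRUCT, TYPE_CODE_UNION
};

/* Cut-down stand-in for struct type.  */
struct type
{
  enum type_code code;
  const char *name;
};

/* Toy stand-in for c_print_type.  */
int c_print_type (const struct type *type, char *buf, size_t size)
{
  switch (type->code)
    {
    case TYPE_CODE_STRUCT:
      return snprintf (buf, size, "struct %s", type->name);
    case TYPE_CODE_UNION:
      return snprintf (buf, size, "union %s", type->name);
    default:
      return snprintf (buf, size, "%s", type->name);
    }
}

/* Print the types the new language knows about in its own syntax and
   hand everything else to the C printer.  */
int mylang_print_type (const struct type *type, char *buf, size_t size)
{
  switch (type->code)
    {
    case TYPE_CODE_STRUCT:
      return snprintf (buf, size, "record %s", type->name);
    case TYPE_CODE_BOOL:
      return snprintf (buf, size, "Boolean");
    default:
      return c_print_type (type, buf, size);
    }
}
```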

Add a val printer

Value printing is split into two phases -- value printing, which tries to print a struct value, and "val" printing, which essentially tries to print a value that has been decomposed into its constituent parts. Normally the generic value printer is fine, so you will probably only need to implement a val printer.

Many values can be printed nicely using generic_val_print. It can be customized to some degree using an instance of generic_val_print_decorations. However, the generic printer cannot handle every type; for example, it does not handle TYPE_CODE_STRUCT. Your printer should handle these.
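
The division of labour might be sketched as follows: the language's val printer deals with structures itself and passes scalars to a generic routine that is customized through a small decorations table. All of this is a toy model. The struct val, the emit helper, the buffer-based signatures, and the output syntax are assumptions, and generic_print and struct decorations are only loosely modelled on generic_val_print and generic_val_print_decorations.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

enum type_code { TYPE_CODE_INT, TYPE_CODE_BOOL, TYPE_CODE_STRUCT };

/* Toy decomposed value.  */
struct val
{
  enum type_code code;
  long scalar;                      /* Payload for INT and BOOL.  */
  const char *const *field_names;   /* The rest is for STRUCT.  */
  const struct val *fields;
  size_t n_fields;
};

struct decorations
{
  const char *true_name;
  const char *false_name;
};

/* Append TEXT at offset POS of BUF; text that does not fit is dropped.
   Returns the new offset.  */
size_t emit (char *buf, size_t size, size_t pos, const char *text)
{
  size_t len = strlen (text);

  if (pos + len >= size)
    return pos;
  memcpy (buf + pos, text, len + 1);
  return pos + len;
}

/* Generic printer: scalars only.  */
size_t generic_print (const struct val *v, const struct decorations *d,
                      char *buf, size_t size, size_t pos)
{
  char digits[32];

  if (v->code == TYPE_CODE_BOOL)
    return emit (buf, size, pos, v->scalar ? d->true_name : d->false_name);
  snprintf (digits, sizeof digits, "%ld", v->scalar);
  return emit (buf, size, pos, digits);
}

const struct decorations mylang_decorations = { "True", "False" };

/* Language val printer: structures here, everything else delegated.  */
size_t mylang_val_print (const struct val *v, char *buf, size_t size,
                         size_t pos)
{
  size_t i;

  if (v->code != TYPE_CODE_STRUCT)
    return generic_print (v, &mylang_decorations, buf, size, pos);
  pos = emit (buf, size, pos, "{");
  for (i = 0; i < v->n_fields; i++)
    {
      if (i > 0)
        pos = emit (buf, size, pos, ", ");
      pos = emit (buf, size, pos, v->field_names[i]);
      pos = emit (buf, size, pos, " = ");
      pos = mylang_val_print (&v->fields[i], buf, size, pos);
    }
  return emit (buf, size, pos, "}");
}
```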

There are a number of print options that your printer should handle, in order to integrate nicely into GDB. See struct value_print_options, and the manual, for details.

One question you should consider is which types should have special code in the val printer. One decent approach is to have GDB know how to print values that correspond to types that are specially treated (or known) by the compiler. Then, delegate the printing of other types to Python pretty-printers that are shipped with the standard library.

Implement symbol lookup

The language method la_lookup_symbol_nonlocal is used by GDB when searching for a name. In particular, GDB calls this method after searching the various function-local blocks (and after searching this, if you've defined la_name_of_this), and before searching file-scoped and global blocks. This provides a way for your language to handle more complex name lookup, such as searching any associated namespaces or module imports.
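
The search order described above can be modelled with a handful of name lists. This sketch only illustrates the ordering as stated in the text. The struct scope, the sample names, and the function signatures are invented, and the "import" list stands in for whatever namespaces or module imports your language needs to search.

```c
#include <stddef.h>
#include <string.h>

struct scope
{
  const char *label;
  const char *const *names;
  size_t n_names;
};

int scope_contains (const struct scope *scope, const char *name)
{
  size_t i;

  for (i = 0; i < scope->n_names; i++)
    if (strcmp (scope->names[i], name) == 0)
      return 1;
  return 0;
}

/* Sample data; several names appear in more than one scope.  */
static const char *const local_names[] = { "i", "tmp" };
static const char *const this_names[] = { "count" };
static const char *const import_names[] = { "count", "sqrt", "limit" };
static const char *const file_names[] = { "limit", "cache" };
static const char *const global_names[] = { "sqrt", "main" };

const struct scope local_scope = { "local", local_names, 2 };
const struct scope this_scope = { "this", this_names, 1 };
const struct scope import_scope = { "import", import_names, 3 };
const struct scope file_scope = { "file", file_names, 2 };
const struct scope global_scope = { "global", global_names, 2 };

/* Model of the language hook: search the imported modules.  */
const char *mylang_lookup_symbol_nonlocal (const char *name)
{
  return scope_contains (&import_scope, name) ? import_scope.label : NULL;
}

/* Returns the label of the scope that supplied NAME, or NULL.  */
const char *lookup_symbol (const char *name)
{
  const char *found;

  if (scope_contains (&local_scope, name))
    return local_scope.label;
  if (scope_contains (&this_scope, name))
    return this_scope.label;
  found = mylang_lookup_symbol_nonlocal (name);
  if (found != NULL)
    return found;
  if (scope_contains (&file_scope, name))
    return file_scope.label;
  if (scope_contains (&global_scope, name))
    return global_scope.label;
  return NULL;
}
```

In this model "count" resolves through this before the imports are consulted, while "sqrt" and "limit" resolve through the imports before the file-scoped and global lists are reached.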

Write the documentation

Your language should have a node in the manual, near the other source language nodes. You should also write a NEWS entry.

Write tests

Porting the test suite can be difficult, depending on the specifics of your language. See gdb/testsuite/lib/future.exp for a good spot to add hooks for your language.

It's a good idea to run coverage tests while writing your test suite, to ensure your new code is sufficiently tested.

Add a demangler

If your language mangles symbol names, say to include type information, then you will want to teach GDB how to demangle these names. This has a few steps:

1. The demangler implementation itself should go in libiberty, alongside the other demanglers there. The demangler test suite also lives there.

2. You should update c++filt to recognize the name of the newly-added demangling style as an argument to the --format flag.

3. Update the la_demangle field in your language definition to call the new demangler.

4. Update gdb/symtab.c:symbol_find_demangled_name to handle your language.
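
For orientation, here is what a very small demangler could look like. The mangling scheme is entirely made up for this example (the prefix "_ML" followed by length-prefixed components, joined with dots on output), and mylang_demangle is a hypothetical name. The sketch assumes the common convention that a demangler returns a newly allocated string, or NULL when the name is not mangled in its scheme.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Demangle MANGLED under the made-up scheme described above.  Returns a
   malloc'd string that the caller must free, or NULL.  OPTIONS is
   accepted for shape only and ignored.  */
char *mylang_demangle (const char *mangled, int options)
{
  const char *p;
  char *result, *out;

  (void) options;
  if (strncmp (mangled, "_ML", 3) != 0)
    return NULL;
  p = mangled + 3;

  /* The output is never longer than the input: each component drops at
     least one length digit and gains at most one separator.  */
  result = malloc (strlen (mangled) + 1);
  if (result == NULL)
    return NULL;
  out = result;

  while (*p != '\0')
    {
      size_t len = 0;

      if (!isdigit ((unsigned char) *p))
        goto fail;
      while (isdigit ((unsigned char) *p))
        len = len * 10 + (size_t) (*p++ - '0');
      /* Reject empty components and lengths that run past the end.  */
      if (len == 0 || strlen (p) < len)
        goto fail;
      if (out != result)
        *out++ = '.';
      memcpy (out, p, len);
      out += len;
      p += len;
    }
  if (out == result)
    goto fail;
  *out = '\0';
  return result;

 fail:
  free (result);
  return NULL;
}
```

Returning NULL for names that do not match matters because GDB may try a symbol against more than one demangler.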

Create the expression parser

If you use a yacc-based parser, it should reside in a file named after your language, and ending in "-exp.y". Since we can't depend upon everyone having Bison, and yacc produces parsers that define a bunch of global names, GDB provides a header file, yy-remap.h, which can be used to rename symbols that might possibly conflict.
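
The renaming trick itself is ordinary preprocessor token pasting. The sketch below shows the general technique only and is not the contents of yy-remap.h: REMAP, REMAP_PREFIX, and the alpha_ and beta_ prefixes are hypothetical names. Two stand-in "parsers" are placed in one file so the effect is visible; real generated parsers would live in separate files.

```c
/* Two-level paste so that REMAP_PREFIX is expanded before ## is applied.  */
#define REMAP_PASTE_2(prefix, sym) prefix##sym
#define REMAP_PASTE_1(prefix, sym) REMAP_PASTE_2 (prefix, sym)
#define REMAP(sym) REMAP_PASTE_1 (REMAP_PREFIX, sym)

/* Every global name the generated parser defines gets a line like these.  */
#define yyparse REMAP (yyparse)
#define yylval REMAP (yylval)

/* First stand-in parser: its globals become alpha_yylval, alpha_yyparse.  */
#define REMAP_PREFIX alpha_
int yylval = 10;
int yyparse (void) { return yylval + 1; }
#undef REMAP_PREFIX

/* Second stand-in parser: its globals become beta_yylval, beta_yyparse.  */
#define REMAP_PREFIX beta_
int yylval = 20;
int yyparse (void) { return yylval + 2; }
#undef REMAP_PREFIX

#undef yyparse
#undef yylval
```

Because the prefix is looked up when yyparse is used rather than when it is defined, the same set of #define lines serves every parser.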

Routines for building parsed expressions into a union exp_element list are in parse.c.

Due to the way the GDB CLI works, expression parsers must follow a few rules in addition to those required by the source language:

Add any evaluation routines, if necessary

If you need new opcodes (that represent the operations of the language), add them to std-operator.def. Add support code for these operations in the evaluate_subexp function defined in the file eval.c. Add cases for new opcodes to prefixify_subexp, operator_length_standard, print_subexp_standard, and dump_subexp_body.
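
A .def file of this kind is typically consumed with the "X-macro" pattern: the list is expanded several times with different definitions of OP. The sketch below inlines a three-entry list instead of including a file. BINOP_MYLANG_POW, N_OPCODES, opcode_names, and evaluate_binop are hypothetical, and the evaluator is a toy that works on plain integers, not a model of evaluate_subexp.

```c
#include <stddef.h>

/* Stand-in for the operator list; the last entry is the new,
   hypothetical opcode.  */
#define OPERATOR_LIST \
  OP (BINOP_ADD) \
  OP (BINOP_SUB) \
  OP (BINOP_MYLANG_POW)

/* First expansion: the enum of opcodes.  */
enum exp_opcode
{
#define OP(name) name,
  OPERATOR_LIST
#undef OP
  N_OPCODES
};

/* Second expansion: a parallel table of printable names.  */
const char *const opcode_names[] =
{
#define OP(name) #name,
  OPERATOR_LIST
#undef OP
};

/* Toy evaluator: each new opcode needs a case here, just as it needs
   one in each function listed above.  */
long evaluate_binop (enum exp_opcode op, long lhs, long rhs)
{
  switch (op)
    {
    case BINOP_ADD:
      return lhs + rhs;
    case BINOP_SUB:
      return lhs - rhs;
    case BINOP_MYLANG_POW:
      {
        /* Integer power; a non-positive exponent yields 1 here.  */
        long result = 1;

        while (rhs-- > 0)
          result *= lhs;
        return result;
      }
    default:
      return 0;
    }
}
```

Adding one line to the list keeps the enum and the name table in step automatically; the switch statements still have to be updated by hand.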

You can also make the new operators specific to your language by writing local variants of these functions (that delegate most cases to the standard versions) and by adding a struct exp_descriptor to your language implementation. You can also override standard operators this way -- most commonly by redefining the semantics of a particular operator, but also, if needed, by changing the layout in struct expression.
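
The "local variant that delegates" pattern might be sketched like this. The struct below has only two of the hooks a real struct exp_descriptor would carry, and the opcodes, signatures, and evaluation on plain integers are simplifications. The choice of floor division as the overridden semantics, and every mylang_ name, is hypothetical.

```c
#include <stddef.h>

enum exp_opcode { BINOP_ADD, BINOP_DIV, BINOP_MYLANG_POW };

/* Cut-down stand-in for struct exp_descriptor.  */
struct exp_descriptor
{
  const char *(*op_name) (enum exp_opcode op);
  long (*evaluate_exp) (enum exp_opcode op, long lhs, long rhs);
};

/* "Standard" versions, which know nothing about the new language.  */
const char *op_name_standard (enum exp_opcode op)
{
  switch (op)
    {
    case BINOP_ADD:
      return "BINOP_ADD";
    case BINOP_DIV:
      return "BINOP_DIV";
    default:
      return NULL;
    }
}

/* The caller must ensure RHS is nonzero for BINOP_DIV.  */
long evaluate_standard (enum exp_opcode op, long lhs, long rhs)
{
  switch (op)
    {
    case BINOP_ADD:
      return lhs + rhs;
    case BINOP_DIV:
      return lhs / rhs;   /* C semantics: truncate toward zero.  */
    default:
      return 0;
    }
}

/* Local variants: handle what differs, delegate the rest.  */
const char *mylang_op_name (enum exp_opcode op)
{
  if (op == BINOP_MYLANG_POW)
    return "BINOP_MYLANG_POW";
  return op_name_standard (op);
}

long mylang_evaluate (enum exp_opcode op, long lhs, long rhs)
{
  switch (op)
    {
    case BINOP_DIV:
      {
        /* Overridden semantics: round toward negative infinity.  */
        long q = lhs / rhs;

        if (lhs % rhs != 0 && (lhs < 0) != (rhs < 0))
          q--;
        return q;
      }
    case BINOP_MYLANG_POW:
      {
        long result = 1;

        while (rhs-- > 0)
          result *= lhs;
        return result;
      }
    default:
      return evaluate_standard (op, lhs, rhs);
    }
}

const struct exp_descriptor exp_descriptor_mylang =
{
  mylang_op_name,
  mylang_evaluate
};
```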

"Maybe" tasks

There are some tasks that you may or may not have to do, depending either on how your language works, or how close it is to some language that GDB already supports.

It's not unusual to have to modify the DWARF reader beyond merely adding support for a language tag.

It's possible your language may even require deeper changes to GDB. Whatever those might be, they are outside the scope of this document.

Last edited 2016-05-10 04:07:19 by TomTromey.

All content (C) 2008 Free Software Foundation. For terms of use, redistribution, and modification, please see the WikiLicense page.