This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: (PR11207) Macroprocessor discussion


Okay, another stab at reasoning this through. jistone raised the very good point on IRC that we may be considering a macroprocessor which works on already-tokenized data. This would be very different in some respects from the text-based proposal I'd been considering.

Anyhow, the following goals/issues would be necessary to consider for either approach:
- Source Coordinates - correctly preserved for the sake of error reporting.
- Documentation Generation - macros can be used to generate custom docstrings.
- Correct Handling of Brackets - if the preprocessor syntax uses brackets {} or parens (), these interact correctly with any brackets or parens inside the macro parameters
  - by default, the preprocessor respects bracket nesting in the obvious way (brackets are expected to match)
  - the preprocessor knows about the possibility of brackets inside string literals and doesn't attempt to match them
  - there is some kind of facility (e.g. quoting) for emitting non-matching parens from a macro
- Explicit Macro Invocation - so far we seem to be leaning away from the implicit macro invocation style that m4 (and cpp) use, where any identifier is a possible macro invocation. Instead almost all preprocessor stuff, including macro invocations, would be prefixed with a special character such as '%'.
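
To make the bracket-handling goal concrete, here is a minimal sketch in Python of splitting the arguments of a %foo(...) invocation while respecting nesting and skipping brackets inside string literals. The function name split_macro_args and the choice of double-quoted strings with backslash escapes are assumptions for illustration, not part of any proposal; the quoting facility for unmatched parens is left out.

```python
def split_macro_args(text, start):
    """Parse a parenthesized argument list where text[start] == '('.

    Returns (args, end), with end the index just past the closing ')'.
    Commas split arguments only at the outermost level; brackets inside
    double-quoted string literals are not counted.
    """
    if text[start] != "(":
        raise ValueError("expected '(' at offset %d" % start)
    closers = {")": "(", "}": "{", "]": "["}
    stack = ["("]
    args, current = [], []
    i = start + 1
    while i < len(text):
        ch = text[i]
        if ch == '"':
            # Copy the whole string literal verbatim, honoring backslash escapes.
            j = i + 1
            while j < len(text) and text[j] != '"':
                j += 2 if text[j] == "\\" else 1
            if j >= len(text):
                raise ValueError("unterminated string literal")
            current.append(text[i:j + 1])
            i = j + 1
            continue
        if ch in "({[":
            stack.append(ch)
        elif ch in closers:
            if stack[-1] != closers[ch]:
                raise ValueError("mismatched %r at offset %d" % (ch, i))
            stack.pop()
            if not stack:
                # Closing paren of the invocation itself: finish the last argument.
                args.append("".join(current).strip())
                return args, i + 1
        elif ch == "," and len(stack) == 1:
            # Top-level comma: argument separator.
            args.append("".join(current).strip())
            current = []
            i += 1
            continue
        current.append(ch)
        i += 1
    raise ValueError("unbalanced brackets in macro invocation")
```

For example, calling it on the text %foo(a, bar(b, c), ")") with start pointing at the first paren yields three arguments, the last being the string literal containing a lone close paren.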

These are just some haphazard notes; I'll come back to them and organize more coherently very soon :)

# Token-Based Approach

Design Challenges
- Source Coordinates - almost trivial to solve due to the tokens being tagged appropriately.
- Documentation Generation - EITHER rig the lexer to retain comments and emit lexed output back as text, OR subsume kernel-doc into the systemtap lexer.
- Correct Handling of Brackets - mostly done for us by the lexer. We still have to handle bracket balancing, EITHER counting bracket depth (and introducing a special mechanism to emit unmatched brackets) OR using some distinct bracketing syntax such as %begin ... %end.
- Explicit Macro Invocation - consists mostly of the lexer recognizing an additional macro invocation token of the form %ident.
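
As a rough illustration of why source coordinates come almost for free here, the following Python sketch expands a macro over already-tagged tokens. The Token fields and the expand_macro helper are hypothetical names, and the policy shown (argument tokens keep their call-site coordinates, body tokens keep their definition-site coordinates) is only one possible choice.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Token:
    text: str
    file: str
    line: int
    col: int


def expand_macro(body: List[Token], params: List[str],
                 args: List[List[Token]]) -> List[Token]:
    """Substitute parameter tokens in a macro body with argument token lists.

    No coordinates are rewritten: every token carries the file/line/col
    it was originally lexed at, so later errors point at real source.
    """
    if len(params) != len(args):
        raise ValueError("expected %d arguments, got %d" % (len(params), len(args)))
    binding = dict(zip(params, args))
    out: List[Token] = []
    for tok in body:
        if tok.text in binding:
            out.extend(binding[tok.text])  # tokens from the invocation site
        else:
            out.append(tok)                # token from the macro definition
    return out
```

An error inside a substituted argument would then be reported at the invocation site, while an error in the macro body would be reported at the %define.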

Proposed Syntax
- %define foo(param1, param2, ...) ... %end
- %undef foo
- %foo, %foo(param1, param2, ...)
- /** docstring */
- /*** docstring to attach to previous one */
- %\( , %\) or something for emitting unmatched brackets if necessary

# Text-Based Approach

This would be a macroprocessor with a standalone mode for documentation generation, and an embedded mode to be used as a preprocessing stage before the lexer.

Design Challenges
- Source Coordinates - the macro processor needs to be hooked up directly to the lexer, feeding it a suitable sequence of characters and source coordinate directives.
- Documentation Generation - the macro processor emits text that is consumed by kernel-doc. EITHER the built-in macros need to be defined to magically handle docstrings (as described in a previous email) OR we again use the /*** continuation-docstring notation fche suggested.
- Correct Handling of Brackets - in addition to balancing brackets within a macro invocation, the macroprocessor needs to recognize string constants in order to ignore the brackets within them.
- Explicit Macro Invocation - not too hard or too different from implicit invocation, really.
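
The "characters plus source coordinate directives" idea could look roughly like the Python sketch below, loosely in the spirit of cpp's line markers. The with_directives name and the ("coord", ...) / ("text", ...) item shapes are assumptions made for illustration; the real interface to the lexer is still open.

```python
def with_directives(segments):
    """Turn expanded output into a stream the lexer could consume.

    segments: iterable of (file, line, text), one entry per output line,
    where file/line say where that text originally came from.
    Yields ("coord", file, line) whenever the origin is not simply the
    next line of the same file, followed by ("text", text).
    """
    expected = None
    for file, line, text in segments:
        if (file, line) != expected:
            yield ("coord", file, line)
        yield ("text", text)
        expected = (file, line + 1)
```

Consecutive lines from one file need only a single directive up front; each jump into or out of a macro body costs one more.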

Proposed Syntax
- %define(foo,param1,param2,...)
- %macro foo(param1, param2, ...) { ... }
- %foo, %foo(...)
- /** docstring */
- /*** this continuation-docstring as well if necessary */
- %\( , %\) or such if necessary

# Misc

Still thinking over where these fit in:
- %( ... %? ... %: ... %) conditionals (these need access to systemtap-internal logic to be really convenient -- perhaps the standalone macroprocessor mode ignores them, while the embedded mode has callbacks into systemtap code?)
- command line arguments $1, $2, ... (on IRC it was brought to my attention that these are effectively macro-substituted in the current systemtap) -- again, handle these by giving the macroprocessor some callbacks when in embedded mode?
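
One way the callback idea might work for $1, $2, ... is sketched below in Python. The substitute_args function and its lookup callback are hypothetical; the sketch assumes standalone mode simply passes the text through, and it does not try to skip string literals or comments.

```python
import re
from typing import Callable, Optional

ARG_RE = re.compile(r"\$(\d+)")


def substitute_args(text: str,
                    lookup: Optional[Callable[[int], str]] = None) -> str:
    """Replace $N with the value supplied by the host's callback.

    With no callback (standalone mode, e.g. documentation generation),
    the text is returned unchanged.
    """
    if lookup is None:
        return text
    return ARG_RE.sub(lambda m: lookup(int(m.group(1))), text)
```

In embedded mode the host would supply something like lambda n: argv[n - 1]; conditionals could presumably be handled with a similar hook.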

