Putting the finishing touches on a macro expander that uses
the new lexer. Like the lexer, it is token-based. The
current lexer and macro expander are both text-based.
Getting this to work has been a very frustrating
experience. Macro expansion is a hairy and convoluted
process, and stringification and token-pasting just add to
the confusion. A dense and strangely-worded C99
specification doesn't help :-)
We just have a single token list, and the lexer lexes
all tokens in the next logical line into this list.
However, a function-like macro invocation can cross
multiple logical source file lines. So that we don't
overwrite the original token list and cause chaos, we append
to it instead in this case. However, this appending could cause
a realloc of the tokens (stored consecutively in memory),
and arguments to macros are stored as lists of pointers to
the original tokens (they needn't be consecutive), so they
need to be fixed up if we realloc. Other things still to
do include fixing bogus line numbers in errors and the
final output, and squeezing tokens back into 16 bytes for
both 32-bit and 64-bit architectures. We need to run it
against a macro abuser like glibc to try and turn up missed
obscure cases.
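A rough sketch in C of the pointer fix-up mentioned above.
All the names here (token, token_list, macro_arg,
append_token) are made up for illustration and are not the
real data structures. It also grows the list with malloc and
memcpy rather than realloc, so that the old base pointer is
still valid while the argument pointers are rebased.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical token; a real one would also carry spelling,
   flags and source position. */
typedef struct token { int type; int value; } token;

/* Tokens stored consecutively in one growable buffer. */
typedef struct token_list {
    token *base;
    size_t used, cap;
} token_list;

/* A macro argument: pointers into the token list, which need
   not be consecutive. */
typedef struct macro_arg {
    token **toks;
    size_t count;
} macro_arg;

/* Append one token. If the buffer has to move, rebase every
   argument pointer onto the new buffer. Returns 0 on success,
   -1 if allocation fails (the list is then left unchanged). */
static int append_token(token_list *list, token tok,
                        macro_arg *args, size_t nargs)
{
    assert(list->used <= list->cap);
    if (list->used == list->cap) {
        size_t new_cap = list->cap ? list->cap * 2 : 4;
        token *new_base = malloc(new_cap * sizeof *new_base);
        if (new_base == NULL)
            return -1;
        if (list->used > 0)
            memcpy(new_base, list->base,
                   list->used * sizeof *new_base);
        /* Rebase while the old buffer is still live, so the
           pointer subtraction is well-defined. */
        for (size_t a = 0; a < nargs; a++)
            for (size_t i = 0; i < args[a].count; i++)
                args[a].toks[i] =
                    new_base + (args[a].toks[i] - list->base);
        free(list->base);
        list->base = new_base;
        list->cap = new_cap;
    }
    list->base[list->used++] = tok;
    return 0;
}
```

An alternative that avoids the fix-up entirely would be for
arguments to hold indices into the token list instead of
pointers, at the cost of an extra addition on each access.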
Ah, almost forgot, the gem of -traditional support. Not
sure what's best there; I think to get everything right
would need a separate pre-pass that does traditional macro
text splicing. However, this would lose line and column
information and just be a maintenance headache. Probably
it's best just to support everything we reasonably can in
the token-based environment, and drop the really weird
stuff like half-strings and macro expansion within strings.