Advogato: Blog for akihabara

Putting the finishing touches on a macro expander that uses the new lexer. Like the lexer, it is token-based. The current lexer and macro expander are both text-based.

Getting this to work has been a very frustrating experience. Macro expansion is a hairy and convoluted process, and stringification and token-pasting just add to the confusion. A dense and strangely-worded C99 specification doesn't help :-)
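To give a flavour of why stringification and token-pasting muddy the waters, here is a small standard C illustration. The macro names (STR, XSTR, PASTE, VERSION) are made up for this example and have nothing to do with the expander's own code. The point is that operands of # and ## are not macro-expanded before the operator is applied, so an extra level of macro is needed to get the expanded value.

```c
#include <assert.h>
#include <string.h>

/* Illustrative macros only; none of these names come from the expander. */
#define STR(x) #x            /* stringify the argument exactly as spelled */
#define XSTR(x) STR(x)       /* expand the argument first, then stringify */
#define PASTE(a, b) a##b     /* glue two tokens into a single token */
#define VERSION 99

/* The operand of # is not macro-expanded, so this yields "VERSION". */
static const char *stringify_direct(void) { return STR(VERSION); }

/* The argument is fully expanded before being handed to STR: "99". */
static const char *stringify_expanded(void) { return XSTR(VERSION); }

/* PASTE(co, unt) forms the single identifier `count`. */
static int paste_identifier(void)
{
    int PASTE(co, unt) = 3;
    return count;
}

/* Small self-check tying the three cases together. */
static void check_examples(void)
{
    assert(strcmp(stringify_direct(), "VERSION") == 0);
    assert(strcmp(stringify_expanded(), "99") == 0);
    assert(paste_identifier() == 3);
}
```

An expander has to remember, per parameter, whether it appears next to # or ##, because that decides whether the argument is used in its expanded or unexpanded form.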

We have just a single token list, and the lexer lexes all the tokens of the next logical line into it. However, a function-like macro invocation can span multiple logical source lines. In that case, rather than overwriting the original token list and causing chaos, we append to it. Appending can trigger a realloc of the tokens (which are stored consecutively in memory), and macro arguments are stored as lists of pointers to the original tokens (which needn't be consecutive), so those pointers need to be fixed up whenever the realloc moves the tokens. Other things still to do include fixing bogus line numbers in errors and in the final output, and squeezing tokens back into 16 bytes on both 32-bit and 64-bit architectures. We also need to run it against a macro abuser like glibc to try to turn up missed obscure cases.
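The pointer fix-up after a realloc could look roughly like the sketch below. Everything in it is an assumption made for illustration: the struct layouts and the names token_list, macro_arg, grow_tokens and append_token are hypothetical, not the real data structures. It records each argument pointer as an index before calling realloc and rebuilds the pointers afterwards, which avoids doing arithmetic on the old block once it may have been freed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical token; the real layout is not described here. */
struct token { int type; int value; };

/* Growable list of tokens stored consecutively in memory. */
struct token_list { struct token *tokens; size_t used, capacity; };

/* A macro argument: pointers into the list, not necessarily consecutive. */
struct macro_arg { struct token **toks; size_t count; };

/* Ensure room for `needed` tokens, rebasing every argument pointer if the
   block moves. Returns 0 on success, -1 on allocation failure (in which
   case the list and all pointers are left untouched). */
static int grow_tokens(struct token_list *list, size_t needed,
                       struct macro_arg *args, size_t nargs)
{
    size_t i, j, k = 0, total = 0, new_cap;
    size_t *idx;
    struct token *moved;

    assert(list->used <= list->capacity);
    if (needed <= list->capacity)
        return 0;

    new_cap = list->capacity ? list->capacity : 8;
    while (new_cap < needed)
        new_cap *= 2;

    /* Save each argument pointer as an index while the old block is valid. */
    for (i = 0; i < nargs; i++)
        total += args[i].count;
    idx = malloc((total ? total : 1) * sizeof *idx);
    if (idx == NULL)
        return -1;
    for (i = 0; i < nargs; i++)
        for (j = 0; j < args[i].count; j++)
            idx[k++] = (size_t)(args[i].toks[j] - list->tokens);

    moved = realloc(list->tokens, new_cap * sizeof *moved);
    if (moved == NULL) {
        free(idx);
        return -1;
    }

    /* Rebuild the pointers relative to the (possibly moved) block. */
    k = 0;
    for (i = 0; i < nargs; i++)
        for (j = 0; j < args[i].count; j++)
            args[i].toks[j] = moved + idx[k++];
    free(idx);

    list->tokens = moved;
    list->capacity = new_cap;
    return 0;
}

/* Append one token, growing the list and fixing up arguments as needed. */
static int append_token(struct token_list *list, struct token tok,
                        struct macro_arg *args, size_t nargs)
{
    if (grow_tokens(list, list->used + 1, args, nargs) != 0)
        return -1;
    list->tokens[list->used++] = tok;
    return 0;
}
```

Storing arguments as indices in the first place would make the fix-up unnecessary, at the cost of an extra addition on every access; the sketch keeps the pointers, since that is how the arguments are stored.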

Ah, almost forgot, the gem of -traditional support. Not sure what's best there; I think getting everything right would need a separate pre-pass that does traditional macro text splicing. However, this would lose line and column information and would just be a maintenance headache. Probably it's best just to support everything we reasonably can in the token-based environment, and drop the really weird stuff like half-strings and macro expansion within strings.

16 Jun 2000 akihabara » (Master)