So today I'm working on token lists. CPP has always been strictly textual, but C the language is defined in terms of tokens. Now CPP is going to work in tokens too.
The basic idea is that we scan one line at a time and convert it to tokens. These are as close as possible to the C front end's concept of tokens. We try to make tokenizing free of context sensitivity, and we can manage that everywhere except on directive lines. Then macro expansion and directive processing happen on the pretokenized line, which produces another line of tokens. Then we convert that back to text and feed it to the compiler. (Longer term, the conversion back to text won't happen either, but one step at a time.)
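To make the round trip concrete, here is a rough sketch of the two ends of that pipeline: turn one line into tokens, then spell the tokens back out as text. This is not the actual CPP code. The names (`struct token`, `tokenize_line`, `spell_tokens`) and the three token kinds are made up for illustration, and the lexing is deliberately crude: every punctuation character is its own token, and string literals, comments, and multi-character operators are ignored. Macro expansion would sit between the two calls.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds; a real preprocessor distinguishes many more. */
enum tok_kind { TOK_IDENT, TOK_NUMBER, TOK_PUNCT };

struct token {
    enum tok_kind kind;
    const char *start;   /* points into the source line, not NUL-terminated */
    size_t len;          /* length of the spelling in bytes */
    int space_before;    /* nonzero if whitespace preceded this token */
};

/* Split one line into tokens, stopping at end of line or after max tokens.
   Returns the number of tokens stored in out. */
size_t tokenize_line(const char *line, struct token *out, size_t max)
{
    const char *p = line;
    size_t n = 0;

    assert(line != NULL && out != NULL);
    while (n < max) {
        int space = 0;
        while (*p == ' ' || *p == '\t') {
            space = 1;
            p++;
        }
        if (*p == '\0' || *p == '\n')
            break;

        const char *start = p;
        enum tok_kind kind;
        if (isalpha((unsigned char)*p) || *p == '_') {
            kind = TOK_IDENT;
            while (isalnum((unsigned char)*p) || *p == '_')
                p++;
        } else if (isdigit((unsigned char)*p)) {
            kind = TOK_NUMBER;
            while (isalnum((unsigned char)*p) || *p == '.')
                p++;
        } else {
            kind = TOK_PUNCT;   /* simplification: one character per token */
            p++;
        }

        out[n].kind = kind;
        out[n].start = start;
        out[n].len = (size_t)(p - start);
        out[n].space_before = space;
        n++;
    }
    return n;
}

/* Spell a token sequence back out as text, collapsing each run of
   whitespace to one space. Stops early if buf (capacity cap) would
   overflow. Returns the number of characters written, excluding the NUL. */
size_t spell_tokens(const struct token *toks, size_t n, char *buf, size_t cap)
{
    size_t used = 0;

    assert(toks != NULL && buf != NULL && cap > 0);
    for (size_t i = 0; i < n; i++) {
        int sep = (i > 0 && toks[i].space_before) ? 1 : 0;
        if (used + (size_t)sep + toks[i].len >= cap)
            break;
        if (sep)
            buf[used++] = ' ';
        memcpy(buf + used, toks[i].start, toks[i].len);
        used += toks[i].len;
    }
    buf[used] = '\0';
    return used;
}
```

Keeping a `space_before` flag on each token, rather than making whitespace a token of its own, is one way to let the text come back out readable without the expander ever having to step over whitespace tokens.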
But for today, all I implemented was the basic data structure and helper functions. The next step will be to wrap the existing lexer in this, which will hopefully happen by Wednesday. Then I will begin bashing on the macro expander.
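For a sense of what "basic data structure and helper functions" might amount to, here is a minimal sketch of a growable token list in C. It is an assumption-laden illustration, not the real thing: `struct tl_token`, `struct token_list`, and the `token_list_*` helpers are invented names, and having each token own a heap copy of its spelling is just the simplest choice for a self-contained example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: each token owns a private copy of its spelling. */
struct tl_token {
    int kind;        /* token kind code; its meaning is up to the caller */
    char *spelling;  /* NUL-terminated, heap-allocated */
};

struct token_list {
    struct tl_token *items;
    size_t count;     /* tokens currently stored */
    size_t capacity;  /* slots allocated in items */
};

/* Put a list into the empty state. */
void token_list_init(struct token_list *list)
{
    assert(list != NULL);
    list->items = NULL;
    list->count = 0;
    list->capacity = 0;
}

/* Append one token, copying len bytes of text as its spelling.
   Returns 0 on success, -1 if memory runs out (list left unchanged). */
int token_list_append(struct token_list *list, int kind,
                      const char *text, size_t len)
{
    assert(list != NULL && text != NULL);

    char *copy = malloc(len + 1);
    if (copy == NULL)
        return -1;
    memcpy(copy, text, len);
    copy[len] = '\0';

    if (list->count == list->capacity) {
        /* Double the array so appends are amortized constant time. */
        size_t new_cap = list->capacity ? list->capacity * 2 : 8;
        struct tl_token *grown = realloc(list->items, new_cap * sizeof *grown);
        if (grown == NULL) {
            free(copy);
            return -1;
        }
        list->items = grown;
        list->capacity = new_cap;
    }

    list->items[list->count].kind = kind;
    list->items[list->count].spelling = copy;
    list->count++;
    return 0;
}

/* Release all storage and return the list to the empty state. */
void token_list_free(struct token_list *list)
{
    assert(list != NULL);
    for (size_t i = 0; i < list->count; i++)
        free(list->items[i].spelling);
    free(list->items);
    token_list_init(list);
}
```

A lexer wrapped around this would call `token_list_append` once per token on a line, and the macro expander would read one list and build another.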
