Spent the last week adding preprocessor testcases for every bit of odd behaviour I can dream up. Tidying up the #define directive parser at the moment, removing a malloc performance bottleneck. Zack's just completed a nice tidy-up of the macro expanding code, removing excessive recursive calls. I suspect the current code is now faster than the old cpplib and cccp; certainly there is little reason for it to be slower.
We should be able to scrap support for -traditional (though not -Wtraditional, I expect) since we're now bundling an old preprocessor, tradcpp, just for that job. A token-based preprocessor just proved to be too fundamentally different to K&R for the integration to be sustainable, and it was getting in the way.
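One small illustration of how far apart the two models are, as a sketch rather than anything taken from cpplib itself: a token-based ISO C preprocessor treats a string literal as a single token and never substitutes macro parameters inside it, whereas pre-standard K&R preprocessors commonly did. The macro and function names below are made up for the example.

```c
#include <assert.h>
#include <string.h>

/* In ISO C the "x" in the body is one string-literal token, so the
   parameter x is NOT replaced inside it.  Many K&R-era preprocessors
   worked on text and would have substituted the argument here. */
#define QUOTED(x) "x"

/* The ISO way to turn an argument into a string: the # operator. */
#define STRINGIFY(x) #x

/* Hypothetical helper: checks the ISO behaviour of both macros. */
static void check_iso_behaviour(void)
{
    assert(strcmp(QUOTED(hello), "x") == 0);        /* parameter left alone */
    assert(strcmp(STRINGIFY(hello), "hello") == 0); /* argument stringified */
}
```

Calling check_iso_behaviour() from a program built with an ISO C compiler should pass both assertions; code written for the old behaviour is the sort of thing that wants a separate traditional preprocessor.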
Cpplib is beginning to look quite clean in most places, and should not be too hard to read. Almost at the stage of being a piece of code to be proud of. A notable exception is the lexer, which still needs a lot of cleaning up and work on improving performance. Lexers tend to be ugly by their very nature, though.
Hopefully we can soon start to think about front-end integration and pre-compiled headers, which will be fun to work on, and give us some really nice performance improvements. The C and C++ front ends should be able to all but abandon their existing lexers, save crannies like interpreting numbers and merging adjacent string literals.
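For anyone unsure what "merging adjacent string literals" refers to: in C, string literals that sit next to each other are concatenated after preprocessing, so that job falls to whoever consumes the preprocessor's tokens. A minimal sketch of the language rule only (the names are invented, and this says nothing about how the front ends actually implement it):

```c
#include <assert.h>
#include <string.h>

/* Two adjacent literals become one string: "preprocessor". */
static const char merged[] = "pre" "processor";

/* The same happens when one of the literals comes from a macro. */
#define GREETING "hello, "
static const char greeting[] = GREETING "world";

/* Hypothetical helper: checks the merged results. */
static void check_merging(void)
{
    assert(strcmp(merged, "preprocessor") == 0);
    assert(sizeof merged == 13);   /* 12 characters plus the terminating NUL */
    assert(strcmp(greeting, "hello, world") == 0);
}
```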
In a few days I'm going to be offline for a month or three, so Zack will be working on it alone for a while. I think he's forgotten his Advogato password, though <g>.