<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Advogato blog for akihabara</title>
    <link>http://www.advogato.org/person/akihabara/</link>
    <description>Advogato blog for akihabara</description>
    <language>en-us</language>
    <generator>mod_virgule</generator>
    <pubDate>Fri, 25 Jul 2008 14:57:39 GMT</pubDate>
    <item>
      <pubDate>Wed, 12 Jul 2000 05:59:31 GMT</pubDate>
      <title>12 Jul 2000</title>
      <link>http://www.advogato.org/person/akihabara/diary.html?start=3</link>
      <guid>http://www.advogato.org/person/akihabara/diary.html?start=3</guid>
      <description>Spent the last week adding preprocessor testcases for every 
bit of odd behaviour I can dream up.  Tidying up the 
#define directive parser at the moment, removing a malloc 
performance bottleneck.  Zack's just completed a nice tidy-
up of the macro expanding code, removing excessive 
recursive calls.  I suspect the current code is now faster 
than the old cpplib and cccp; certainly there is little 
reason for it to be slower.

&lt;p&gt; We should be able to scrap support for -traditional 
(though not -Wtraditional, I expect) since we're now bundling an old 
preprocessor, tradcpp, just for that job.  A token-based 
preprocessor just proved to be too fundamentally different 
to K+R for the integration to be sustainable, and it was 
getting in the way.

&lt;p&gt; Cpplib is beginning to look quite clean in most places, 
and should not be too hard to read.  Almost at the stage of 
being a piece of code to be proud of.  A notable exception 
is the lexer, which still needs a lot of cleaning up and 
work on improving performance.  Lexers tend to be ugly by 
their very nature, though.

&lt;p&gt; Hopefully we can soon start to think about front-end 
integration and pre-compiled headers, which will be fun to 
work on, and give us some really nice performance 
improvements.  The C and C++ front ends should be able to 
all but abandon their existing lexers, save crannies like 
interpreting numbers and merging adjacent string literals.

&lt;p&gt; In a few days I'm going to be offline for a month or 
three, so Zack will be working on it alone for a while.  I 
think he's forgotten his Advogato password, though 
&amp;lt;g&amp;gt;.</description>
    </item>
    <item>
      <pubDate>Tue, 4 Jul 2000 07:14:09 GMT</pubDate>
      <title>4 Jul 2000</title>
      <link>http://www.advogato.org/person/akihabara/diary.html?start=2</link>
      <guid>http://www.advogato.org/person/akihabara/diary.html?start=2</guid>
      <description>Finally got the new expander and lexer live today.  A lot of
cleanup and optimisation remains to be done, but the
immediate priority is comprehensive testsuites so we can be
sure not to introduce regressions when improving the code
base.

&lt;p&gt; -traditional is not fully supported at present, but we're
working on a solution.</description>
    </item>
    <item>
      <pubDate>Fri, 23 Jun 2000 04:53:48 GMT</pubDate>
      <title>23 Jun 2000</title>
      <link>http://www.advogato.org/person/akihabara/diary.html?start=1</link>
      <guid>http://www.advogato.org/person/akihabara/diary.html?start=1</guid>
      <description>At last, the new macro stuff is nearly done, thanks to some 
work by Zack yesterday.  We bootstrap and pass the tests in 
the testsuite, and are more precise about corner cases than 
before.  Just -traditional stuff to go, and we should be 
able to apply it to CVS.  If you use non-ISO stuff like the 
GNU ## extension to delete the previous token, or token 
pasting to get a non-token (remember, we're grown-up and 
token-based now) you'll get warnings telling you to clean 
up your act.
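
&lt;p&gt; As a rough illustration of the pasting check (a Python sketch only, not the cpplib code; the tiny token grammar and the name paste are assumptions made up for this example), the idea is that the pasted spelling must lex as exactly one token:

```python
import re

# Hypothetical, deliberately tiny token grammar: identifiers, pp-numbers
# and a handful of punctuators. A real preprocessor knows many more.
IDENTIFIER = r"[A-Za-z_][A-Za-z0-9_]*"
PP_NUMBER = r"\.?[0-9](?:[eEpP][+-]|[A-Za-z0-9_.])*"
PUNCTUATORS = ["##", "#", "++", "+=", "+", "--", "-=", "-", "==", "=", ".", ","]

SINGLE_TOKEN = re.compile(
    "|".join([IDENTIFIER, PP_NUMBER] + [re.escape(p) for p in PUNCTUATORS])
)


def paste(left, right):
    """Paste two token spellings; return (spelling, is_single_token)."""
    spelling = left + right
    return spelling, SINGLE_TOKEN.fullmatch(spelling) is not None


for left, right in [("x", "1"), ("+", "+"), (".", "x"), ("-", "1")]:
    spelling, ok = paste(left, right)
    note = "ok" if ok else "warning: pasting does not give a valid token"
    print(f"{left} ## {right} = {spelling}: {note}")
```

&lt;p&gt; Here x ## 1 and + ## + paste cleanly, while . ## x and - ## 1 do not form a single token and would draw the warning.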

&lt;p&gt; A lot of ugliness remains, but that will be easier to clean 
once we're happy we've got working code and binned the old 
text-based expander.  Many areas are much cleaner, for 
example the three places (#assert, #unassert and #if/#elif) 
that need to parse assertions all use the same code now, 
rather than each having its own slightly different version to 
handle the slight differences in syntax.

&lt;p&gt; The token-based macro expansion process is quite simple in 
concept, but the reality is a bit messy and hard to 
understand from the source code.  I'll try to clean it up 
and comment it once we're sure it's working, and have it in 
CVS.

&lt;p&gt; After -traditional, the next stage is probably to get cpp 
re-integrated with the front ends, as a library and not a 
separate process.  This will cut out a lot of overhead: an 
extra exec(), writing out the preprocessed file, the front 
end reading it in again, and re-tokenising.</description>
    </item>
    <item>
      <pubDate>Fri, 16 Jun 2000 08:58:07 GMT</pubDate>
      <title>16 Jun 2000</title>
      <link>http://www.advogato.org/person/akihabara/diary.html?start=0</link>
      <guid>http://www.advogato.org/person/akihabara/diary.html?start=0</guid>
      <description>Putting the finishing touches on a macro expander that uses 
the new lexer.  Like the lexer, it is token-based.  The 
current lexer and macro expander are both text-based.

&lt;p&gt; Getting this to work has been a very frustrating 
experience.  Macro expansion is a hairy and convoluted 
process, and stringification and token-pasting just add to 
the confusion.  A dense and strangely-worded C99 
specification doesn't help :-)

&lt;p&gt; We just have a single token list, and the lexer lexes 
all tokens in the next logical line into this list.  
However, a function-like macro invocation can cross 
multiple logical source file lines.  Rather than overwrite 
the original token list and cause chaos, we append to it 
in that case.  However, this appending could cause 
a realloc of the tokens (stored consecutively in memory), 
and arguments to macros are stored as lists of pointers to 
the original tokens (they needn't be consecutive), so they 
need to be fixed up if we realloc.  Other things still to 
do include fixing bogus line numbers in errors and the 
final output, and squeezing tokens back into 16 bytes for 
both 32-bit and 64-bit architectures.  We need to run it 
against a macro abuser like glibc to try and turn up missed 
obscure cases.
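
&lt;p&gt; A minimal Python model of that pointer fix-up, purely illustrative: integers stand in for C pointers, and TokenBuffer, append and fix_up are made-up names rather than anything in cpplib.  The point is only that when the storage moves, every saved argument pointer has to move by the same distance:

```python
# Illustrative model only: integers stand in for C pointers, and every
# name here (TokenBuffer, append, fix_up) is hypothetical, not cpplib's.

TOKEN_SIZE = 16  # bytes per token, the target size mentioned above


class TokenBuffer:
    """Tokens stored consecutively, addressed as base + index * TOKEN_SIZE."""

    def __init__(self, base, capacity):
        self.base = base
        self.capacity = capacity
        self.tokens = []

    def address_of(self, index):
        return self.base + index * TOKEN_SIZE

    def token_at(self, address):
        return self.tokens[(address - self.base) // TOKEN_SIZE]

    def append(self, token, new_base):
        """Append a token; return (old_base, moved).

        If the buffer is full, model a realloc that doubles capacity and
        moves the storage to new_base.
        """
        old_base = self.base
        moved = False
        if len(self.tokens) == self.capacity:
            self.capacity *= 2
            self.base = new_base
            moved = True
        self.tokens.append(token)
        return old_base, moved


def fix_up(arg_pointers, old_base, new_base):
    """Rebase each saved pointer by the distance the storage moved."""
    delta = new_base - old_base
    return [p + delta for p in arg_pointers]


buf = TokenBuffer(base=1000, capacity=4)
for spelling in ["f", "(", "a", ","]:
    buf.append(spelling, new_base=1000)

# The macro argument collected so far is the single token "a".
args = [buf.address_of(2)]

# The invocation continues on the next logical line, so we append.
old_base, moved = buf.append("b", new_base=5000)
if moved:
    args = fix_up(args, old_base, buf.base)

print([buf.token_at(p) for p in args])
```

&lt;p&gt; Storing argument positions as indices into the list instead of pointers would avoid the fix-up entirely, at the cost of an extra addition on every access.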

&lt;p&gt; Ah, almost forgot, the gem of -traditional support.  
Not 
sure what's best there; I think to get everything right 
would need a separate pre-pass that does traditional macro 
text splicing.  However, this would lose line and column 
information and just be a maintenance headache.  Probably 
it's best just to support everything we reasonably can in 
the token-based environment, and drop the really weird 
stuff like half-strings and macro expansion within strings.</description>
    </item>
  </channel>
</rss>
