zw is currently certified at Master level.

Name: Zack Weinberg
Member since: 2000-04-10 19:08:56
Last Login: N/A


Homepage: http://rabi.phys.columbia.edu/~zack/

Notes:

"Hello, my name is Zack, and I'm a programmer."

I work on GCC mostly these days, rewriting the C preprocessor. I used to do support/bugfixing for GNU libc, but not anymore.


Recent blog entries by zw


It's been a while...

Not terribly much progress on the cpplib front. Neil's algorithm didn't work right the first time around, got revised, and now there's a new version sitting on my hard drive waiting to be finished up.

I did some work in other areas, like the 'specs' that tell /bin/cc how to run the real compiler (which is hiding in a dark corner of /usr/lib). These are a little language in their own right, and not terribly comprehensible - here's a snippet:

    %1 %{!Q:-quiet} -dumpbase %B %{d*} %{m*} %{a*}
    %{g*} %{O*} %{W*} %{w} %{pedantic*} %{std*} %{ansi}
    %{traditional} %{v:-version} %{pg:-p} %{p} %{f*}
    %{aux-info*} %{Qn:-fno-ident} %{--help:--help}
    %{S:%W{o*}%{!o*:-o %b.s}}

With some magic, that turns into an argument vector for one of the programs run during compilation. Not surprisingly, people avoid the stuff as much as possible.
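
To give the flavor: run 'gcc -S -O2 foo.c' with no other switches, and that fragment comes out roughly as

    -quiet -dumpbase foo.c -O2 -o foo.s

%{!Q:-quiet} fires because no -Q was given, %{O*} copies through every -O switch, and the %{S:...} clause supplies -o foo.s since -S was given without an explicit -o. (Roughly, mind - the exact output varies between gcc versions.)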

I also stomped the irritating warning bug with built-in functions. You used not to get warned if you forgot to include <string.h> before using strcpy, because gcc Knows Things about strcpy before it sees any headers. Not anymore. (gcc was supposed to get this right all along, but one if went the wrong way in the mess that is the Yacc grammar for C.)
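
That is, code like

    char buf[16];
    strcpy (buf, "hello");   /* no <string.h> anywhere in sight */

used to sail through silently; now (with the usual warning switches) it draws an implicit-declaration warning, same as any other undeclared function.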

Neil came up with an ingenious algorithm for expanding macros, which should get the C standard's semantics just right while never scanning any token more than once. It's remarkably simple to implement, but difficult to describe, and not easy to comprehend from reading the code. There's going to be a long comment explaining it somewhere.
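
The subtle part is the standard's rule that a macro name is never re-expanded inside its own expansion. For example:

    #define foo (1 + foo)
    int x = foo;    /* x becomes (1 + foo); the inner foo is
                       "painted blue" and left alone */

Get that wrong and a naive rescanning expander either loops forever or expands things the standard says to leave alone.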

Anyway, I've implemented all of it except stringification, which is presenting some difficulties. I'm a wee bit concerned about the way the algorithm interacts with the macro stack as I designed it - we may be losing critical information. But it's late and I'm tired, and it'll probably all make sense tomorrow.

It's been a while...

I punted the lexer glue and am busily grinding through a rewrite of the macro expander. The goal here is not to forget about the tokenization of the original macro, but to preserve as much of it as possible. This will dramatically reduce the amount of data that has to be copied around, re-examined, and so on.

So far, object-like macros work, and I'm starting on function-like macros. [These are the terms the standard uses. It's like this:

    #define foo bar           /* object-like macro */
    #define neg(a) (-(a))     /* function-like macro */

Function-like macros take more work, because you have to substitute arguments. In the example above, a might be replaced by a big hairy expression.]
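
For instance:

    neg(x + y)      /* becomes (-(x + y)) */
    neg(neg(z))     /* the argument expands first: (-((-(z)))) */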

I took some time out and stomped on about a hundred compiler warnings. We're now sure that string constants are always treated as, well, constant (example below). I've also got as far as the Valley of the Dead in NetHack, which has never happened before (still only about halfway through the game, though).
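
(The sort of thing those string-constant warnings catch:

    char *p = "abc";    /* the literal itself is not writable */
    p[0] = 'x';         /* undefined behavior - may crash, may not */

Keeping string constants const inside the compiler means nobody can scribble on one by accident.)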

I spent two days gluing the new lexer into cpp. Then I typed rm -rf * in the wrong directory. Poof, all my work down the drain.

I think I'll break it up into smaller chunks when I do it over again. If that's possible, which it may not be.

To blizzard: If you're going to improve gmon/mcount, please teach it that if there's an existing gmon.out in the working directory, then it should augment that file instead of clobbering it. That way, if you want to profile a program that runs for a short time, you could just run it a few thousand times in a shell loop. Right now you have to do that, plus rename the reports so they all get saved, and then crunch them together at the end. This takes much longer than it has to, and throws your results off because disk cache is wasted on the huge gmon.out files which all have to stay around until the end.

To make this change safely, you should probably save the identity of the executable in gmon.out, and start over if it changes. (This should be done anyway.)
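
A sketch of what that check might look like - names invented, glibc's gmon writer has nothing like this today:

    #include <sys/types.h>
    #include <sys/stat.h>

    /* Hypothetical extra gmon.out header field: identify the
       executable by device, inode and mtime, and start a fresh
       histogram if the binary changed since the file was written.  */
    struct gmon_exe_id { dev_t dev; ino_t ino; time_t mtime; };

    static int
    same_executable (const struct gmon_exe_id *saved)
    {
      struct stat st;

      if (stat ("/proc/self/exe", &st) != 0)
        return 0;                       /* can't tell - don't append */
      return st.st_dev == saved->dev
             && st.st_ino == saved->ino
             && st.st_mtime == saved->mtime;
    }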

I'd also like to see better kernelside support for profiling. setitimer(2) has a lot of overhead, and the ticks don't come nearly often enough. SVR4 has a profil(2) system call that pushes the histogram updates into the kernel, which gets rid of the overhead but doesn't help with the granularity. Also, I don't think it can handle gaps in the region to be profiled, so your program has to be statically linked.
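
For reference, the interface looks about like this (going from the man page; do_work and the buffer size are stand-ins):

    #include <unistd.h>

    extern char _start[];               /* roughly the bottom of text */
    static unsigned short hist[1 << 15];

    static void
    do_work (void)                      /* stand-in for the real program */
    {
      volatile int i;
      for (i = 0; i < 10000000; i++)
        ;
    }

    int
    main (void)
    {
      /* A scale of 0x10000 maps the region one-to-one onto the buffer,
         so each two-byte counter covers two bytes of text.  */
      profil (hist, sizeof hist, (size_t) _start, 0x10000);
      do_work ();
      profil (0, 0, 0, 0);              /* scale 0 turns sampling off */
      /* ... dump hist and match it against the symbol table ... */
      return 0;
    }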

I'd rather not add system calls. Instead, I envision a pseudo-device that you map several times, each mapping specifying a window of the address space to profile. It can use the high-resolution timer in the RTC to deliver ticks more often than the normal timer interrupt. Updates happen in the driver, so no more 30% of execution time spent in __mcount_internal.
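
From user space it might look like this - every name below is invented, there is no such driver:

    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>

    /* Imaginary interface: one window of text per mapping.  */
    struct prof_window { char *start, *end; unsigned bytes_per_bucket; };
    #define PROF_SET_WINDOW _IOW ('P', 1, struct prof_window)

    extern char _start[], etext[];

    int
    start_profiling (unsigned short **histp)
    {
      struct prof_window w = { _start, etext, 4 };
      size_t len = ((etext - _start) / w.bytes_per_bucket)
                   * sizeof (unsigned short);
      int fd = open ("/dev/profile", O_RDWR);

      if (fd < 0)
        return -1;
      if (ioctl (fd, PROF_SET_WINDOW, &w) < 0)
        { close (fd); return -1; }
      /* The driver bumps the counters from the RTC interrupt; the
         program never executes any profiling code of its own.  */
      *histp = mmap (0, len, PROT_READ, MAP_SHARED, fd, 0);
      if (*histp == MAP_FAILED)
        { close (fd); return -1; }
      return fd;
    }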

GCC/i386 has a stupid bug where it clobbers %edx on every function entry when compiling with profiling. This breaks -mregparm. Okay, that doesn't affect very many people - it still needs to get fixed.
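
To spell out the failure: with -mregparm the first few integer arguments arrive in %eax, %edx, and %ecx, so a prologue that trashes %edx has destroyed the second argument before the function body ever sees it:

    /* Compiled with -mregparm=3 -p: a arrives in %eax, b in %edx.
       If the profiling prologue clobbers %edx, b is garbage here.  */
    int add (int a, int b) { return a + b; }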


zw certified others as follows:

  • zw certified aoliva as Journeyer
  • zw certified espy as Journeyer
  • zw certified kettenis as Journeyer
  • zw certified plundis as Journeyer


