The Clang code is huge, and the subject matter is complicated, but the code is surprisingly clean and the comments are generally fairly useful. It’s completely awesome compared to hacking on “professional” PHP scripts where the original coders didn’t understand basic concepts, or how to write useful comments or function names.
Granted, the few tiny patches I’ve sent in to Clang have so far been not quite right, but I’m still learning the guts of how a C compiler works. There’s a big gap between understanding how choosing a short versus a long affects code generation and understanding how the compiler actually generates the code. For example, the bug I’m currently working on has to do with padding between struct fields, which is something I knew about and something I’ve worked with before (reordering fields to reduce the total amount of padding), but making a compiler track that padding, calculate the correct amount based on type and architecture, and so on isn’t something I’ve ever needed to know before. Writing a generic interpreted scripting engine on a custom byte-code VM and writing a standards-compliant, system-ABI-compatible C compiler are worlds apart.
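For anyone who hasn’t run into struct padding before, here is a rough sketch of the user-side view of it, the part I already understood. The struct names and helper functions are made up for illustration and have nothing to do with Clang’s internals or with the actual bug. The exact numbers depend on the target ABI, which is exactly what makes the compiler side hard.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof, size_t */
#include <stdio.h>

/* Hypothetical example types, not taken from Clang. */
struct loose {
    char tag;    /* the compiler usually pads after this so 'value' is aligned */
    int  value;
    char flag;   /* and usually pads after this so arrays of the struct stay aligned */
};

/* Same fields, largest first, so the two chars can sit next to each other. */
struct reordered {
    int  value;
    char tag;
    char flag;
};

/* Padding bytes = total size minus the sizes of the fields themselves. */
size_t loose_padding(void) {
    return sizeof(struct loose) - (sizeof(char) + sizeof(int) + sizeof(char));
}

size_t reordered_padding(void) {
    return sizeof(struct reordered) - (sizeof(int) + sizeof(char) + sizeof(char));
}

/* Print where each field landed for whatever ABI this was compiled for. */
void print_layouts(void) {
    /* The int field must sit at an offset that is a multiple of its alignment. */
    assert(offsetof(struct loose, value) % _Alignof(int) == 0);

    printf("loose:     size=%zu value@%zu flag@%zu padding=%zu\n",
           sizeof(struct loose),
           offsetof(struct loose, value),
           offsetof(struct loose, flag),
           loose_padding());
    printf("reordered: size=%zu tag@%zu flag@%zu padding=%zu\n",
           sizeof(struct reordered),
           offsetof(struct reordered, tag),
           offsetof(struct reordered, flag),
           reordered_padding());
}
```

Calling print_layouts() from main shows the layout for the current target. On common ABIs with a 4-byte int, the first struct typically comes out at 12 bytes and the reordered one at 8, but other targets can differ, and getting those per-target numbers right is the compiler’s job.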
Still, actually learning how Clang works is fairly easy, if time-consuming. It’s huge, but it’s well written.
I look forward to submitting a patch for the struct padding issue I’m running into, and maybe even having that patch do everything correctly. That might be hard, given that I can only test on a small handful of architectures (x86, amd64, ppc32).