Older blog entries for rmathew (starting at number 33)

17 Jul 2004 (updated 17 Jul 2004 at 18:43 UTC)
GCJX
I started tinkering with Tom's GCJX project this weekend. On a cursory perusal, it definitely seems more accessible to ordinary mortals than the current GCJ front end. It is written in C++, a language that I am not too comfortable with, but that should not be an insurmountable problem.

I built and ran GCJX against the current GCC mainline. I normally do not actually install a newly built mainline GCC snapshot after the GCC build process is complete, so using this compiler is a bit different than normal - I used the following script to build GCJX using such a GCC:

#!/bin/sh

# Build GCJX

GCC_SRC_DIR=/extra/src/gcc/gcc
GCC_BLD_DIR=/extra/src/gcc/build
GCC_BLD_TGT=i686-pc-linux-gnu

BLT_CXX="$GCC_BLD_DIR/gcc/g++ -B$GCC_BLD_DIR/gcc/ \
  -I$GCC_BLD_DIR/$GCC_BLD_TGT/libstdc++-v3/include \
  -I$GCC_BLD_DIR/$GCC_BLD_TGT/libstdc++-v3/include/$GCC_BLD_TGT \
  -I$GCC_SRC_DIR/libstdc++-v3/libsupc++ \
  -L$GCC_BLD_DIR/$GCC_BLD_TGT/libstdc++-v3/src/.libs"

make CXX="$BLT_CXX" typedefs.hh.gch

make CXX="$BLT_CXX" want_pch=yes

Compiling C++ programs that use the STL with GCC can be quite slow, so I am glad that newer GCC versions have pre-compiled header (PCH) support. The script above uses PCH to build GCJX - on my little machine, the total build time is reduced from 12m 2s to 6m 32s, plus an additional 11s to build the PCH itself. That's quite an improvement!
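To put a number on that improvement, here is a small sketch that redoes the arithmetic from the timings quoted above. The helper name "to_seconds" is made up for illustration; only the three timings come from this entry.

```sh
#!/bin/sh
# Convert a "minutes seconds" pair into seconds.
# (to_seconds is a hypothetical helper, not part of GCJX or GCC.)
to_seconds() {
    echo $(( $1 * 60 + $2 ))
}

without_pch=$(to_seconds 12 2)              # build time without PCH
with_pch=$(( $(to_seconds 6 32) + 11 ))     # build time with PCH, plus building the PCH itself
saved=$(( without_pch - with_pch ))
percent=$(( saved * 100 / without_pch ))    # integer division, so this rounds down

echo "saved ${saved}s (about ${percent}%)"
```

This prints "saved 319s (about 44%)", i.e. the PCH build takes a little over half the original time even after paying for the PCH itself.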

Since GCJX does not come with its own class libraries, I used GNU Classpath 0.10 sources appropriately updated (create "gnu/classpath/Configuration.java" and copy over classes under "vm/reference/" into the top folder) for use with GCJX. After this, I could compile simple programs as:

lextest /path/to/classpath-0.10 . -- HelloWorld.java
This parses the given file and all files from the Java runtime (quite a lot) needed for a complete closure and writes out class files (and CNI header files) to "/tmp/gcjx-out" (which I had to create beforehand). Note that I had to update LD_LIBRARY_PATH to point to the folder containing the libstdc++.so file from the GCC build folder as "lextest" depends on this library.
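The manual steps above (creating the output folder and pointing LD_LIBRARY_PATH at the freshly built libstdc++) could be wrapped up roughly as follows. This is only a sketch: the function names are invented for illustration, the "lextest" location is the one used elsewhere in this entry, and it assumes libstdc++.so lives in the same ".libs" folder that the build script passes via "-L".

```sh
#!/bin/sh
# Hypothetical wrapper around lextest; adjust the paths to your own trees.
GCC_BLD_DIR=/extra/src/gcc/build
GCC_BLD_TGT=i686-pc-linux-gnu
LEXTEST=/extra/src/gcjx/lextest
CLASSPATH_SRC=/path/to/classpath-0.10
OUT_DIR=/tmp/gcjx-out

# Folder assumed to hold the newly built libstdc++.so
# (the same folder the build script passes to the linker with -L).
libstdcxx_dir() {
    echo "$GCC_BLD_DIR/$GCC_BLD_TGT/libstdc++-v3/src/.libs"
}

# Prepend a folder to a colon-separated path list, avoiding a
# trailing colon when the existing list is empty.
prepend_path() {
    if [ -n "$2" ]; then
        echo "$1:$2"
    else
        echo "$1"
    fi
}

# Create the output folder, then run lextest on the given source files
# with LD_LIBRARY_PATH extended for this one command only.
run_lextest() {
    mkdir -p "$OUT_DIR" || return 1
    LD_LIBRARY_PATH=$(prepend_path "$(libstdcxx_dir)" "${LD_LIBRARY_PATH:-}") \
        "$LEXTEST" "$CLASSPATH_SRC" . -- "$@"
}
```

With these definitions sourced into a shell, "run_lextest HelloWorld.java" would correspond to the command shown above. Setting LD_LIBRARY_PATH only for the one command keeps the just-built libstdc++ from leaking into every other program started from the same shell.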

Compiling so many of the Java runtime classes over and over again every time I want to compile a source file is a little wasteful, not to mention quite slow, so I started passing the path to the libgcj build folder within the GCC build folder instead, and this sped things up considerably.

My next task was to check GCJX against the Jacks testsuite. I had to create an appropriate Jacks "gcjx_setup" file. Mine looks like this:

set JAVAC /extra/src/gcjx/lextest
set JAVA /extra/src/gcc/build/i686-pc-linux-gnu/libjava/gij
set JAVA_FLAGS "-mx=64m"
set JAVA_CLASSPATH ""
set JAVAC_FLAGS {/extra/src/gcc/build/i686-pc-linux-gnu/libjava . -- -jacks}
set JAVAC_DEPRECATION_FLAG "-deprecated"
set tcltest::testConstraints(assert) 1
"-jacks" is a temporary flag that causes GCJX to suppress most of its pedantic warnings and causes it to write out classes to the current folder (instead of "/tmp/gcjx-out").

I got 4302 PASS-es, 541 FAILs and 76 UNTESTED out of a total of 4919 Jacks testcases. This is not as good as what Tom reports, so I have to see what has gone wrong - but in any case, it is much better than what the current GCJ front end gives.
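As a quick sanity check on those figures, the following sketch adds up the three counts and works out the pass rate. The variable names are arbitrary; the counts are the ones reported above.

```sh
#!/bin/sh
# Jacks results as reported above.
pass=4302
fail=541
untested=76

total=$(( pass + fail + untested ))

# POSIX sh has integer arithmetic only, so compute the pass rate in
# tenths of a percent, rounded to nearest, and split it for printing.
tenths=$(( (pass * 1000 + total / 2) / total ))
pass_rate="$(( tenths / 10 )).$(( tenths % 10 ))"

echo "total=$total pass_rate=${pass_rate}%"
```

This prints "total=4919 pass_rate=87.5%", so the three categories do account for every testcase.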

Great work Tom!

I want to see if I can add JAR/ZIP reading support to GCJX. This looks doable and Tom already has most of the framework ready.

Bytecode Verification
I was able to track down and fix PR 5537 with a simple patch, but fixing it exposed other problems in the front end bytecode verifier that cause failures in the libjava testsuite and in the Mauve verifier testsuite.

So now I need to go after these bugs before I can submit a patch.

~sigh~
(Why do such things always happen to me?)

I found an interesting presentation (PDF) by Gilad Bracha (a "Computational Theologist" at Sun) on bytecode verification. It is interesting because he says that Sun went with load-time bytecode verification to reduce run-time verification overheads and had to introduce strict definite (un)assignment as a mandatory feature of the language to be able to carry out proper type inference during verification! The other interesting bit was that they went with type inference (instead of type checking) to reduce the space used by class files, at the cost of increased complexity, higher memory usage and lower speed - they later found out that type checking would have meant faster and simpler code at the cost of only a little extra space (5-10%)!

"Premature optimisation is the root of all evil!"

By the way, JDK 1.5 will have class data sharing to reduce startup times. So should libgcj use prelinking?

Bootstrapping
My understanding of the bootstrapping process.

Emaciated, yet Emancipated
Linux Weekly News (LWN) is an online resource of superb quality - dare I say it is The Economist of the Free Software journalism world - well researched and mostly thorough articles, quite up to date and fairly balanced content and a simple yet effective layout. I have enjoyed it for quite some time now and would highly recommend it to anyone who cares to listen.

For some time now, however, the "crown jewel" of LWN, "The LWN.net Weekly Edition", has been immediately available only to subscribers (though it is made freely accessible to everyone else after a week), as even the LWN guys need to feed themselves, just like everyone else. So I decided to subscribe to it, if nothing else to show my support for these wonderful guys, but found out that except for the "starving hacker" level of subscription, everything else would be a bit of an indulgence for me, considering that a US Dollar is around 45 Indian Rupees. So that is what I opted for and now I get all the LWN stuff as soon as it is published. Cool!

GCC and C++
The GCC Steering Committee published its decision (and that of RMS) on Nathan Sidwell's proposal to rewrite GCC in a useful subset of C++. In short, they have turned it down though they are open to the idea of making GCC sources compile correctly with a C++ compiler. What I didn't quite get was this:

In addition, RMS stated that the use of C++ was unacceptable for the GNU Project, at least for programs that are presently written in C.
This is quite weird and without more of the context in which this was uttered or clarifications, I don't understand this at all. In particular, this doesn't bode well for Tom's rewrite of the GCJ front end in C++, which by the way, is coming along pretty well for a one man project.

RMS also considers GCJ's ability to compile Java bytecode a bad thing, seeing it as a way for malicious vendors to bypass GCC's GPL and use it as a compiler backend!

Whatever...

Contributing to GCC
Roger Sayle sums it up nicely. I agree with him 100%. Even though I am a fringe and quite erratic contributor to GCJ (much less the core GCC), it has helped me immensely as a software developer. To paraphrase Calvin's Dad, "It helps build character".

"Things a Computer Scientist Rarely Talks About"
My friend Ananth has just gifted me a copy of "Things a Computer Scientist Rarely Talks About", a very different book by Donald Knuth. As someone who admires and is in awe of most of Knuth's work, and who is also an atheist, I expect this book to make some interesting reading.

Other recently acquired books on my "reading pending" list (besides almost the whole of my library) include "The Art of Unix Programming" by Eric Raymond and "Five Point Someone" by Chetan Bhagat, a novel about life in an IIT.

C++
Nathan Sidwell is on a laudable mission to introduce statically typed trees to GCC, replacing the much abused and overloaded "tree" structure. Some people were of the opinion that it might be much better to just write a front end in C++ or at least a reduced subset of C++ ("C with classes"?). Others were a bit apprehensive of the idea, saying that once you let in a bit of C++, there will really be nothing stopping developers from bringing in all of C++, and then you'd find yourself in a nightmare of trying to support all those C++ compilers on all supported platforms that differ, sometimes subtly and sometimes not so subtly, in their interpretation of the language standards.

Tom has rewritten the GCJ front end in C++ and has done awesomely well, getting quite a lot done in such a short period of time.

Looks like there's no escaping it now...

But...

I have never been able to bring myself to like this language. There is no one thing that I can point to nor can I write detailed and knowledgeable analyses like this or this, though I do agree completely with a lot of points raised in these.

Perhaps it is because of the boring and overly complex book on the language by Stroustrup that happened to be my first introduction to C++ (in sharp contrast to that masterpiece of brevity, accessibility and utility that was written by Kernighan and Ritchie). Perhaps it is because of the mind-scarring error messages spewed by the versions of the IBM xlC compiler on AIX, containing literally two to three lines of mangled names of template instantiations. Perhaps it is because GCC is so slow at compiling programs that use the STL. Perhaps it is because I found that as a programmer I had to know so many things about the nuances of the language just to be able to program anything with a semblance of confidence.

I really like object-oriented programming, but I find Java to be way better and much simpler in expressing myself than C++. I would not at all mind if GCC were to be rewritten in Java, though I'm not at all a Java fanboy.

I just want a natively compiled, simple programming language without many "gotchas", that lets me easily express decent object-oriented designs and that comes with a standard and fairly comprehensive runtime library. Perhaps I should take a dekko at Objective C.

(See? This is what happens when Advogato comes back up online after a long time.)

Advogato Is Back!

Finally!

My inane ramblings can now resume.

Mwahahahaha!!!

Small World
Niraj Kumar, who contributed the support for FreeBSD's UFS2 filesystems to the 2.6 series Linux kernel, is also from Bangalore!

Not only that - he works for Oracle India as well and in fact sits in the same building as I do!

I am yet to meet him though. :-/

GCJ and Definite [Un]Assignments
I spent most of my free time this weekend trying to resolve some of the issues with definite [un]assignment in GCJ as specified by the JLS.

I just wanted to be able to see zero new FAILs with Jacks, but that goal proved to be quite elusive - I fix one bug and that uncovers problems elsewhere, I fix some of those bugs and they uncover some more, ad nauseam - I had to give up after a certain point and just came up with something that is the least disruptive.

All this effort on something that I do not even agree completely with! I mean, this part of the JLS must be one of the shadiest - Sun has just converted into a specification whatever they felt they could reasonably implement in their own compiler, never mind if someone else can come up with a more thorough code flow analysis and give out less stoopid warnings. Sheesh!

www.advogato.org server woes
Even the Advogato server has been having problems lately! The result is that I've not been able to post diary entries for around a week now. (Not that I have anything insanely useful or profoundly interesting to say, but still...)

The following is what I intended to post around a week ago:

GCC Gets Tree-SSA
After almost 4 years of development, the tree-ssa project has finally been merged into the GCC mainline!

The Static Single Assignment (SSA) form enables all sorts of funky optimisations to be implemented in GCC, making it possible for it to compete effectively with commercial optimising compilers.

Thanks in large part to Jeff Sturm and Andrew Haley, GCJ starts off as a first class citizen in this new avatar of GCC, that too with no libjava testsuite regressions. In fact, the Jacks testsuite now shows 30 unexpected successes (XPASSes) with GCJ and only 5 new failures (FAILs)!

The GCC bootstrap time seems to have regressed by around 20%, but one should not forget that quite a bit of new code has been added, not to mention the increased number of Trees that are created and the now redundant RTL optimisers.

And oh, by the way, say "Hello!" to Tree Browser!

ELF DSOs
"How to Write Shared Libraries" (PDF) is a paper by Ulrich Drepper that should be mandatory reading for every serious UNIX programmer. If nothing else, it lets one throw around terms like GOT and PLT at newbies! ;-)

12 May 2004 (updated 12 May 2004 at 06:05 UTC)
gcc.gnu.org/sources.redhat.com overseers list
Finally there is a backup for the overseers list for this very important server for Free software.

tree-ssa into mainstream GCC
And so it begins...
