Older blog entries for bonzini (starting at number 13)

Since I did quite well with regex (10-20% speed improvement on average), I've been hooked by the idea of squeezing some performance out of gcc... I've produced a couple o' patches that get about 1% each, 40 more and that's 50% of gcc's speed :-) After seeing how it's written I would not trust it very much as my compiler... Just kidding of course, but it's incredible how much special casing gcc has, and how many intricate ifs that the compiler has a hard time optimizing. No doubt it's slow.

Ah, I found a cute bug, PR/13931 in gcc's bugzilla database. This is marked as a critical regression for 3.4, maybe you can blame me if the release is delayed. ;->

The thesis is finished and printed, though I did make a couple of improvements after I prepared the final benchmarks; the result is that my VM is a bit faster than the bytecode interpreter included with GCJ. Method calls are sloooow in GNU Smalltalk, something's got to be done sooner or later about that. :-(

20 Jan 2004 (updated 20 Jan 2004 at 17:29 UTC) »

Lately I have been working on my thesis; I can now run Java benchmarks in GNU Smalltalk even with the just-in-time compiler enabled. I have to print it on Friday and give it in by Wednesday.

Still a long way from releasing 2.2, though. The release cycle is indeed longer than for previous releases; maybe it will be out by the end of 2004. I still have to do some work on security (but maybe I'll just keep it on my hard drive as a patch and drop it, it's quite hard stuff), and since I switched the bytecode set I'd like to do escaping variable analysis and be able to deallocate activation records LIFO more often than now. It's one of the worst things in GNU Smalltalk, really.

Did a lot of work on the GNU regex implementation in glibc 2.3; it is now less braindead than it used to be -- many thanks to Jakub Jelinek and Ulrich Drepper for supporting this, really. For example, it prunes anchored searches instead of trying to match ^a at every position in the string. It no longer bothers with multicharacter collating symbols (like the Danish AA) in the C locale, which has none. It has some microoptimizations, such as removing two or three parameters from some often-called functions, and completely rewritten routines to handle sets of DFA nodes. And it does have some bugfixes, a couple of which removed quadratic bottlenecks. But it is still much slower than PCRE and grep. Sigh. :-(
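To make the anchored-search pruning concrete, here is a minimal sketch in Python. It is not the glibc code: the matcher is a toy that only handles a literal string with an optional leading ^, the names search and literal_match_at are made up for the illustration, and newline-sensitive matching is ignored. It only shows why an anchored pattern needs a single match attempt instead of one per position.

```python
def literal_match_at(needle, text, pos):
    """Toy matcher (hypothetical): does the literal `needle` occur at `pos`?"""
    return text.startswith(needle, pos)


def search(pattern, text):
    """Return (match position or None, number of match attempts made)."""
    anchored = pattern.startswith("^")
    needle = pattern[1:] if anchored else pattern
    # An anchored pattern can only match at offset 0, so every other
    # starting offset is pruned away.
    starts = [0] if anchored else range(len(text) + 1)
    attempts = 0
    for pos in starts:
        attempts += 1
        if literal_match_at(needle, text, pos):
            return pos, attempts
    return None, attempts


if __name__ == "__main__":
    haystack = "b" * 1000
    print(search("a", haystack))   # one attempt per offset
    print(search("^a", haystack))  # a single attempt
```

On a 1000-character string with no match, the unanchored search makes an attempt at each of the 1001 offsets, while the anchored one gives up after the first.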

sed 4.1 is shooting for POSIX conformance, and I'm quite close to it: I still have to apply a fix that makes line-number addresses be checked directly, without having to be enabled/disabled the way regex addresses are. It fixes cases like the following:

5,8b
1,5d
You'd expect it to print every line from 5 on (line 5 itself is rescued by the branch), but the branch prevents the 1,5 address from being disabled and -- surprise -- line 9 is not printed!
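The effect can be reproduced with a small Python model of the two-command script above. This is a simplified sketch and not sed's actual code: the "stateful" mode assumes the 1,5 range is only opened and closed when the d command is actually reached, while the "stateless" mode simply compares the line number against both bounds. The function name run and the mode names are invented for the illustration.

```python
def run(mode, n_lines=10):
    """Simulate `5,8b` followed by `1,5d` over input lines 1..n_lines."""
    printed = []
    d_active = False  # open/closed state of the 1,5 range ("stateful" only)
    for line in range(1, n_lines + 1):
        # 5,8b: branch to the end of the script, so `d` is never evaluated
        # and the line is auto-printed.
        if 5 <= line <= 8:
            printed.append(line)
            continue
        if mode == "stateless":
            delete = 1 <= line <= 5
        elif d_active:
            delete = True          # still inside the range
            if line >= 5:
                d_active = False   # end address finally noticed, too late
        elif line == 1:
            delete = True
            d_active = True        # range opens on its first line
        else:
            delete = False
        if not delete:
            printed.append(line)
    return printed


if __name__ == "__main__":
    print(run("stateful"))
    print(run("stateless"))
```

In the stateful model the range is still open when line 9 arrives, because lines 5 to 8 never reached the d command; line 9 is deleted and only then is the range closed.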

I bought two books on Amazon. Code Reading is quite cool, though most of the tricks it shows are quite basic: it shows really a lot of code of various quality and prepares you so that you have fewer surprises; I liked the chapters on large projects. Debugging: nine indispensable rules is much better; again it is about something everybody does sometimes, but it is presented in a convincing way so that you really get to follow the rules. I rarely if ever buy computer books, but I am really satisfied with these two.

I prepared my first gcc patch! It fixes a problem with unsigned int * long long multiplication. I still do not have copyright assignments, but I hope the patch is small enough to be accepted. However, long long still sucks: gcc cannot optimize 64-by-32 division, and performs simply awful register allocation. Bah. I just had an itch to scratch to see if I could fix the one pessimization that seemed easy to attack, and I did. :-)

To complete my previous findings on Libtool, I must say that the libtool people did provide me with a solution to prevent the C++/Java tests from appearing in configure (it will be in libtool 1.6 but I backported it), and that I found a way to avoid performing the C tests twice (I have contributed it to libtool and it will also be in 1.6).

Also, I just updated to the latest Automake 1.8 beta, and the new implementation of aclocal saved over 300k on my tarball, so not everything's wrong in the Autotools after all... :-) (aclocal now pulls in m4 files that are in the distribution directory using m4_include, instead of copying them literally into aclocal.m4).

Yesterday I decided that it would be cool if GNU Smalltalk were compiled as a shared library. With the support for ELF visibilities that is in gcc 3.3, that should be possible without much performance degradation (note that the installed VM still is linked statically though). Then I also decided that I should compile the shared library with -fomit-frame-pointer, because I do need the register that is lost to the GOT pointer when using position-independent code.

The only sane way that came to mind to supply that flag only for the PIC case, and then only for a particular library, was to use libtool 1.5's tags: they were designed to support multiple languages, but the presence of the disable-shared tag suggested that they could be put to such use. To summarize, there is absolutely no documentation on how to use tags, not to mention defining new ones (I wanted my tag to be based on the standard tag for C of course, without duplicating all the code in libtool.m4). I lost a whole afternoon trying to do this, and now that I finally succeeded, what I came up with is a bunch of awful-looking m4 hacks (that luckily can be encapsulated in a separate macro) and a 770kb configure that is basically doing exactly the same tests twice!

Ah, and I forgot to say that it used to top a megabyte until I found out that it was including in the configure file the stuff for C++, Java, and Fortran 77 without ever executing it. :-)

Did a nice thing recently... I rewrote GNU Smalltalk's bytecode interpreter; the new one has a new bytecode set which supports 192 superbytecodes, so it is 10-50% faster (50% on the P4, finally found a use for that trace cache!!!). I wrote some cute proggies to achieve this: a virtual machine generator, and a superoperator search program that embeds gperf. Sooner or later you'll find it on alpha.gnu.org or even later as part of GNU Smalltalk 2.2. Did I mention that GNU Smalltalk now has GTK+ bindings too? :-)
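For readers who have not met superbytecodes (superoperators), the idea is to fuse a frequently occurring sequence of bytecodes into a single one, so the interpreter dispatches once instead of several times. The Python sketch below is only an illustration of that idea and not the search program mentioned above: the bytecode names are hypothetical, operands and jump targets are ignored, and only pairs are considered.

```python
from collections import Counter


def most_common_pair(code):
    """Return the adjacent bytecode pair that occurs most often."""
    pairs = Counter(zip(code, code[1:]))
    return pairs.most_common(1)[0][0]


def fuse(code, pair, super_op):
    """Rewrite `code`, replacing each occurrence of `pair` with `super_op`."""
    out = []
    i = 0
    while i < len(code):
        if i + 1 < len(code) and (code[i], code[i + 1]) == pair:
            out.append(super_op)
            i += 2  # both halves of the pair are consumed
        else:
            out.append(code[i])
            i += 1
    return out


if __name__ == "__main__":
    # Hypothetical bytecode stream, one name per instruction.
    code = ["push_self", "push_temp", "send",
            "push_self", "push_temp", "send", "pop",
            "push_self", "push_temp", "return"]
    pair = most_common_pair(code)
    print(pair, fuse(code, pair, "push_self_push_temp"))
```

Here the pair (push_self, push_temp) occurs three times, so fusing it shortens the stream from ten dispatches to seven.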

I just finished a patch for GCC's libffi and I am waiting for comments.

I am also going to add some expected failures to sed and release 4.0.8 with them, so that the glibc people will hopefully come and fix them. They made the bad choice of including the new regex matcher in glibc 2.3, now they should mend it. The gawk author and I are getting a lot of spurious bug reports because of that.

Upload time. I've been quite busy and delayed some uploading (of GNU Smalltalk 2.1.3 and my netcat's second alpha).

This release of GNU Smalltalk has quite a lot of bugfixes; it might even be the last 2.1.x release. The future (2.2) is quite closely related to my Master's thesis, which I just started. Basically I am going to do three things.

The first is to clean it up so that one can write down a VM specification like Java's. I already did some of this; for example, the VM specification no longer depends on having ContextParts, which many Smalltalks do not have. I also found a couple of places where things could be made much cleaner with half an hour's effort. This can only help.

The second is to add security. That's a bigger deal. So far, I have added a mechanism to give `trustedness' attributes to contexts (activation records) so as to make the thing fast, and I wrote a nice little bytecode verifier. I also wrote a small Flex/Bison-based program that automates the decoding of bytecodes (the verifier had two places where I was decoding bytecodes, and Smalltalk had four more, and I was starting to hate this code duplication), which I like very much.
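As a rough illustration of both points, a verifier and a single shared decoding routine, here is a small Python sketch. Everything in it is an assumption made for the example: the bytecode names, their stack effects, the straight-line code without jumps, and the one check performed (the stack never underflows or grows past a limit). The real verifier and the Flex/Bison-generated decoder are certainly more involved.

```python
# Hypothetical bytecode table: name -> (values popped, values pushed).
STACK_EFFECT = {
    "push": (0, 1),
    "pop": (1, 0),
    "dup": (1, 2),
    "add": (2, 1),
    "return": (1, 0),
}


def decode(code):
    """The one place where bytecodes are decoded; every consumer uses it."""
    for pc, op in enumerate(code):
        if op not in STACK_EFFECT:
            raise ValueError(f"unknown bytecode {op!r} at {pc}")
        pops, pushes = STACK_EFFECT[op]
        yield pc, op, pops, pushes


def verify(code, max_depth=8):
    """Reject code that underflows or overflows the operand stack."""
    depth = 0
    try:
        for _pc, _op, pops, pushes in decode(code):
            if depth < pops:
                return False  # would pop from an empty stack
            depth += pushes - pops
            if depth > max_depth:
                return False  # exceeds the declared stack size
    except ValueError:
        return False  # undecodable bytecode
    return True


if __name__ == "__main__":
    print(verify(["push", "push", "add", "return"]))
    print(verify(["add"]))
```

A disassembler or a JIT front end could iterate over the same decode generator, which is the kind of duplication the generated decoder removes.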

The third is to implement Java on top of Smalltalk. I think the results of this will be quite interesting: I expect big big big slowdown in FP math (Floats are boxed in Smalltalk and primitive in Java), but improvements in code that uses OOP and interfaces a lot. Besides my uni is doing a lot of research in code mobility and stuff like this, where the increased reflection capabilities of Smalltalk can only help.

Last but not least, I noticed that fortune does not produce random fortunes at all. It tends to output the first fortune in the file quite often. So what's better to learn some Perl than to rewrite it? Here is my effort, I'll be very happy to get some comments on it.
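For comparison, a uniform pick is only a few lines in Python. This is a sketch and not the Perl rewrite mentioned above; it assumes the usual fortune file layout in which entries are separated by a line containing only %, and the function names are made up here.

```python
import random


def parse_fortunes(text):
    """Split a fortune file into entries separated by lines holding only %."""
    entries = [entry.strip("\n") for entry in text.split("\n%\n")]
    return [entry for entry in entries if entry]


def pick(text, rng=random):
    """Return one entry, each with the same probability."""
    return rng.choice(parse_fortunes(text))


if __name__ == "__main__":
    sample = "First fortune\n%\nSecond fortune,\non two lines\n%\nThird\n%\n"
    print(pick(sample))
```

Because the choice is made over the parsed list of entries, neither the position of a fortune in the file nor its length affects how often it comes up.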

1 Jun 2003 (updated 28 Jun 2003 at 14:51 UTC) »

Yesterday I picked up again and finished (at least for now) the cleanup of netcat. The result is available here. Among other things, I added IPv6 support and switched from select(2) to poll(2). Note that there is another renewed version of Netcat, which is included in OpenBSD. The two are not related; my version keeps most of the original source code (albeit with renamed variables/functions and updated/edited comments -- actually more in spirit than otherwise), while OpenBSD's is said to be a complete rewrite. OpenBSD's version lacks some features (hexdump, line-by-line, multiplexing) but has some more (AF_UNIX sockets and SOCKS support). I hope to get feedback on this.
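The poll(2) interface is easy to try from Python's select module on Unix-like systems. The sketch below is not taken from netcat; the helper names wait_readable and demo are invented, and a local socket pair stands in for a network connection. One generally cited advantage of poll over select is that it is not bound by the FD_SETSIZE limit on descriptor numbers; the post itself does not say what motivated the switch.

```python
import select
import socket


def wait_readable(socks, timeout_ms=1000):
    """Return the sockets that poll(2) reports as readable."""
    poller = select.poll()
    by_fd = {s.fileno(): s for s in socks}
    for fd in by_fd:
        poller.register(fd, select.POLLIN)
    return [by_fd[fd] for fd, event in poller.poll(timeout_ms)
            if event & select.POLLIN]


def demo():
    """Send data through a local socket pair and read it back via poll."""
    a, b = socket.socketpair()
    try:
        a.sendall(b"ping")
        ready = wait_readable([a, b])  # only b has pending input
        return len(ready), ready[0].recv(16)
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(demo())
```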

I made some more patches to Autoconf, all the preliminary plumbing is in place and I'll push more after 2.58 is released.

Done a coupla cool university projects for the Operating Systems course: a kind of UI for the Unix shell written as a Bourne shell script (cool, with dialog -- and made me discover the fabulous shell syntax "${arr[@]}" :->), and a little thingy doing message passing between parent and child processes (I did it with message queues: jeez, what a braindead API, I have 120 lines out of 350 only for the wrappers around them! another mate did it with AF_UNIX sockets which is indeed a lot easier to do and more readable).

Long time no post... :-)

I've done some more ports of libsigsegv lately (including Cygwin, which was quite cool as you have to deal with the innards of Win32 exception handling...), and GNU Smalltalk is about to reach 2.1.2 (I do patch releases often when big changes are in their infancy, because then I have much bigger exposure) which I'll release as soon as I find time to test the MacOS X port of libsigsegv once more. Bruno Haible in the meanwhile made a lot of Linux ports (m68k, HPPA, completing my IA64 work, and so on).

The netcat project is not going to leave my harddisk soon, but I did remove the gotos and the source code is in a better state. Next step would be to do IPv6, I'll probably do that around the same time when I'll add IPv6 to GNU Smalltalk (that's when I'll fork for 2.2).

I did some real kewl hacking on Autoconf: I started adding usage of shell functions to it!!! I have 1.5 megs of configure scripts in GNU Smalltalk so I am quite interested in this, and I already got up to 25% improvements in the size of configure scripts only by using functions for seven AC_CHECK_ macros (I still have to do some timing); the Autoconf maintainers (Akim Demaille and Paul Eggert) seem to endorse my work and to consider my approach a sane one, so I am quite optimistic even though I'll probably have to learn more m4 to have it accepted.
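Why shell functions shrink configure can be seen with a toy generator, written here in Python. It is only a sketch under made-up assumptions: CHECK_BODY is a 20-line stand-in for the shell code a real check expands to, check_one is a hypothetical function name, and the numbers say nothing about actual Autoconf output. The point is that inline expansion repeats the body once per check, while a function emits it once and adds one short call per check.

```python
# Stand-in for the shell code one check expands to (hypothetical).
CHECK_BODY = "\n".join(f'  step_{i} "$1"' for i in range(20))


def inline_script(names):
    """Macro-style expansion: the whole body is repeated for every name."""
    return "\n".join(CHECK_BODY.replace('"$1"', f'"{n}"') for n in names)


def function_script(names):
    """Function-style expansion: one definition, one short call per name."""
    define = "check_one () {\n" + CHECK_BODY + "\n}\n"
    calls = "\n".join(f"check_one {n}" for n in names)
    return define + calls


if __name__ == "__main__":
    headers = ["stdio.h", "stdlib.h", "string.h", "unistd.h", "signal.h"]
    print(len(inline_script(headers)), len(function_script(headers)))
```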

If accepted, this work might make other changes to Autoconf possible, with ramifications up to an eventual Autoconf 3 release; it would be a major contribution to free software and I am quite proud of it. Initial results can be found in the Autoconf mailing list archives for May 2003. I already sent a good deal of patches that make everything in good shape for adding shell functions and testing the new code.

Jeez, I am going to release GNU Smalltalk 2.1 Real Soon Now!!! I enjoyed HP's TestDrive program, it gives away lots of accounts on Alphas and IA64's and I used that to make it 64-bit clean and fix some lossages due to bad OS libraries (did you know FreeBSD's inttypes.h lacks intmax_t?)... very good.

I also started a pet project. I am going to take netcat 1.10, autoconfiscate it (which I have already done), and make the code more readable (which I just started to do). How awful that code looks! It will surely look too serious and less H4X0R-ish when I finish, but I hope it will be better as a programming example since it shows a good deal of socket tricks -- besides, I think the author overdid it a bit in scattering the f*** and s*** words through the comments...

In the meanwhile, my Darwin code has been reviewed & installed into libsigsegv.

I spent Monday afternoon porting GNU Smalltalk to Darwin -- the biggest job being the porting of the libsigsegv library which is used for the generational GC. Gee, how hideous an environment!

It is as far from the standards as they could make it. Everybody has stack_t, they have struct sigaltstack. Everybody includes <signal.h>, they want you to include <sys/signal.h> as well. Some constants are named differently, and so many things are only available on the Mach level -- luckily I found the info on how to decode PPC instructions in the Boehm garbage collector, and some more source code in XEmacs. The port is a 20kb patch while other OSes are 2kb at most. Bah. I am going to submit it to Bruno Haible for inclusion in the master libsigsegv sources.
