Older blog entries for bonzini (starting at number 8)

Upload time. I've been quite busy and delayed some uploading (of GNU Smalltalk 2.1.3 and my netcat's second alpha).

This release of GNU Smalltalk has quite a lot of bugfixes; it might even be the last 2.1.x release. The future (2.2) is quite closely related to my Master's thesis, which I just started. Basically I am going to do three things.

The first is to clean it up so that one can write down a VM specification like Java's. I already did some of this: for example, the VM specification no longer depends on having ContextParts, which many Smalltalks do not have. I also found a couple of places where things could be made much cleaner with half an hour's effort. This can only help.

The second is to add security. That's a bigger deal. So far, I have added a mechanism to give `trustedness' attributes to contexts (activation records) so as to make the thing fast, and I wrote a nice little bytecode verifier. I also wrote a small Flex/Bison-based program, which I like very much, that automates the decoding of bytecodes (the verifier had two places where I was decoding bytecodes, Smalltalk had four more, and I was starting to hate this code duplication).

The third is to implement Java on top of Smalltalk. I think the results of this will be quite interesting: I expect a big big big slowdown in FP math (Floats are boxed in Smalltalk and primitive in Java), but improvements in code that uses OOP and interfaces a lot. Besides, my uni is doing a lot of research in code mobility and stuff like this, where the increased reflection capabilities of Smalltalk can only help.

Last but not least, I noticed that fortune does not produce random fortunes at all. It tends to output the first fortune in the file quite often. So what better way to learn some Perl than to rewrite it? Here is my effort; I'll be very happy to get some comments on it.
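To make the goal concrete, here is a minimal sketch (in Python, not the Perl rewrite mentioned above) of picking a fortune with equal probability for every entry. It assumes the usual fortune file layout where entries are separated by a line containing only `%`; the function name `pick_fortune` and the `rng` parameter are made up for illustration.

```python
import random


def pick_fortune(text, rng=random):
    """Return one fortune chosen uniformly from a '%'-delimited file body."""
    fortunes = []
    current = []
    for line in text.splitlines():
        if line.strip() == "%":
            # A lone '%' line closes the fortune collected so far.
            if current:
                fortunes.append("\n".join(current))
            current = []
        else:
            current.append(line)
    if current:
        # The last fortune may not be followed by a '%' line.
        fortunes.append("\n".join(current))
    if not fortunes:
        raise ValueError("no fortunes found")
    # choice() gives every parsed entry the same weight, whatever its length
    # or position in the file.
    return rng.choice(fortunes)
```

Parsing first and choosing afterwards costs memory on big files, but it keeps the selection obviously unbiased, which is the whole point here.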

1 Jun 2003 (updated 28 Jun 2003 at 14:51 UTC) »

Yesterday I picked up again and finished (at least for now) the cleanup of netcat. The result is available here. Among other things, I added IPv6 support and switched to poll(2) instead of select(2). Note that there is another renewed version of Netcat, which is included in OpenBSD. These two are not related; my version does keep most of the original source code (albeit with renamed variables/functions and updated/edited comments -- actually more in spirit than otherwise), while OpenBSD's is said to be a complete rewrite. OpenBSD's version lacks some features (hexdump, line-by-line, multiplexing) but has some more (AF_UNIX sockets and SOCKS support). I hope to get feedback on this.
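For readers who have not used poll(2), this is roughly what waiting on several descriptors looks like. It is only a sketch in Python (through the standard `select.poll` wrapper, Unix only), not the netcat code; `wait_readable` is a hypothetical helper name.

```python
import select


def wait_readable(fds, timeout_ms):
    """Return the descriptors from fds that become readable within timeout_ms."""
    poller = select.poll()
    for fd in fds:
        # One registration per descriptor; there is no fd_set size limit
        # to worry about as there is with select(2).
        poller.register(fd, select.POLLIN)
    ready = poller.poll(timeout_ms)
    # POLLHUP is included so that a closed peer is reported as "readable"
    # (the following read then returns end-of-file).
    return [fd for fd, events in ready
            if events & (select.POLLIN | select.POLLHUP)]
```

A relay loop would call this with the socket and standard input, then copy data from whichever descriptor came back.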

I made some more patches to Autoconf, all the preliminary plumbing is in place and I'll push more after 2.58 is released.

Done a coupla cool university projects for the Operating Systems course: a kind of UI for the Unix shell written as a Bourne shell script (cool, with dialog -- and made me discover the fabulous shell syntax "${arr[@]}" :->), and a little thingy doing message passing between parent and child processes (I did it with message queues: jeez, what a braindead API, I have 120 lines out of 350 only for the wrappers around them! another mate did it with AF_UNIX sockets which is indeed a lot easier to do and more readable).
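The AF_UNIX variant really is short. A minimal sketch in Python (not the course project itself; `ask_child` is a made-up name, and the upper-casing child is just a stand-in for real work) could look like this. Datagram sockets are used so that message boundaries are preserved, as they are with message queues.

```python
import os
import socket


def ask_child(message):
    """Fork a child, send it one message, and return the child's reply."""
    parent_sock, child_sock = socket.socketpair(socket.AF_UNIX,
                                                socket.SOCK_DGRAM)
    pid = os.fork()
    if pid == 0:
        # Child: receive one datagram, answer in upper case, then exit.
        try:
            parent_sock.close()
            data = child_sock.recv(4096)
            child_sock.send(data.upper())
            child_sock.close()
        finally:
            os._exit(0)
    # Parent: send the request, wait for the answer, reap the child.
    child_sock.close()
    parent_sock.send(message)
    reply = parent_sock.recv(4096)
    parent_sock.close()
    os.waitpid(pid, 0)
    return reply
```

Calling `ask_child(b"hello")` sends one datagram each way; no wrappers around key generation or queue cleanup are needed.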

Long time no post... :-)

I've done some more ports of libsigsegv lately (including Cygwin, which was quite cool as you have to deal with the innards of Win32 exception handling...), and GNU Smalltalk is about to reach 2.1.2 (I do patch releases often when big changes are in their infancy, because then I have much bigger exposure) which I'll release as soon as I find time to test the MacOS X port of libsigsegv once more. Bruno Haible in the meanwhile made a lot of Linux ports (m68k, HPPA, completing my IA64 work, and so on).

The netcat project is not going to leave my harddisk soon, but I did remove the gotos and the source code is in a better state. Next step would be to do IPv6, I'll probably do that around the same time when I'll add IPv6 to GNU Smalltalk (that's when I'll fork for 2.2).

I did some real kewl hacking on Autoconf: I started adding usage of shell functions to it!!! I have 1.5 megs of configure scripts in GNU Smalltalk so I am quite interested in this, and I already got up to 25% improvements in the size of configure scripts only by using functions for seven AC_CHECK_ macros (I still have to do some timing); the Autoconf maintainers (Akim Demaille and Paul Eggert) seem to endorse my work and to consider my approach a sane one, so I am quite optimistic even though I'll probably have to learn more m4 to have it accepted.

If accepted, this work might make other changes to Autoconf possible, with ramifications up to an eventual Autoconf 3 release; it would be a major contribution to free software and I am quite proud of it. Initial results can be found in the Autoconf mailing list archives for May 2003. I already sent a good deal of patches that make everything in good shape for adding shell functions and testing the new code.

Jeez, I am going to release GNU Smalltalk 2.1 Real Soon Now!!! I enjoyed HP's TestDrive program, it gives away lots of accounts on Alphas and IA64's and I used that to make it 64-bit clean and fix some lossages due to bad OS libraries (did you know FreeBSD's inttypes.h lacks intmax_t?)... very good.

I also started a pet project. I am going to take netcat 1.10, autoconfiscate it (which I have already done), and make the code more readable (which I just started to do). How awful that code looks! It will surely look too serious and less H4X0R-ish when I finish, but I hope it will be better as a programming example since it shows a good deal of socket tricks -- besides, I think the author overdid it a bit in sprinkling the f*** and s*** words through the comments...

In the meanwhile, my Darwin code has been reviewed & installed into libsigsegv.

I spent Monday afternoon porting GNU Smalltalk to Darwin -- the biggest job being the porting of the libsigsegv library which is used for the generational GC. Gee, how hideous an environment!

It is as far from the standards as they could make it. Everybody has stack_t, they have struct sigaltstack. Everybody includes <signal.h>, they want you to include <sys/signal.h> as well. Some constants are named differently, and so many things are only available on the Mach level -- luckily I found the info on how to decode PPC instructions in the Boehm garbage collector, and some more source code in XEmacs. The port is a 20 kB patch while other OSes need 2 kB at most. Bah. I am going to submit it to Bruno Haible for inclusion in the master libsigsegv sources.

28 Mar 2003 (updated 28 Mar 2003 at 10:43 UTC) »

Time to write something new...

I have done some work on sed lately; there were some bugs due to bad interactions between s///NUM and the LHS being possibly empty. sed has a lot of corner cases like this where it is supposed to Just Work: working around a strange behavior then requires you to work around the correct behavior in the places where the strange behavior was right, and so on. You easily end up with 200 lines of ifs!
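To show why this is fiddly, here is a small Python sketch of "replace only the N-th match". It assumes that "possibly empty" refers to a regex that can match the empty string, and it is not how sed implements it; `substitute_nth` is a hypothetical name and the replacement is taken literally (no backreferences). Python's `re` also has its own rule for empty matches sitting right next to a previous match, which need not agree with sed's, so treat the counting as illustrative only.

```python
import re


def substitute_nth(pattern, replacement, line, n):
    """Replace only the n-th (1-based) match of pattern in line."""
    for count, match in enumerate(re.finditer(pattern, line), start=1):
        if count == n:
            # An empty match has start() == end(), so this turns into a
            # pure insertion at that position.
            return line[:match.start()] + replacement + line[match.end():]
    # Fewer than n matches: leave the line alone.
    return line
```

With a pattern such as `x*` on a line without any `x`, every position is an empty match, so "the second match" means "between the first and second character". Deciding which empty matches count is exactly where the corner cases pile up.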

Besides, sed is now multibyte-clean and I'm ready to release 4.0.7 (the final 4.0.x release) and 4.0a (a first alpha for 4.1). I don't have much time to do them and I have more priorities such as fixing up that Assembly memcmp...

Today I'll spin a prerelease for GNU Smalltalk 2.1! I hope to release around Easter. I really like this release. I took time to insert many (optional) sanity checks, so notwithstanding some big architectural changes, it turns out that it is really stable and they actually fix bugs in the old code: especially the overhauled Processes, which were meant to allow debugging, but did fix some nasty SIGSEGVs in the browser. The browser grew up to something fast, stable and usable enough for production work... even though I am mostly a vi guy (elvis, not vim!) I do use it sometimes.

I'm also very pleased with the garbage collection, it works like a charm -- yesterday I suspected a GC bug and was ready to spend an hour staring at the thousands of lines that GC outputs in debugging mode, but it turned out that the bug was actually in years-old code and not in the GC which is only three months old! That's the code that computed stack heights to find out how big to make heap-allocated activation records, and this code turned out to be very bad (I must say that I wrote it about 3 weeks after I picked up C...) so I made a general rewrite which includes more sanity checks yet it is even faster.
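The idea behind computing stack heights can be sketched in a few lines. This is Python rather than the C code in question, it uses a made-up instruction format (a `(name, pops, pushes)` triple), and it ignores jumps entirely, so take it as an illustration of the bookkeeping and of the kind of sanity check involved, nothing more.

```python
def max_stack_depth(instructions):
    """Return the deepest stack reached by a straight-line instruction list.

    Each instruction is a hypothetical (name, pops, pushes) triple.
    """
    depth = 0
    deepest = 0
    for index, (name, pops, pushes) in enumerate(instructions):
        if pops > depth:
            # Sanity check: the code would pop more than is on the stack.
            raise ValueError(f"stack underflow at {index}: {name}")
        depth = depth - pops + pushes
        deepest = max(deepest, depth)
    return deepest
```

The result is how many stack slots the activation record needs; getting it too small is the sort of bug that looks like a GC problem much later.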

Ah, BTW, the XML idea that I wrote about in my first diary entry was developed to a nice non-validating parser written using big big big regular expressions (like the RFC821 parser in Mastering Regular Expressions), which is quite fast and very object oriented (given that it is PHP). I wrote some very nice SAX handlers, including the ability to write XML and HTML, and to save and replicate SAX events which is the basis for templates. Maybe some day I'll finish it, in the meanwhile we've tried out the parser and its companion classes with a colleague at work, and he liked it a lot.

I'm going to quit this work in a week (after we finish a big big big project on Monday), then I'll start my thesis. I have not heard anything from the professor, but I hate to pick up the phone as I don't want to be unkind... It should be about adaptively optimizing Smalltalk bytecode; it sounds like a big job, I have to redo some parts of the GNU Smalltalk JIT and add polymorphic inline caches, and then write a lot of Smalltalk code for the optimizers, but it can be done and looks exciting (especially when compared to web programming...).

That's it for today.

Done some work on GNU Smalltalk. The maintainer for the Debian package has ported it to MinGW and I gave him a hand; this was also a good occasion for restructuring the OS dependent stuff and providing neat encapsulations for anonymous mmaps.

I also got several patches for the graphical development environment (aka class browser), and put the FTP clients back in ready-for-release shape.

So now I am a Master... well, I am certainly honored but I frankly don't believe I deserve it. Yes, I maintain sed, which is after all a fundamental piece of a GNU/Linux system, but this is minor work on a fundamentally mature program; and while I strive to make it more mainstream, GNU Smalltalk is still not ready in many fields (of course it is a very nice program and I urge you to contribute!). As to my other contributions, they are just for fun... This portrait makes me a good Journeyer, a title of which I'd be very proud -- you know, it's like seeing my passion for programming in general and free software in particular well recognized.

Maybe I am taking the advogato metric too seriously but I wouldn't mind if people who rated me Master downgrade me. :-)

24 Feb 2003 (updated 24 Feb 2003 at 14:19 UTC) »

I just got certified as one of those people who after all "make free software happen", and I also finally found where you create a diary entry... Dunno how often I'll post here, anyway here is the current situation of my life as a Journeyer (and a little more).

I just forked the 4.1 branch of sed (no release made yet) and will release the hopefully last 4.0.x version soon. It's very nice to see that finally some distros including Red Hat and Debian are adopting the versions of sed that I maintained. 4.1 will be more POSIX compliant and more multibyte-friendly as well.

On the GNU Smalltalk side, I am about to release 2.1 but first I will have to sort out some Cygwin issues with the Tk bindings. I already have lots of ideas for 2.2, including providing a less weird (more declarative) syntax inspired by Dave Simmons' S#, providing a better graphical development environment, more JIT compiler work, and so on; but I guess that, with the new garbage collector, things will take a while before I can stabilize the 2.1 branch and fork it. Well, if I never took a break I would never do a stable release.

I also worked on the 2.0 branch to do some build tool upgrades on behalf of Debian. Luckily many bug reports for 2.0 have been closed, maybe GNU Smalltalk will finally get into testing distributions!

Not much free time to do all this though. Last week was really a hell, being out of home 16-18 hours a day (the rest being spent sleeping), constantly split among volunteering, exams and work, with free software and (more important) social life getting too much in the background. Things start to look better now even though university lessons restarted.

Previously I had done some tangential work on a few other projects, including the very interesting gnulib, which is a collection of common configure snippets and associated source files regarding the LIBOBJs (ranging from regex to replacing missing or broken functions). I picked from GNU Smalltalk an implementation of poll(2) which is based on select(2) (poll is often preferable because it scales better; select sucks when you have loads of file descriptors), and an implementation of the C99 long double math functions (such as cosl). Some real mavens work on this project, including Bruno Haible whom I hold in very high respect, and it was very gratifying to see my contributions accepted with little or no objections!
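The poll-on-top-of-select trick is easy to sketch. The following is Python, not the C replacement function contributed to gnulib, and `poll_via_select` is a hypothetical name: it takes `(fd, events)` pairs using the `select.POLL*` bit masks (Unix only) and translates them to the three descriptor sets that select(2) wants.

```python
import select


def poll_via_select(requests, timeout_ms=None):
    """Emulate a poll(2)-like call with select(2).

    requests is a list of (fd, events) pairs; the result is a list of
    (fd, revents) pairs for the descriptors that are ready.
    """
    want_read = [fd for fd, ev in requests if ev & select.POLLIN]
    want_write = [fd for fd, ev in requests if ev & select.POLLOUT]
    want_except = [fd for fd, ev in requests if ev & select.POLLPRI]
    # poll counts in milliseconds and treats a negative value as "forever";
    # select wants seconds, with None meaning "forever".
    if timeout_ms is None or timeout_ms < 0:
        timeout = None
    else:
        timeout = timeout_ms / 1000.0
    readable, writable, exceptional = select.select(
        want_read, want_write, want_except, timeout)
    results = []
    for fd, _ in requests:
        revents = 0
        if fd in readable:
            revents |= select.POLLIN
        if fd in writable:
            revents |= select.POLLOUT
        if fd in exceptional:
            revents |= select.POLLPRI
        if revents:
            results.append((fd, revents))
    return results
```

The emulation gives callers the nicer interface on systems that lack poll, but of course it inherits select's scaling problems underneath.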

I also have to find the time to polish my assembly language implementation of memcmp and submit it to the inflexible glibc maintainer, Ulrich Drepper. Did some work on regex lately as part of my sed work, and I get his typical response "This is utter nonsense" on fewer posts than I used to... rather I get "This and this is wrong" more and more often, and finally after much sweating "Thanks, I applied the patch".

Have been thinking of a JSP implementation for PHP at work. In five work days I wrote and tested about 3000 lines of PHP and JavaScript code for a web application (a database with 15 tables and I have not reached half the work yet!) which was mostly done by cut&paste because I was in a hurry. Things look the same but they aren't: you have the section which has attachments to each record of the table; the section which lies in the middle of a many-to-many relationship; the section which has more complicated user interaction and needs session; the section which has combo boxes; and so on. Net result: tiny bits of code always change between two pages and the code is badly written and unmaintainable; having really powerful templating and tag library features would surely help -- and PEAR's templates don't count as such. Of course the problem is where I find the time, even at work which is where I use PHP, to do so.
