Older blog entries for sab39 (starting at number 36)

redi: An improvement on your patch would be the ability to do <diary>personname</diary> and have it automatically figure out what their most recent diary was at the time of posting. Since the majority of diary references are to the latest entry anyway (via recentlog), that would save a lot of hassle of looking up what entry number the diary is you're commenting on. That would make <diary> as easy to use as <person>, but without the nasty side effect that the links become outdated over time.
31 Jul 2002 (updated 31 Jul 2002 at 12:59 UTC) »

Fixed the last (known at the time!) regression bug in japicompat and released japitools-0.8.3. Japicompat screams now. Unfortunately I just discovered that it does have one significant regression: if the original file contains entries that come after the last entry in the "new" file, japicompat hangs, spinning at 100% CPU. So I'll have to fix that and do a quick 0.8.4...

Update about 15 mins later

Did that... now it screams and works...


Got the new and efficient japicompat to give me results today. The results weren't exactly right, but I went through all the differences compared to the output of the old code and found only one specific bug. I need to figure out why it thinks that java.security is missing entirely instead of just missing a few classes and methods. I suspect it's to do with the fact that the first class it encounters is one of the missing ones.


The rot13 screen hack did its job for now. I'll be using it again... :)

fxn: I'm glad that emacs supports the feature already, but you can have my vim when you pry it from my cold dead fingers ;)


Documentation is the hardest job to do, especially when you're documenting a horrible mess of specific business rules and have to get it done in a short timespan...

tk: I'm afraid that during the advogato outage I already came up with a solution using off-the-shelf software. Specifically, GNU Screen. It turns out that screen has a little-known and poorly (in fact incorrectly) documented option that is designed to allow substituting international characters on terminals that don't support particular characters. The option is general-purpose enough to allow substituting any character for any other, and therefore adding the following line to my .screenrc produces exactly the effect I wanted (it must be one line with no spaces between the quotes):

termcapinfo xterm 'XC=B%,an,AN,bo,BO,cp,CP,dq,DQ,er,ER,fs,FS,gt,GT,hu,HU,iv,IV,jw,JW,kx,KX,ly,LY,mz,MZ,na,NA,ob,OB,pc,PC,qd,QD,re,RE,sf,SF,tg,TG,uh,UH,vi,VI,wj,WJ,xk,XK,yl,YL,zm,ZM,16,27,38,49,50,61,72,83,94,05'
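For anyone who would rather generate that line than type it, here is a minimal Python sketch that rebuilds the same list of character pairs (rot13 for letters, a shift of five for digits). The function names are made up for illustration, and the `XC=B%` prefix is simply copied from the line above rather than derived from the screen documentation.

```python
import string


def rot13_pairs():
    """Return pairs like 'an', 'AN', 'bo', 'BO', ... for all 26 letters."""
    pairs = []
    for lower in string.ascii_lowercase:
        shifted = chr((ord(lower) - ord("a") + 13) % 26 + ord("a"))
        pairs.append(lower + shifted)
        pairs.append(lower.upper() + shifted.upper())
    return pairs


def rot5_pairs():
    """Return digit pairs in the order used above: 16, 27, ..., 94, 05."""
    return [d + str((int(d) + 5) % 10) for d in "1234567890"]


def build_mapping():
    # The "XC=B%" prefix is taken verbatim from the .screenrc line above.
    return "XC=B%," + ",".join(rot13_pairs() + rot5_pairs())


print("termcapinfo xterm '" + build_mapping() + "'")
```

The output is a single line with no spaces between the quotes, which is what the option needs.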

I was hoping that advogato would come back quickly so I could post the solution before anybody wasted too much time solving it. I hope that you didn't put too much effort into that program :)

bgeiger: I'm going to learn to read rot13, that's how :) I started using this utility on the train last night, and by the end of the train ride this morning I already found that I no longer need to actually type things in to find out what they say - I know enough key letters that I can figure out most words after a little effort. I even was able to recognise a typo - I realised that "hfvas" couldn't possibly be correct and fixed it. Some words I already recognise directly: "be" and "or" swap places in rot13, "gur" is "the" (I can also recognise a bunch of other words starting in "gu", such as "gung", "guvf", "guna", "gurer", "guvat"). I can also recognise "vat" as a suffix ("ing") and I know the letter substitutions for, among other things, all five vowels. I expect that within a week of working in this mode I'll be able to read rot13 pretty easily.
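To check the words mentioned above without squinting, a few lines of Python will do. This is only a sketch using the standard library's `rot_13` codec; the helper name `rot13` is my own.

```python
import codecs


def rot13(text):
    # rot13 is its own inverse, so the same call both encodes and decodes.
    return codecs.encode(text, "rot_13")


for word in ["gur", "gung", "guvf", "guna", "gurer", "guvat", "vat", "hfvas"]:
    print(word, "->", rot13(word))
```

Running it shows why "hfvas" looked wrong: it decodes to "usinf", one letter away from "using".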

Someone on k5 mentioned the polarizing-glasses approach also - the problem with that for my situation is that I'm on an extremely shoestring budget and don't really have the option of going out and buying a special screen filter and special glasses. Plus I expect I'd look a bit silly wearing 3D goggles on the train ;)

The one risk I'm taking using a standard encoding like rot13 is that someone will be on the train who already knows how to read it - I know that such people exist. I think that risk is minuscule enough (along with the parallel risk of someone with a photographic memory and a strong enough desire to know what I'm doing that they'd type it into their computer later and decode it that way) that I'm not going to worry about it. In general on a given train journey there are only one or two people in a position to read my screen even if they wanted to, and of those, probably 90+% don't want to. What are the chances of the one or two exceptions per week (if that) being among the tiny fraction of the Earth's population that can read rot13? I've wanted to learn to read rot13 for a while anyway - learning that gives me a (small) external benefit, where learning my own private encoding would just make it that much harder to learn rot13 separately.


I often find myself working on my laptop on the train, or in another public place. Sometimes I'd like to spend that time writing things that I'd prefer not to be read over my shoulder by random fellow travelers. It occurred to me that this goal could probably be attained with a utility that would rot13 all text on a tty (my laptop is text-mode-only due to being a 486 with 4MB RAM :) ).

However, my searches for software to do this have come up empty so far. I'd even settle for a VIM mode that would accept keystrokes normally, but display everything rot13'd - everything I'm concerned about right now would be in VIM. Is it really true that nobody's come up with a "conceal what you're doing" rot13 program?

Chicago: You can do much better than the three-fold increase in filesize that your solution implies. Off the top of my head, I can see an immediate way to increase the filesize by just twofold-plus-a-small-constant with almost the same benefit as your solution. Since this is entirely thought up in a couple of minutes by someone who's not well-versed in error correction mathematics, I guarantee it's possible to do much better than my solution, too. So don't go ahead and implement my solution - rather take the existence of my solution as a heads-up that there are a lot of ways to improve, and read some literature on the subject to find out what the state of the art is. I'd be interested to know myself how close I was able to get to that!

First, represent a as 10 and b as 01. By using this technique, you can identify any single-bit error, but not correct it - if you get 11, you know that's wrong, but you can't tell whether it's supposed to be a or b. (Choosing 01 and 10 rather than 00 and 11 protects against the kind of error where the channel is dead and always reports zeros, or ones). After reading this full stream, the recipient will have a result that contains three states: definitely a, definitely b, or unknown (we deliberately ignore the possibility of getting errors in two consecutive bits; we can catch that case later).
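A rough Python sketch of just this first step, assuming the stream is handled as a string of '0' and '1' characters; the names `encode` and `classify` and the '?' marker for "unknown" are mine, purely for illustration.

```python
ENCODE = {"a": "10", "b": "01"}
DECODE = {"10": "a", "01": "b"}


def encode(symbols):
    """Turn a string of 'a'/'b' into the doubled bit stream."""
    return "".join(ENCODE[s] for s in symbols)


def classify(bits):
    """Read the stream two bits at a time: 'a', 'b', or '?' for unknown."""
    return "".join(DECODE.get(bits[i:i + 2], "?") for i in range(0, len(bits), 2))


sent = encode("abba")                # "10010110"
print(classify(sent))                # all four symbols known
print(classify("11" + sent[2:]))     # one flipped bit: first symbol unknown
print(classify("01" + sent[2:]))     # two flipped bits: silently reads as 'b'
```

The last line is the double-error case that this step deliberately ignores; it looks like a perfectly valid 'b' at this stage.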

Then, after transmitting your entire stream, send a checksum of the correct data. Use a well-known checksumming algorithm like md5 that is known to have good properties, rather than rolling your own, which almost certainly won't work as well. Transmit the checksum using the same 10 / 01 encoding as the rest of the content.

Here's how the recipient of the data can figure out the corrected data from the original stream in conjunction with the checksum: Assume each "definitely known" bit is correct, but try both values for each unknown bit (in both the data and the checksum). Calculate the checksum of the data and see if it matches the checksum that was sent. This algorithm is O(2^n) in the number of "unknowns" found, but that number is expected to be low. If the number is too high to calculate in a reasonable amount of time, the transmission can be flagged as corrupt and the user can try again. If all has gone well, though, there should be exactly one combination of values for the unknown bits that leads to the checksum matching. Voila, you've now got your correct sequence of 'a's and 'b's.
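Here is a minimal Python sketch of that recovery loop, under a few assumptions of my own: the data is a string of 'a' and 'b', the md5 digest is computed over that string as ASCII, each digest bit is sent as 'a' for 1 and 'b' for 0, and `max_unknowns` is an arbitrary cutoff. None of the names come from anywhere but my head.

```python
import hashlib
from itertools import product

PAIR_TO_SYMBOL = {"10": "a", "01": "b"}
CHECKSUM_SYMBOLS = 128  # md5 produces 128 bits, one symbol per bit


def checksum_symbols(data):
    """md5 of the data, written as 128 'a'/'b' symbols (1 -> a, 0 -> b)."""
    digest = hashlib.md5(data.encode("ascii")).digest()
    bits = "".join(f"{byte:08b}" for byte in digest)
    return bits.replace("1", "a").replace("0", "b")


def encode(data):
    """Data followed by its checksum, all in the 10 / 01 encoding."""
    symbols = data + checksum_symbols(data)
    return "".join("10" if s == "a" else "01" for s in symbols)


def classify(stream):
    """Two bits at a time: 'a', 'b', or '?' for an invalid pair."""
    return [PAIR_TO_SYMBOL.get(stream[i:i + 2], "?")
            for i in range(0, len(stream), 2)]


def recover(stream, max_unknowns=16):
    """Return the corrected data, or None if the transfer must be retried."""
    symbols = classify(stream)
    unknown = [i for i, s in enumerate(symbols) if s == "?"]
    if len(unknown) > max_unknowns:
        return None  # too many unknowns to brute-force: flag as corrupt
    matches = []
    # Try both values for every unknown symbol: O(2^n) in the unknowns.
    for guess in product("ab", repeat=len(unknown)):
        candidate = list(symbols)
        for index, value in zip(unknown, guess):
            candidate[index] = value
        text = "".join(candidate)
        data, check = text[:-CHECKSUM_SYMBOLS], text[-CHECKSUM_SYMBOLS:]
        if checksum_symbols(data) == check:
            matches.append(data)
    # Exactly one combination should match; anything else is corrupt.
    return matches[0] if len(matches) == 1 else None


stream = encode("abbabaab")
damaged = "11" + stream[2:]   # single-bit error in the first symbol
print(recover(damaged))
```

A stream where two consecutive bits flipped has no unknowns to try, so the single candidate fails the checksum and `recover` returns None.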

There are a couple of ways that this can fail to give a corrected stream, but even those cases can be identified. One is mentioned above - too many "unknowns". Another is if there were errors in two consecutive bits, turning a b into a "definite a". That situation will be flagged because no combination of values for the "unknowns" will result in a matching checksum. The other possibility is the astronomically unlikely case that an incorrect sequence of 'a's and 'b's would generate a correct checksum. With a good checksumming algorithm, that should be effectively impossible without also producing a too-large number of errors (and hence "unknowns"), so the transfer would be flagged as corrupt anyway.

As an Englishman living in the US, I'm double-gutted today... I felt that if an England-US final was still possible by today, then it would be the most likely outcome. So much for that idea.

Sing along with me:

We're going home, we're going -- England's going home
We're going home, we're going home, we're going -- England's going home
We're going home, we're going home, we're going -- England's going home
We're going home, we're going home, we're going -- England's going home

Everyone seems to know the line
They say it every ti-i-ime
They just know
They're so sure
That this time we could really make it
We could be a big hit
But I know that we're shit
Cause I remember

Three lions on the shirt
Awful refereeing
Let the Hand of God
Send us home all seething

So many hopes so many dreams
But all those England tea-ea-eams
Couldn't break
With the theme:
Cause I still see those penalties missed
And the refs that we hissed
And how we were all pissed
At England losing

Three lions on the shirt
Another fricking shootout
Beckham lying hurt
Had to get the boot out

Yeah I know you believe
But don't be naive...

Three lions on the shirt
Two Brazilians scoring
When the final comes
I will still be snoring

We're going home, we're going home, we're going -- England's going home
[repeat ad infinitum]

25 Apr 2002 (updated 25 Apr 2002 at 18:16 UTC) »
bjf: See my past diary entry for an idea for a conceptually simple but not-yet-written application. At least, I've never been able to find an existing implementation...

This might also be interesting to habes because it sounds like something that Audacity might be able to implement extremely easily. Or be done using a plugin to Audacity, if such a mechanism exists. It's basically a problem in user-interface design with respect to sounds, and it seems like Audacity has the bulk of that problem solved.

[updated to split paragraphs and to make a link to the relevant past diary entry]


Forgot to grab my laptop on the way out of the door. That's two train-traveling hours of potential japitools hacking that turned into sitting around doing nothing.


dyork: Congratulations! My first son or daughter (uncooperative little thing wouldn't tell us which) is due June 4th. Been busily painting the room and assembling cribs, etc...

From the sad-that-this-is-so-rare department

Gateway supports my right to enjoy digital music legally. Isn't that nice of them? Anyone know if their computers are any good? I'd like to recommend an ethical company to friends who ask about buying computers, but I don't want to recommend a crappy computer...

I guess I'll just keep doing my part by listening to the dude and the cow singing their song...


That seems to be everything I can think of to say right now. Huh.


Nothing much of note has happened with regard to free software, which is why I haven't posted any diary entries despite having advodiary to make it painless. Of course, "nothing to do with free software" doesn't mean nothing at all. I've been busy painting and assembling furniture for the baby's room, and with my "real" job. I'm not sure how appropriate that kind of discussion is for this forum; I know that I don't mind seeing it in other people's diaries, but I do tend to just skim over it. But it's more important to my life than japitools is, so I guess posting a little bit about it even though it's technically offtopic is okay.


I've been working mostly on revamping the first pass of japitools, which is the Japize java program. I've revamped the algorithms for iterating over the classes to ensure that the output is sorted in strict alphabetical order with an exception for java.lang.Object. I also rethought the commandline options and added the capability for Japize to dump its output directly into a .gz file by using Java's GZIPOutputStream class. Finally, I wrote a perl wrapper called japize to (a) allow running the program as japize rather than as "java net.wuffies.japi.Japize", and (b) check the program's arguments for invalid syntax early, so that you don't have to wait for a JVM to start up just to get told that you misspelled "packages".

In the process I discovered a few issues in the jode.bytecode library that Japize relies on for loading classes from zipfiles. I mailed the Jode author about them, and it turns out that the latest Jode in CVS already solves all my problems. Although this code hasn't been publicly released yet, I think I'll definitely be using it once it is: it will allow me to vastly simplify several places where I'm currently fighting against the Jode API rather than working with it.

japitools - mind-numbing details

I just need to make a few more tweaks to Japize, and then it will be time to move on to pass 2: the japicompat perl script that tests APIs for compatibility with each other. Aside from making this use a constant and bounded amount of memory (instead of O(size-of-the-JDK-measured-in-methods), which is pretty huge and growing exponentially every release, it seems) I want to add some more options, like comparing bidirectionally instead of unidirectionally, more generalized filtering of the output, and machine-readable error output instead of human-readable. This last change is in order to support the new planned third pass - japifilter.
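The reason sorted output makes bounded memory possible is that two sorted listings can be walked in lockstep, holding only one entry from each at a time. The sketch below is Python rather than the Perl that japicompat is written in, and it is not japicompat's code: the function name is invented, entries are treated as plain strings in ordinary sort order, and the java.lang.Object exception is ignored.

```python
def missing_entries(original, new):
    """Yield entries of sorted `original` that are absent from sorted `new`.

    Only one entry from each input is held at a time, so memory use does
    not grow with the size of the API listings.
    """
    new_iter = iter(new)
    current = next(new_iter, None)
    for entry in original:
        # Skip entries in `new` that sort before the one being checked.
        while current is not None and current < entry:
            current = next(new_iter, None)
        # Once `new` is exhausted, everything left in `original` is missing.
        if current is None or current != entry:
            yield entry


original = ["java.lang.String", "java.security.Key", "java.util.List"]
new = ["java.lang.String", "java.net.URL"]
print(list(missing_entries(original, new)))
```

The case worth testing is the one where `original` still has entries after `new` runs out, since the loop has to keep reporting them and then stop.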

japifilter will perform the part of the japicompat process that couldn't be done as part of the second pass without using O(N) memory, optionally translate the errors to human-readable form, and add new options like the ability to exclude errors that appear in another file. That will enable me to say "Show all errors between kaffe and jdk1.1 except for those that also appear between jdk1.2 and jdk1.1" - so I can evaluate kaffe's progress to full 1.1 compatibility without false positives for parts of kaffe that are already jdk1.2 compatible instead.
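The "exclude errors that appear in another file" option boils down to a set difference that preserves the order of the first file. A tiny Python sketch of the idea, with a made-up function name and placeholder strings standing in for error lines (the machine-readable format isn't designed yet):

```python
def exclude_known(errors, known):
    """Keep the errors, in order, that do not also appear in `known`."""
    known_set = set(known)  # this is where the O(N) memory goes
    return [error for error in errors if error not in known_set]


# Placeholder strings, not real japicompat output.
kaffe_vs_11 = ["error-1", "error-2", "error-3"]
jdk12_vs_11 = ["error-2"]
print(exclude_known(kaffe_vs_11, jdk12_vs_11))
```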
