Older blog entries for ingvar (starting at number 119)

A new version of genhash (my generic hash table library, written to work rather than to be fast) has been packaged and released.

Bill Clementson recently mentioned in his blog various methods of making sure one's wrists "survive emacs". He missed one method, though, probably because it seems completely counter-intuitive. That said, it has seen me through more-or-less daily emacs use for the last 14-or-so years. The method? "Go beyond touch-typing." My normal finger positioning (once trained in typewriter class, to the point where I could type reasonably fast with a screen between me and the keyboard) is fairly fluid, with my right (stronger) hand usually pressing two keys for each one my left presses. It has some curious drawbacks: I am more likely to transpose characters, especially when they alternate between fingers, and I need to re-sync hands and keyboard at regular intervals. I also need to not have my hands, wrists or arms resting on anything. Apart from that, I use either a SE/FI keyboard layout (at home) or a UK layout (at work), with no swapping of Control or CapsLock (though when typing in X at home, CapsLock is disabled completely).

Part of it is, as mentioned there, choosing a good keyboard. My keyboard of choice is a buckling-spring keyboard; they make a bit of a racket, but I've noticed that I can type on one for somewhere between 4 and 16 hours (depending on how well fed and watered I am before going "into the zone") without noticeable wrist problems. On a "normal" PC keyboard, I need to take a short break every 45 minutes or so, otherwise my forearms start feeling as if they're about to cramp up.

Anyway, when it comes to ergonomics, if it feels good, it probably isn't bad. Experiment with various keyboards (layout, mechanical construction, positions, sizes...) and other things that are somewhat easily changed. Try to stay with a change for at least a week, since all change tends to feel "bad" and "odd" in the beginning. Above all, consult an expert in the area and don't rely on hearsay from the Internet (but do take inspiration from it and find new possibilities to ask said expert about).

22 Oct 2004 (updated 22 Oct 2004 at 23:28 UTC) »

OK, a small (really minor) code change to the netflow reader library. One type declaration (declaring the return value from a READ-BYTE call to be an (UNSIGNED-BYTE 8)) seems to knock the run time down to "comparable with C".

; Evaluation took:
;   16.45 seconds of real time
;   10.88 seconds of user run time
;   1.52 seconds of system run time
;   39,267,241,128 CPU cycles
;   [Run times include 0.59 seconds GC run time]

Same test file as before, so we're down about 0.56 s on plain "chug through the file".
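
For illustration, the shape of the change is roughly this (a made-up sketch for this entry, not the actual library code; the function name is invented):

;; Hypothetical sketch: asserting that READ-BYTE returns an
;; (UNSIGNED-BYTE 8) lets the compiler use fixnum arithmetic
;; throughout, instead of generic integer operations.
(defun read-u32 (stream)
  "Read a big-endian 32-bit unsigned integer from a binary STREAM."
  (let ((result 0))
    (declare (type (unsigned-byte 32) result))
    (dotimes (i 4 result)
      (setf result (logior (ash result 8)
                           (the (unsigned-byte 8) (read-byte stream)))))))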

So, the code for my netflow log reader has been posted to small-cl-src. Once I've spent some effort writing a small(ish) readme and possibly a few more support functions (I am considering one to format IPv6 addresses, as an example, for those who may or may not need it), I'll stick it in a tarball and put it on the web, somewhere.
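
As a sketch of what the IPv6 formatter might look like (a made-up example for this entry, not the code I'll release; it skips the :: zero-run compression):

;; Hypothetical sketch: format a 128-bit integer as eight
;; colon-separated hexadecimal groups, without :: compression.
(defun format-ipv6 (address)
  (declare (type (unsigned-byte 128) address))
  (format nil "~(~{~x~^:~}~)"
          (loop for shift from 112 downto 0 by 16
                collect (ldb (byte 16 shift) address))))

So (format-ipv6 #x20010DB8000000000000000000000001) would give "2001:db8:0:0:0:0:0:1".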

Since it's been strongly hinted at by my elsewhere-readers, I shall give a quick description of the latest for-work CL code I've written (no code yet; getting things released has been punted to the managing directors). First, this is code written for work, at work, using work's equipment, so all I can do is talk about it, for now. Second, the reason this code exists at all is that it was Needed.

So, we extract network data using netflow. This, for a decent-sized network, means lots of data (we're talking GB-per-hour lots, about 21 MB/s as things stand, but there will be more). We had, mid-last-week, three APIs for extracting this: one in C, one in Perl, one in Python. Doing any sort of "massive data munging" in plain C is, at best, somewhat painful (lack of decent built-in data structures being the main pain), so I had been hoping (indeed, I chose the specific netflow daemon we use based on this) that the existing Python API was going to do the trick.

It manifestly did not. On the box where the data resides, it took on the order of 8 minutes just to do the most basic processing of 2 minutes' worth of logs (essentially "extract one record" in a loop). The Perl API was somewhat better: on the same amount of data, it took "only" (sorry, danb) 6 minutes, 49 seconds. The C API does that slurping in 14-15 seconds, so it is "fast enough".

My initial CL "slurp logs, present each entry as an object" version took a whopping 33 s, but a slight refinement first thing yesterday had this down to 17 s. Today, I have been generating some information from the amassed data, and while I find 14 minutes OK for processing 1 h worth of data, I will see what I can do to speed things up, possibly by adding a "re-use existing object" frob to the API. Most (but not all) of the time, I only need a given log entry for the fraction of a second it takes to extract what I want from it (because of some other API design decisions, I need one object per netblock to hang around).
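
The frob would look roughly like this minimal sketch (the class, slots and reader names are all invented for the example, re-using the hypothetical READ-U32 from the sketch above; the real record has rather more fields):

;; Hypothetical sketch: an optional INTO argument lets the caller
;; recycle one record object across the whole loop, instead of
;; consing a fresh instance for every log entry.
(defclass flow-record ()
  ((src-addr :accessor src-addr)
   (dst-addr :accessor dst-addr)
   (octets   :accessor octets)))

(defun read-flow-record (stream &optional into)
  "Read one record from STREAM, filling INTO if supplied."
  (let ((record (or into (make-instance 'flow-record))))
    (setf (src-addr record) (read-u32 stream)
          (dst-addr record) (read-u32 stream)
          (octets   record) (read-u32 stream))
    record))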

I am hoping to be able to either contribute this back to the original writer of the daemon or at least put a tarball with the two most important files (the package definition and the actual code) somewhere.

OK, it seems as if I've managed to whip up a hash table implementation that allows the user to register their own [{eq test}, {hash fn}] pairs (there are four pre-registered, for EQ, EQL, EQUAL and EQUALP). It should be ASDF-installable (the package name is "genhash") and there's a CLiki page in the obvious location.

It's not been exhaustively tested, but it does, indeed, seem to work. If a hash/eq-test pair is registered that doesn't follow "(eq-test a b) => (= (hashfn a) (hashfn b))", things will most probably break in *interesting* ways.
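
A usage sketch (the function names here are illustrative guesses rather than a promise about the exported API; check the package itself before copying this):

;; Hypothetical sketch: register a case-insensitive string test whose
;; hash function agrees with its equality function, then use it.
(register-test-designator 'string-ci
                          (lambda (s) (sxhash (string-downcase s)))
                          #'string-equal)

(let ((table (make-generic-hashtable :test 'string-ci)))
  (setf (hashref "Key" table) 42)
  (hashref "KEY" table))   ; => 42, since STRING-EQUAL ignores case

Note that this pair does satisfy the rule above: if two strings are STRING-EQUAL, their downcased copies are EQUAL, and SXHASH is guaranteed to agree on EQUAL strings.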

The presentation seemed to go well, as far as I can tell. Lots of intelligent questions and some interesting side-tracks to pursue.

I find myself in the bizarre position of trying to convince myself not to write CL code for work (production use). I have this binary log format we need to extract data out of. I have Perl, Python and C APIs at hand, but Python is definitely too slow, and while C is fast, writing the analysis tools we need in C is nowhere near as flexible (implement another generic hash table? do explorative programming? in C? no, thank you). As soon as I've got some speed data out of the Python code, I'll bash up the equivalent in Perl and check (basically, the speed-test program opens the log, then reads an entry and updates a counter; repeat until end-of-log; print counter).
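
For comparison, the CL equivalent of that speed-test loop would be just a few lines (a sketch only; READ-FLOW-RECORD stands in for whatever the reader API ends up being, and is assumed to return NIL at end-of-log):

;; Hypothetical sketch of the benchmark described above.
(defun count-log-entries (path)
  (with-open-file (stream path :element-type '(unsigned-byte 8))
    (loop for record = (read-flow-record stream)
          while record
          count t)))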

I'm testing on 2 minutes' worth of log, which is around 43 MB of data. In C, just reading it and incrementing the counter takes about 14 s, says time(1). The Python version takes 8m14s to process 2m worth of log. This is quite bad for things that we'd (ideally) want to process in real time.

OK, I need to whack things a bit before I can speed-test the Perl version. Aha! It's faster than Python, marginally: 6m49s for a 2m log. Still not fast enough.

Good thing I was moving ahead implementing something that may actually be useful, then.

"Ooops".

Seems as if I am presenting a small thing at next week's Scheme UK meeting. Admittedly, I'll be talking Common Lisp, not Scheme. I hope they will forgive me.

All code generation seems sane. This is a step forward. A tarball is downloadable (complete with ASDF system definition) from here (signed MD5 sums here).

The latest little piece of code is "internal use only" (it's a thingie for generating character templates for an RPG I am slowly causing to exist; the latter is not ready for general release and the former is *quite* useless without the latter; there's some possibly-interesting persistence happening, but that may or may not be generally usable).

Well, it seems as if I had managed to mess up quite badly implementing the code generation for LD, so that's being rewritten. However, I'll probably not manage to finish it all this week (there are bound to be other "interesting" areas of the code generation, see), so I shall simply have to write a sufficiently-ambitious piece of test code (and also try to figure out if I want to do something clever with labels as arguments to DJNZ and friends).
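
The "something clever" needed is that DJNZ (and the JR family) take a signed 8-bit displacement relative to the address after the two-byte instruction, so a label argument has to be resolved into a range-checked offset. A minimal sketch of that resolution (function name made up):

;; Hypothetical sketch: resolve a branch target to a DJNZ displacement.
;; The displacement is relative to the address *after* the two-byte
;; instruction and must fit in a signed byte.
(defun djnz-displacement (pc target)
  (let ((offset (- target (+ pc 2))))
    (unless (<= -128 offset 127)
      (error "DJNZ target ~4,'0X out of range from ~4,'0X" target pc))
    (ldb (byte 8 0) offset)))   ; encode as a two's-complement byte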
