13 Sep 2003 thayer   » (Observer)

Gzip Appending.

So, dear diary, I found something really cool the other day. Apparently gzip lets you append several separately compressed streams to a single file. E.g., the following actually works:

$ gzip < log1 > biglog.gz
$ gzip < log2 >> biglog.gz
$ gunzip < biglog.gz

The result is just like a "cat log1 log2". In fact, you can even tail a gzipped log:

$ foo | gzip >> biglog.gz
$ tail -c +1 -f biglog.gz | gunzip &
$ bar | gzip >> biglog.gz

I haven't had time to look at the gzip code, but this has great implications for the sorts of things I have to work with. For example, I may be able to avoid uncompressing files to work with them and then recompressing them, i.e. I'll need a lot less disk. Considering that we have the weblogs for ny.com back to 1994, that's some serious data to cope with. And that's nothing compared to the terabytes a week at work (Inktomi/Yahoo).
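
The behaviour is part of the format rather than an accident: RFC 1952 defines a gzip file as a series of compressed members, and gunzip decompresses them one after another. A minimal sketch of checking that, and then counting log lines without ever writing the uncompressed data to disk, might look like the following. The scratch directory, the sample log lines and the variable names are made up for illustration.

```shell
# Scratch directory and sample logs (hypothetical names and contents).
tmp=$(mktemp -d)
printf 'GET /a\nGET /b\n' > "$tmp/log1"
printf 'GET /c\n' > "$tmp/log2"

# Compress each log separately, appending the second stream to the first.
gzip < "$tmp/log1" > "$tmp/biglog.gz"
gzip < "$tmp/log2" >> "$tmp/biglog.gz"

# Decompressing the combined file should match a plain cat of the inputs.
combined=$(gunzip < "$tmp/biglog.gz")
expected=$(cat "$tmp/log1" "$tmp/log2")
[ "$combined" = "$expected" ] && echo "same as cat"

# Count requests straight from the pipe; nothing uncompressed is stored.
hits=$(gunzip < "$tmp/biglog.gz" | grep -c 'GET')
echo "$hits"

# Clean up the scratch directory.
rm -rf "$tmp"
```

Appending a new day's log is then just another `gzip < newlog >> biglog.gz`, with no need to recompress what is already there.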

So I'm not sure how I'll use this, but it sure is cool, and it's surprising that I've never noticed it before. Furthermore, a quick survey of my co-workers and friends revealed that no one has seen this before...
