Older blog entries for emk (starting at number 16)

XML-RPC Acceleration: Spent this morning hacking mod_gzip to understand "deflate" compression. If you drop this hack into your webserver (and use a smart XML-RPC client), you'll probably cut your outbound XML-RPC network traffic by a factor of 10.

It's very experimental--I can make it dump core--but it's the start of something moderately nifty. Combine this with boxcarring, and you're beginning to get some decent scalability and throughput.
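
To get a rough feel for why deflate helps so much here, the following is a minimal Python sketch (not the mod_gzip hack itself). It marshals a made-up response with Python 3's `xmlrpc.client` (the successor to `xmlrpclib`) and compresses it with `zlib`. The payload is invented for illustration, and the ratio you see will depend entirely on your own data.

```python
import zlib
import xmlrpc.client


def deflate_ratio(payload: bytes) -> float:
    """Return original_size / compressed_size for a zlib-deflated payload."""
    return len(payload) / len(zlib.compress(payload))


# Hypothetical response: 200 small structs, which marshal to very repetitive XML.
rows = [{"id": i, "name": "item", "price": 10} for i in range(200)]
xml = xmlrpc.client.dumps((rows,), methodresponse=True).encode("utf-8")
packed = zlib.compress(xml)

# Compression must be lossless: the receiver gets the same bytes back.
assert zlib.decompress(packed) == xml
print(f"{len(xml)} bytes -> {len(packed)} bytes ({deflate_ratio(xml):.1f}x smaller)")
```

XML-RPC's markup is mostly repeated tag names, which is exactly the kind of input deflate handles well.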

Binary Data: I also spoke with several of the Flight Gear developers (all very cool folks), and discussed the possibility of XML-RPC without any XML. Basically, a client and server could negotiate away the XML layer, and transmit raw binary data structures to each other. You'd keep all the fun features of XML-RPC (the introspection, the dynamic data, the 750-line clients), but get enterprise-grade RPC when you really needed it.

The hard part of this would be the design, not the implementation. How do you hide the funky new features from the less-advanced clients, and how do you activate the new protocol?

Introspection: I wrote a script called xml-rpc-api2txt. Give it the URL of a server, and it will print out a nicely-formatted interface specification, complete with documentation. I fixed it to play nicely with Meerkat, too--O'Reilly was preformatting some of their documentation strings, which was messing up Perl's formatting commands.

Now, who wants to hack this script to automatically generate C++ and Java classes for a given server? :-)
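
For anyone who wants a starting point, here is a rough Python sketch of the same idea. It is not the xml-rpc-api2txt script, just an illustration built on the standard introspection methods (system.listMethods, system.methodSignature, system.methodHelp). The function name `api_to_text` and the output layout are my own inventions.

```python
def api_to_text(server):
    """Format a server's introspection data as plain text.

    `server` is anything that exposes system.listMethods(),
    system.methodSignature(name) and system.methodHelp(name),
    for example an xmlrpc.client.ServerProxy.
    """
    lines = []
    for name in sorted(server.system.listMethods()):
        signatures = server.system.methodSignature(name)
        if isinstance(signatures, list):
            # Each signature is a list of type names; the first is the return type.
            for sig in signatures:
                ret, args = sig[0], sig[1:]
                lines.append(f"{ret} {name}({', '.join(args)})")
        else:
            # Servers may return a non-list value when no signature is known.
            lines.append(f"{name}(...)")
        for text in server.system.methodHelp(name).strip().splitlines():
            lines.append("    " + text.strip())
        lines.append("")
    return "\n".join(lines)


# Usage (needs a live server that implements the introspection methods):
#   import xmlrpc.client
#   print(api_to_text(xmlrpc.client.ServerProxy("http://localhost/cgi-bin/multi-cgi.cgi")))
```

Generating C++ or Java stubs would start from the same three calls and swap the text formatting for a code template.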

Community: Yikes! Things are really starting to move. I got piles of e-mail today, half of which contained patches and the other half of which contained great ideas.

SourceForge: Is full of bugs.

XML-RPC Problems: Spoke with Adrian about RedHat's experience with XML-RPC. It seems that they've run into a very big problem: round-trip HTTP message latency. They tried to make lots of little XML-RPC calls, and their performance died miserably.

I did some research on HTTP pipelining (which xmlrpc-c already supports), but this doesn't fix the problem. It seems that HTTP always requires a minimum of one round-trip packet per request. So if you've got a 250ms ping time, you can't make more than two XML-RPC method calls per second.

This is totally unacceptable. Oh, sure, you can work around it by reducing your number of function calls to a minimum (which is what the RedHat Network does), but your APIs will still be bletcherous.
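
The latency bound above is simple arithmetic, sketched below in Python. The entry doesn't spell out how the figure of two calls per second was reached. One reading (an assumption on my part) is that each call pays one round trip to open a connection and another for the request and response. The helper name is made up.

```python
def max_calls_per_second(rtt_seconds, round_trips_per_call):
    """Upper bound on strictly sequential calls, ignoring server and transfer time."""
    return 1.0 / (rtt_seconds * round_trips_per_call)


RTT = 0.250  # the 250 ms ping time from the text

# Assumption: a fresh connection costs one round trip to open, one for the request.
print(max_calls_per_second(RTT, 2))  # 2.0

# Assumption: an already-open connection needs only the request round trip.
print(max_calls_per_second(RTT, 1))  # 4.0
```

Either way the ceiling is set by the network, not the server, so the only real fix is to make fewer round trips.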

A Proposed Fix: Let's say you're invoking a simple-minded addition function:

>>> import xmlrpclib
>>> multi = xmlrpclib.Server("http://localhost/cgi-bin/multi-cgi.cgi")
>>> multi.sample.add(1, 1)
2

Now, if you need to perform a zillion additions, this is really going to suck. But since XML-RPC is so dynamic, there's no reason why you can't just do something like this:

>>> add_2_2 = {'methodName': 'sample.add', 'params': [2, 2]}
>>> add_4_4 = {'methodName': 'sample.add', 'params': [4, 4]}
>>> multi.system.multicall([add_2_2, add_4_4])
[[4], [8]]

On the back end, my new server code unpacks this transparently, calls all the appropriate handlers, and packages up the result.
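
The unpacking step is small enough to sketch in a few lines of Python. This is an illustration of the idea, not the actual server code. The handler table is hypothetical, and reporting a failed call as a fault struct is my assumption about what a reasonable server would do.

```python
def add(a, b):
    return a + b


# Hypothetical handler table; a real server would register its own methods.
HANDLERS = {"sample.add": add}


def multicall(calls):
    """Run each {'methodName', 'params'} struct and collect the results in order."""
    results = []
    for call in calls:
        try:
            handler = HANDLERS[call["methodName"]]
            # Wrap each result in a one-element list so that it cannot be
            # confused with a fault struct.
            results.append([handler(*call["params"])])
        except Exception as exc:
            # Assumption: report a failed call as a fault struct and carry on.
            results.append(
                {"faultCode": 1, "faultString": f"{type(exc).__name__}: {exc}"}
            )
    return results


add_2_2 = {"methodName": "sample.add", "params": [2, 2]}
add_4_4 = {"methodName": "sample.add", "params": [4, 4]}
print(multicall([add_2_2, add_4_4]))  # [[4], [8]]
```

The whole batch travels in one HTTP request, so a zillion additions cost one round trip instead of a zillion.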

If you're a serious RPC hacker, I'd really appreciate some feedback.

On a related note, Ryan Dietrich is hacking on an asynchronous XML-RPC message queue. Way cool!

XML-RPC: The XML-RPC HOWTO has now become an official part of the Linux Documentation Project! You can find it in the usual place. The PDF and PostScript versions are still broken. This is being investigated.

(You can tell from my bubbly enthusiasm that this is my first submission to the LDP.)

I want to follow RedHat's lead on XML-RPC/SSL, but I need some good crypto advice.

XML-RPC: Hack, hack, hack.

I've added support for HTTP Basic Authentication to xmlrpc-c. This required refactoring the client library to use server objects (instead of just URLs). This enhancement turned into a nightmare debugging job, because the following software sucks more than strictly necessary:

  • PHP: Every time I write a non-trivial PHP program, I spend 15 minutes wondering why some variable is unset when it should contain data. Then I realize that I've left out a global declaration, and PHP has silently created an empty local variable shadowing some global.

  • w3c-libwww: Libwww has this weird, asynchronous callback model, and the API isn't especially well documented. Once you define a callback, libwww feels entitled to call it at the strangest of times--during some failed network operations (but not others), during library shutdown, and whenever you look at it cross-eyed. And sometimes libwww will pass you a "200 OK" when it hasn't even made a network connection, and your chunk pointer will mysteriously be NULL. So my code is now laced with incredibly paranoid assertions, and basically trusts nothing returned by the library.

And let's not even ask which festering ball of bugs was truncating five bytes off the bottom of every XML document. But at least everything now works, and passes my evil test suites once again.
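
For the curious, the "server object" idea mentioned above can be sketched in Python rather than C. This is not the xmlrpc-c API. The class name, fields, and example URL are all made up. The only real rule in it is how HTTP Basic Authentication builds its header: base64 of "user:password".

```python
import base64
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServerInfo:
    """Hypothetical 'server object': a URL plus optional credentials."""

    url: str
    user: Optional[str] = None
    password: Optional[str] = None

    def headers(self):
        """Return the extra HTTP headers to send with every request."""
        if self.user is None:
            return {}
        token = f"{self.user}:{self.password or ''}".encode("utf-8")
        return {"Authorization": "Basic " + base64.b64encode(token).decode("ascii")}


server = ServerInfo("http://localhost/RPC2", "Aladdin", "open sesame")
print(server.headers())  # {'Authorization': 'Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=='}
```

Once the credentials live with the URL, every call through that server can pick them up without the caller passing them around, which is presumably why a bare URL string stopped being enough.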

Jade: While I'm ranting about broken tools, I should mention Jade (and JadeTeX, and TeX, and all the other hairy DocBook tools). This stuff is impossible to set up. It's badly broken in RedHat 6.2. It requires you to edit TeX *.ini files.

If you're not careful, you'll start mumbling the following nonsense in your sleep:

pdftex -ini -progname=pdfjadetex pdflatex.ini
pdftex -ini -progname=pdfjadetex \&pdflatex pdfjadetex.ini
# restore the original pdflatex.fmt file:
pdftex -ini -progname=pdflatex pdflatex.ini

(Thanks to Adam Di Carlo for figuring this out.)

It's just unbelievable how bad this stuff is. TeX allocates everything in fixed-size buffers, and hard-codes the buffer sizes into all of its format files. Jade spews out ASCII NULLs, causing JadeTeX to cough up a lung. TeX fmtutil wants to have a little talk with you, and find out where JadeTeX hid all the *.sty files. And of course, let's not forget the joy inherent in editing SGML CATALOG files to work around RedHat typos.

But the resulting manuals are fairly pretty... :-)

XML-RPC: Zope exports a complete scripting API using XML-RPC. But to use it, your xmlrpc client library needs to support HTTP Basic Authentication (or maybe cookies, depending on how the server is configured).

xmlrpc-c doesn't support any of these things, yet. But let's see if I can figure out how to do them with w3c-libwww...

XML-RPC: Wrote up a little XML-RPC HOWTO, with lots of examples in a bajillion different languages. Added support for CGI-based XML-RPC servers, too, because ISPs typically hate daemons.

Here's a nice little Perl client from the HOWTO, written using Ken MacLeod's Frontier::RPC2 package:

use Frontier::Client;
$server_url = 'http://betty.userland.com/RPC2';
$server = Frontier::Client->new(url => $server_url);
$name = $server->call('examples.getStateName', 41);
print "$name\n";

Took a look at SOAP (a related protocol), and decided that the W3C overdesigns everything to the point of insanity. So for now, I'm going to take Larry Wall's(?) "bear of little brain" approach: If I can't figure it out in half an hour or so, it probably sucks.

School: Latin quiz tomorrow. Ick!

17 Jan 2001 (updated 17 Jan 2001 at 18:46 UTC) »
XML-RPC: Ho, hum. Lots of downloads, but almost no feedback. I added a C++ client API, and discovered all the wonderful new features which have been added to the language in the past few years. (Sarcasm intended.) But several major users like the C++ API, so that's cool.

I also wrote some scripts that build a complete, tested distribution in one step--tarballs, LSMs, RPMs, etc. This required hacking around with RPM's _topdir macro.

What's next? I dunno, since user feedback has been so skimpy. Either an Apache module, or a "Building a Client/Server Protocol in 15 Minutes Using XML-RPC HOWTO".

Car: I figured out why my car heater was broken. Get this: dodgy fan, dead heater core, chronic low pressure in the coolant system, dodgy thermostat, and a crack in the radiator cap gasket. Fixed most of it with off-the-shelf parts, because I don't want to pay for a rent-a-car while mine sits on some mechanic's lot. Other care and feeding included new wipers, window moisture repellent and a car wash.

Personal: It's time to restructure my schedule. This means daily exercise, plenty of solid productivity time, and time off in the evenings. I've done this for long periods of time before, and it makes me happy. I'm sick of erratic schedules and conflicting responsibilities, so it's time to shape things up.

7 Jan 2001 (updated 7 Jan 2001 at 00:14 UTC) »
XML-RPC for C: Whooo! I just spent the holidays hacking on an open-source implementation of XML-RPC in portable C. You can find more information on the SourceForge page.

To make a long story short, XML-RPC packages up a procedure call as XML, sends it to a server using HTTP, and parses the response. It's mostly useful for gluing together distributed web applications, but it's also handy for cross-network scripting. There are XML-RPC implementations for Python, Perl, PHP, Zope, REBOL, LISP, Dylan, Visual Basic, and a bunch of other languages.
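
To see what "packages up a procedure call as XML" looks like on the wire, here is a small sketch using Python 3's `xmlrpc.client` (the successor to `xmlrpclib`) rather than the C library. No network is involved. The response is built by hand to stand in for a server's reply, and `sample.add` is just an example method name.

```python
import xmlrpc.client

# Marshal a call to a hypothetical sample.add(1, 1) method into a <methodCall>.
request_xml = xmlrpc.client.dumps((1, 1), methodname="sample.add")
print(request_xml)

# A server would reply with a <methodResponse>; build one by hand here.
response_xml = xmlrpc.client.dumps((2,), methodresponse=True)

# Parsing the response gives back the return value as a one-element tuple.
params, method = xmlrpc.client.loads(response_xml)
print(params[0])  # 2
```

The HTTP part is just a POST of `request_xml` to the server's URL with a `text/xml` content type.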

Thesis: Now that Christmas is over, it's time to start work on my thesis again. I've successfully modified the d2c compiler to build dispatch tables; now I just need to compress them and write about the whole experience. :-)

13 Sep 2000 (updated 6 Jan 2001 at 23:58 UTC) »

[ Ranting about the Client From Hell snipped. ]

12 Sep 2000 (updated 17 Jan 2001 at 18:58 UTC) »

CustomDNS: OK, it's time to finally fix the authentication system. The problem: the CustomDNS authenticator wants to use its own database tables, not my customer's. The solution: fix CustomDNS to load the relevant SQL queries from the configuration file.

*hack hack hack* Score! It runs! One more billable feature. :-)
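
The approach of loading SQL from configuration can be sketched in Python as follows. This is not CustomDNS code. The config section, key, table, and column names are all invented, and an in-memory SQLite database stands in for the customer's real one. The plain-text password comparison is only there to keep the example short.

```python
import configparser
import sqlite3

# Hypothetical configuration: every name in it is invented for illustration.
CONFIG_TEXT = """
[auth]
password_query = SELECT password FROM customers WHERE username = ?
"""


def load_queries(text):
    """Read the [auth] section and return its queries as a plain dict."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return dict(parser["auth"])


def check_password(db, queries, username, password):
    """Look the user up with whatever query the config file supplies."""
    row = db.execute(queries["password_query"], (username,)).fetchone()
    return row is not None and row[0] == password


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (username TEXT, password TEXT)")
db.execute("INSERT INTO customers VALUES ('alice', 'secret')")

queries = load_queries(CONFIG_TEXT)
print(check_password(db, queries, "alice", "secret"))  # True
```

The authenticator never needs to know the customer's schema, only that the configured query takes a username and returns one column.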

Language hacking: I cornered the new professor last night, and asked him to act as faculty advisor for my project. We had a great talk about multimethod dispatch, efficient bounds-checking, whether scripting languages can be compiled, and other groovy stuff like that.

I hope I didn't scare the poor professor away. :-( I can be a bit too talkative, as several people have pointed out.
