18 Mar 2003 dajobe   » (Master)

Raptor and web libraries

Unlike in Java, Perl, Python and all those higher level languages, in C there is a lot more to do when you want something like retrieving a web page. There are no stdurl or stdweb libraries that you can assume are always available. Since raptor is a parser for an XML language, libxml is one library that is likely to be usable, and it has a tiny HTTP implementation, sufficient for GET. There is also the de facto portable web library libcurl, so I made that a configurable option too, plus the W3C libwww, which is common but rather large. So problem solved.

Or so I thought. It turns out that all those APIs except for the W3C libwww are push: they take the thread of control from the caller and return data to it via callbacks. However, I wanted the more I/O stream-like pull, i.e. the user application does while(...) { get stuff; do stuff }. You can wrap a push API around a pull one quite easily and efficiently, but not the other way around: to offer pull on top of push you need to store all the pushed content and then deliver it pull-by-pull. So I'm going to have to live with that: provide both and warn users that the pull interface will suck up memory.
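
To make the asymmetry concrete, here is a minimal sketch in C. None of these names come from raptor, libxml, libcurl or libwww; pull_source, chunk_handler, pull_buffer and the functions around them are all made up for illustration, and an in-memory string stands in for the network. Push on top of pull is a loop with one small fixed buffer, while pull on top of push has to hold everything the callback was given before the caller can read a single byte.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical pull-style source: hands out a fixed string a few bytes at a time. */
typedef struct {
  const char *data;
  size_t len;
  size_t pos;
} pull_source;

/* Copy up to max bytes into buf; returns the count, 0 at end of data. */
size_t pull_read(pull_source *s, char *buf, size_t max) {
  size_t n = s->len - s->pos;
  assert(max > 0);
  if (n > max) n = max;
  if (n > 0) memcpy(buf, s->data + s->pos, n);
  s->pos += n;
  return n;
}

/* Callback type in the style of push APIs: called once per chunk. */
typedef void (*chunk_handler)(void *user, const char *chunk, size_t len);

/* Push on top of pull: a plain loop, needing only one small fixed buffer. */
void push_from_pull(pull_source *s, chunk_handler h, void *user) {
  char buf[8];
  size_t n;
  while ((n = pull_read(s, buf, sizeof buf)) > 0)
    h(user, buf, n);
}

/* Pull on top of push: every pushed chunk is stored until the caller asks for it. */
typedef struct {
  char *data;
  size_t len;
  size_t pos;
} pull_buffer;

/* The callback given to the push API; it appends each chunk to the buffer. */
void store_chunk(void *user, const char *chunk, size_t len) {
  pull_buffer *b = user;
  char *grown;
  if (len == 0) return;
  grown = realloc(b->data, b->len + len);
  if (grown == NULL) abort();            /* sketch only: no real error handling */
  memcpy(grown + b->len, chunk, len);
  b->data = grown;
  b->len += len;
}

/* The pull call offered to the user, served from the stored content. */
size_t buffer_read(pull_buffer *b, char *buf, size_t max) {
  size_t n = b->len - b->pos;
  if (n > max) n = max;
  if (n > 0) memcpy(buf, b->data + b->pos, n);
  b->pos += n;
  return n;
}

/* Runs text through pull -> push -> buffered pull; returns 1 if the bytes
   survive intact. *buffered receives how many bytes were held at once. */
int roundtrip_matches(const char *text, size_t *buffered) {
  pull_source src = { text, strlen(text), 0 };
  pull_buffer buf = { NULL, 0, 0 };
  char out[5];
  size_t n, seen = 0;
  int ok = 1;

  push_from_pull(&src, store_chunk, &buf);  /* the push side runs to completion */
  *buffered = buf.len;

  while ((n = buffer_read(&buf, out, sizeof out)) > 0) {  /* the user's pull loop */
    if (memcmp(out, text + seen, n) != 0) ok = 0;
    seen += n;
  }
  free(buf.data);
  return ok && seen == strlen(text);
}
```

Calling roundtrip_matches() on any string reports the whole length in *buffered, which is the memory cost mentioned above: push_from_pull() never needs more than its 8-byte buffer, but the buffered pull needs room for the entire document.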
