17 Mar 2003 raph   » (Master)

Modular factoring

I haven't gotten much response to my last post on factoring codebases into smaller modules, but I have thought about the problem a bit more.

The first item is the desire to have a common runtime discipline that spans more than one module. The main problem here is that the C language doesn't nail down particular runtime decisions. In our case, the main things we need are memory allocation (one thing the standard C library's malloc/free doesn't give us is a way to constrain total memory usage - for example, so that an allocation made near capacity causes items to be evicted from caches), exceptions, and extremely basic types such as string (C strings are inadequate because in many cases we do need embedded 0's), list, dict (hash table), and atom. A great many languages supply these as part of the language itself or as part of the standard runtime, but C is not among them.
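To make that concrete, here is a minimal sketch of two of those pieces - a counted string that tolerates embedded 0's, and an allocator whose total usage is capped so that a near-capacity allocation can trigger cache eviction. All names here are hypothetical, not the actual Fitz API:

    #include <stddef.h>
    #include <stdlib.h>
    #include <string.h>

    /* Counted string: the length is stored explicitly,
       so embedded 0 bytes are handled fine. */
    typedef struct {
        size_t len;
        char *data;
    } rt_string;

    /* Allocation context with a hard byte budget.  When an
       allocation would blow the cap, evict() gets a chance to
       free cache entries before we give up. */
    typedef struct rt_ctx rt_ctx;
    struct rt_ctx {
        size_t used, cap;
        void (*evict)(rt_ctx *ctx, size_t needed);
    };

    void *rt_malloc(rt_ctx *ctx, size_t size)
    {
        if (ctx->used + size > ctx->cap && ctx->evict)
            ctx->evict(ctx, size);      /* shrink caches, then retry */
        if (ctx->used + size > ctx->cap)
            return NULL;                /* budget genuinely exhausted */
        ctx->used += size;
        return malloc(size);
    }

    void rt_free(rt_ctx *ctx, void *p, size_t size)
    {
        ctx->used -= size;
        free(p);
    }

    rt_string rt_string_new(rt_ctx *ctx, const char *buf, size_t len)
    {
        rt_string s;
        s.data = rt_malloc(ctx, len);
        s.len = s.data ? len : 0;
        if (s.data)
            memcpy(s.data, buf, len);   /* embedded 0's copied verbatim */
        return s;
    }

The point of threading an rt_ctx through every call is exactly the discipline question: every module that allocates has to agree to take one.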

Of course, the fact that C doesn't nail down the runtime is in many ways a feature, not a bug. Different applications have different runtime needs, and a single general-purpose runtime is not always optimal. Perhaps more importantly, these richer runtimes tend not to be compatible with each other. In the case of Fitz, we need to bind it into Ghostscript (written in C with its own wonky runtime), Python test frameworks, and hopefully other applications written in a variety of high-level languages.
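Exceptions are a good illustration of why these disciplines don't compose. In C they're typically built on setjmp/longjmp, and every module between the throw and the catch has to play by the same rules or it leaks whatever it had allocated. A minimal sketch of the idea (again, hypothetical names; a real runtime would keep a stack of frames rather than one global jmp_buf):

    #include <setjmp.h>
    #include <stdio.h>

    static jmp_buf rt_catch_buf;   /* one frame only; real code nests these */

    #define RT_TRY     if (!setjmp(rt_catch_buf))
    #define RT_CATCH   else
    #define RT_THROW() longjmp(rt_catch_buf, 1)

    static void parse(int broken)
    {
        if (broken)
            RT_THROW();   /* unwinds straight to the nearest RT_TRY */
    }

    int main(void)
    {
        RT_TRY {
            parse(1);
            puts("parsed cleanly");
        } RT_CATCH {
            puts("caught a parse error");
        }
        return 0;
    }

Two libraries each doing this with their own jmp_buf stack can't safely throw through each other's frames, which is precisely the incompatibility I'm worried about.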

In any case, with regard to the specific question of whether we're going to split our repository and tarballs into lots of small modules or one big one, for now I've decided to go for the latter, but with clear separation of the modules into subdirectories. That should preserve our ability to easily split into separate modules should that turn out to be a clear win, while making life easier for the hapless person just trying to compile the tarballs and get the software to run.
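Purely as an illustration (none of these names are final), the tree would look something like:

    ghostscript/
        fitz/        the new graphics library, buildable on its own
        ...          other modules, one subdirectory each
        Makefile     a single build for the whole tarball

so splitting later is just a matter of promoting a subdirectory to its own repository.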

BitTorrent

BitTorrent absolutely rocks. Basically, it gives you a way to host large downloads (large files, large numbers of downloaders, or both) without chewing up too much of your own bandwidth. Instead, downloaders share blocks with each other.

I think this has killer potential for Linux distributions and the like. I know most servers hosting Red Hat 8.0 were seriously overloaded when it came out. I think BitTorrent could be a far more effective way to get the ISOs out than standard HTTP/FTP. Of course, Red Hat probably won't push this, because much of their business model is founded on the relative slowness and inconvenience of public FTP servers as opposed to their pay service.

There's also a lot of potential in wiring BitTorrent into package downloaders such as apt and rpm. Some of the folks on #p2p-hackers think that WebRaid might be a better solution, but in any case I can see BT working well.

We're going to try distributing Ghostscript using BitTorrent, and see how it works.

These are legitimate (and very important) uses of BitTorrent, but it's most likely that the next big jump in popularity will come from other quarters. BitTorrent excels at serving up gigabyte-scale files with good performance and robustness, with minimal bandwidth and infrastructure needs. It shouldn't take a genius to figure out what this will get used for. The exciting (and scary) part is that Bram might soon find himself with millions of users.
