Older blog entries for raph (starting at number 221)

I just got back from a one-day Artifex staff meeting in Las Vegas. Tonight's entry will be light.

Thanks to everyone for the positive feedback on my diaries. I am having fun writing them. Similarly, I hope David's self-proclaimed thin patch is shallow and brief. I particularly enjoy the interplay between his blog and mine.

gary: your work on mod_virgule and Advogato is very much appreciated, but please don't feel that you have an obligation. If your hands still need to heal, give that priority. But of course if you do some work, I will be happy.


Mamading Ceesay wrote me a nice email calling me on my inability to find references from Psyco. It turns out that Psyco is inspired by the Fabius compiler. I'm not directly familiar with this work, but I do know Peter Lee from my ML days. He's done some ML performance work, including a rather interesting Web server written in ML. It's good that Armin is reading the literature.

A couple more haiku

AFPL: free for use and change / but if you charge money / we get a fair cut
SCSL: we give you the source / but we control the process / for better or for worse

(I slightly prefer these versions over the strict 5/7/5 variants)


Most of the evening was spent in the ER, because Max had burned himself with a pair of tongs. As far as we can tell, he stuck them in the flame on the stove, then touched them to his cheek. We spent over two hours waiting, then the doctor looked at him for about 10 seconds. In any case, it was only a partial thickness burn, so with some antibiotic cream it should heal up nicely.

Stamp idea from Bram

Bram gave me a really nice gift this afternoon: an optimization for the bandwidth and storage consumed by stamps. Basically, his idea starts with stamps of value 2^k, then lets them get split in half up to k times.

In my previous thinking, a stamp contains some random bytes, an issuer id, connection info for the issuer (ip address and port), and an expiration time. When a node creates a new stamp, it uses /dev/random or whatever to make the bytes. Then it stores the stamp in its own database, and "issues" it to a peer, who may then trade it around the net. Finally, it gets sent back to the issuer for "redeeming". At that point, the issuer checks to see that the bytes are in the database (and haven't been redeemed already by someone else).

Bram's idea is simple (one of those "why didn't I think of that myself" ones). Stamp entropy is the root of a k-high tree. For each node in the tree with entropy k, the left child is AES_x(0), and the right child is AES_x(1). Now, instead of an opaque random block, you have a tree id (can be random; is shared by all nodes in the tree), a path (sequence of 0 and 1 bits to choose left and right nodes, respectively), and the result of the hash.

It's a nice technique, and it'll almost certainly be in the prototype when I finally release it. But what makes it a nice gift is that Bram obviously understands what stamps are all about and has been thinking about them.

Haiku licensing

Aaron Swartz has beautiful haiku explaining free software licenses (there's another one linked, but it's not relevant to free software):

MIT: take my code with you / and do whatever you want / but please don't blame me
LGPL: you can copy this / but make modified versions / free in source code form
GPL: if you use this code / you and your children's children / must make your source free

Aaron has placed these in the public domain. In other words:

I could even lie / and say I wrote it myself / the author cares not

Big companies

Somebody (I don't remember who, they've scrolled off the recentlog) said "that large corporations and free software are interested in" me. Sure, I can't help the fact that my altruism benefits these kinds of entities in addition to real people, but the fact doesn't excite me. When dealing with business entities, I prefer a business relationship; I provide them with something of value, they pay me money.

Free software licenses don't promote this goal, but dual-licensed GPL libraries are consistent with it. I recommend the approach highly, and plan to do most of my future work under it.

Graphs are not a precise model for software food chains

David McCusker writes:

Another exception related to performance occurs with caching. A reverse proxy server depends on an origin web server for content, and yet can serve that content faster than the origin web site when either of two things occurs. First, the proxy might be on a better network than the origin web site. And second, the proxy might have content cached most directly in the form for serving.

Don't get me wrong, I'm not overly attached to my theory, but I'm not sure this counts as an exception. Yes, the proxy depends on the origin server for content, but it might also make sense to say that the origin server depends on the proxy for efficient distribution of that content. Here, I don't think a graph edge says enough about the relationship.

Fault-tolerant techniques like RAID are a clear example, though. Here, the dependency relationship is a clean edge. The disk is clearly at a lower level than the RAID cluster, yet disk failure is hidden. Disk bandwidth has a linear relationship with bandwidth at the higher level, but the cluster has better bandwidth than the individual disk. When it comes to latency, though, everything is hard-limited by the disk.

There's another aspect in which simple graphs aren't a powerful enough model. A simple Unix utility may have only two dependencies: to a kernel, and to a C implementation. However, the kernel could be Linux or BSD, and the compiler could be (say) gcc or lcc. So we have four edges, but they're not all the same. The real structure is two groups of two.

An analogous situation occurs in food webs. To a first approximation, food sources are interchangeable. But again, there is structure. Humans need both starch and protein. It doesn't matter whether the starch is wheat or rice, but a diet of starch and meat is considerably more complete than one of two starches.

So a graph is an interesting starting place, I think, but it is an oversimplification. It would be very interesting to see whether people trying to apply network theory to other domains are developing more sophisticated models. I'm not aware of any.

Psyco vs Parrot

I feel that my last entry was a bit unfair. After all, Psyco and Parrot are both speculative projects with the potential to vastly improve the performance of dynamic scripting langauges, but also with a significant risk of failure.

Even so, my gut feeling is that Parrot has a solid chance of success, while Psyco is much more speculative. Parrot is based on fairly conservative ideas. There's a lot of literature on virtual machines, and the Parrot people seem fluent in it. By contrast, the Psyco webpage and distribution contain absolutely no references to any related work (that I could find). Maybe it's just my academic background, but it strikes me as a red flag.

In any case, I wish both the Parrot and Psyco teams huge success. Having to choose between multiple high-performance dynamic languages would be a good problem to have.


Thanks to David McCusker for carte blanche to beat him up. As he no doubt knows, it's not exactly my style, though.

Of course it's not useful to decide whether something is "necessary" on the basis of computational equivalence. S and K are necessary, the rest is a matter of notational convenience. (I is for weenies).

So let me try to state the question in more precise terms:

Is it possible to use a much simpler technique to accomplish nearly the same results, nearly as well?

Applied to the Mithril and SEDA/Java runtimes, the answer depends on how you count. The Java runtime is a horrifically complex beast. If you included it, then I would agree that Mithril is much simpler. But it's also more or less a solved problem (less, admittedly, if you restrict yourself to free software). If you count only the SEDA part, it's quite a bit simpler, because you have the GC daddy taking care of cleaning up all the memory.

I'm not arguing one way or another as to whether you should include the Java runtime. I think it's a good question.

The food chain

Now for some more general musings. The above emphasizes that software systems today are not isolated artifacts, but exist as part of a larger network. Dan Sugalski uses the term "food chain" to refer to this network in a recent interview. Coincidentally, Barabasi in his book "Linked" finds that food chain networks and the Web have similar scale-free topology.

Like the food chain, the software project network has one-way links. Mozilla depends on C++, but C++ does not depend on Mozilla. (Of course, as in real food chains, all kinds of complex interactions can occur once you start looking at second-order effects). I think it's worth looking at the properties of these links in more detail. What follows is a rough cut. I'm sure I didn't think of everything.

In an "A depends on B" relationship, B is always more focussed on performance, robustness, and stability than A. If B is lacking in any of these, then the combined A+B system is also. (Fault tolerance would seem to be the exception that proves the rule.) At the lower levels of the food chain, you see an almost fanatical attention to these issues. CPU's come to mind, especially. The price of a 2.53GHz Pentium 4 is currently more than double that of a 2.2GHz one, for a 15% speed improvement at best. Yet, for many applications, performance is measured in terms of order of magnitude.

Compilers, runtimes, and OS kernels all similarly have an intense focus on performance and stability. Of course, the need for these things varies. Windows 98 is woefully inadequate to run, say, an Oracle database, yet is considered a perfectly fine platform for gaming.

The higher levels of the food chain start to be concerned with user needs, which are famously complex. If you had to meet these needs and optimize performance to the bone, the problem becomes intractable. That's why the food chain exists. Filesystems are a lot closer to user needs than raw disk. Thus, applications rely on the filesystem to map performance and robustness reasonably well, and save a huge amount of complexity. Once this relationship is established, then quantitative work at the lower level (improving caching algorithms, say) has a positive effect on the entire system. Note, however, the downward pressure when the next level up is a database.

Just as the food chain is critical for the health of an ecosystem, I believe the software project network is critical for the health of a software community. We are blessed with many good things at the lower levels: the Linux kernel, gcc, etc. They're not perfect, of course, but they don't prevent you from doing good work at the higher levels either.

A good example of dysfunction, I think, is X during the late '80s and early-to-mid '90s. Here, the lowest level (the X protocol and Xlib) was reasonably good, but the next levels up (Xt and Motif) were awful. As a result, X failed to get attention at the lower levels too, and lots of things that should have happened didn't (better font handling, more advanced imaging model). Now, there are numerous healthy projects situated at the next level up from X (Gtk+ and Qt are the most popular), and we see X getting good quantitative improvement now too.

Now I'll make some people mad. In language communities, these relationships are especially important. Everything written in a language depends critically on the performance and stability of that languages's implementation. Thus, there is an important role for a project to do hardcore quantitative performance work, combined with a process that ensures stability. If that role isn't strongly filled, I think the language community is in trouble. And here is what I see: in Python, this role isn't really being filled, but in Perl 6 it looks like it will be, by Parrot, the subject of the Sugalski interview above.

That said, the dependence on the language implementation for performance may not be as critical in the Python community as it is in others, because the CPython+C amalgam is a reasonable approach in many cases. But if Perl 6 and Parrot succeed, it will be appealing to write high performance systems entirely in the high level language. And there are some things happening in Python performance land as well.

Food for thought, in any case.

20 Jun 2002 (updated 20 Jun 2002 at 06:32 UTC) »
bgeiger: yes, expiring certs after a year was always part of the plan, I just haven't gotten around to implementing it yet.

Joel on Software has a nice piece on how large corporations are adopting free software. Perhaps esr's dream is becoming realized after all, just a bit more slowly than he predicted. In any case, I don't care much about big corps and free software. The lifeblood of the movement is still individual, human people, and will probably remain so for quite some time.

The subtext of yesterday's async musing was to express skepticism that all the city/fiber/dock/boat mechanism in Mithril was actually needed. But, of course, not having seen the design in detail, I can't really criticize it.

Async runtime musings

Matt Welsh's SEDA is very, very interesting. I especially like the quantitative flavor. Often, that's what makes the difference between expressing one's taste and doing real science.

Matt chose Java for his implementation. On the surface, this is a strange choice. For one, until quite recently, Java didn't even have a nonblocking I/O library. Matt had to implement his own, using JNI. The path of least resistance is to use a thread per connection, which has scaling problems as the number of connections grows large. Of course, it's possible to mitigate these scaling problems by throwing more hardware at the problem. Now why would Sun do a thing like that?

Anyway, SEDA is a fairly sophisticated runtime. Each stage is implemented with some number of threads. The number of threads is dynamically tuned (by "controllers") to optimize things like utilization and flow control. Many asynchronous designs have a much more straightforward mapping between threads and the tasklets (or "fibers", in McCusker's terminology). For example, a lot of people use select() to get one thread, one process. Alternatively, people who sell high-end hardware encourage the use of one thread per tasklet. But neither is truly satisfactory. A select()-based server will block on many important tasks, including file I/O, and is also sensitive to individual event handlers taking too long. It also won't get any gain from multiple CPU's.

SEDA solves this, but places very high demands on the runtime. In particular, you have to deal with very dynamic patterns of memory allocation. In a performance-oriented context, it's tempting to do this memory management (at least partly) by hand. But in an environment as dynamic as a typical SEDA server, it gets really tricky. Not to mention that malloc() and free() themselves start sucking when multithreaded (a microbenchmark shows roughly 3x slowdown with zero lock contention).

Similarly, reference counting would start to bite very hard too. In general, you need a mutex around each refcnt bump. No wonder Python has a global interpreter lock instead.

So the Java approach is actually very sensible. You simply require high performance multithreaded garbage collection from your lower level runtime. This is actually quite a Hard Problem; much PhD-level research has gone into it. However, by now it's more or less been solved. Since it's at the very lowest level, a large investment gets amortized over everything else at higher levels.

There's a pattern. Do sophisticated and complex stuff at the low level (here, hairy and nasty might be better adjectives) so that the higher levels become simpler.

SEDA isn't just interesting because of raw performance, either. From the looks of it, you actually do get good decoupling between the various stages. There's also some thought about debuggability, often a terribly hard problem in asynchronous systems. Perhaps it will become realistic to write an asynchronous library and be able to compose it with other components.

Thanks to Don Maas for an enlightening email about OpenGL display lists. As it turns out, these are not directly editable, but you can do splicing-style edits by composing them hierarchically. Not directly relevant, I think, but interesting.


David McCusker is going asynchronous on us again!

There are two pieces of the related work that strike me as quite interesting. First, Matt Welsh's SEDA work, which is quite appealing. In his Hotos paper, he makes a strong case that operating systems should not always provide a transparent virtual machine. Indeed, I can think of lots of ways in which global coupling could lead to better overall performance. One example that comes readily to mind is coordination of caches. In multiple processes on the same machine, many megabytes may be allocated in an unused cache, while another highly active cache could benefit from the space.

Reading about the specific webserver application also gave more insight into why the lock contention here was causing such long waits. When the pool of Apache processes is exhausted, the socket doesn't accept new connections. These connections are retried using the usual TCP exponential backoff algorithms. So delays could be in the dozens of seconds, even after the queue is mitigated server-side. All that should be fixed now.

Second, it looks like some serious quantitative design has gone into BEEP. The section where they talk about duplicating some of TCP's flow control caught David's eye, and mine also.

I conclude that TCP sockets are delicately tuned for performance, and that any abstraction which is placed over them carries the risk of destroying this tuning. CORBA is a particularly bad offender. First, it makes it too hard to do asynchrony. Second, it hides extremely useful information about the state of the TCP connection, particularly when it is closed, and when the window is full. If these properties are important, then protocols composed from CORBA will inherently suck compared with those built directly on top of TCP/IP. Remoting other notions of interprocess communication carries the same risks. Yet, in the local case, much simpler notions such as CSP-style rendezvous may be just as effective.

A couple of people pointed me at Twisted, and also libio by Niels Provos. Thanks.

Asynchrony is fascinating, but difficult to master. All real implementations face difficult design tradeoffs. Systematic approaches like CSP are interesting, but I am firmly convinced that there is no universal answer.

Fitz replies

kai: I'm unaware of the node design in OpenGL display lists, but they sound interesting. The OpenGL folks seem quite performance oriented. Do you by chance have a good reference?

For normal horizontally placed text, the default will be 1/4 pixel horizontally, and a whole pixel in the vertical direction. These defaults are based on fairly careful empirical study. 1/4 pixel horizontal positioning is nearly indistinguishable in quality from ultraprecise positioning, while 1/2 pixel is not quite enough. Vertically, alignment with the baseline helps rendering quality. The only real downside is that line spacing may become slightly irregular. I doubt this will be much of a problem in practice, though.

For vertically oriented text, I'll probably just switch the axes, but I'm not sure, as I haven't done any kind of careful study.

Finally, for rotated text (not aligned with either axis), I'll turn off caching altogether, because the hit rate is likely to be low.

As for progressive rendering, it's not high on my list of interests. I don't expect that the difference in performance between low quality and high quality modes will be enough to justify the complexity. And, as you suggest, it becomes less interesting as systems get faster.

mslicker writes: raph, I have to ask, what problem are you trying to solve?

This is absolutely the right question. I haven't written it down explicitly, and for that I apologize. The top-level problem is rendering a 2D graphics images provided by the client. At first, the main client will be Ghostscript, so the images will be PostScript and PDF files, but I expect interactive applications will use the library as well.

Then, there are several subgoals. Performance is critical. Memory utilization is also very important in some applications (especially embedded printers, which have very limited hardware). Even on desktops, using less memory is usually a big win, because of memory hierarchy effects. Quality of rendering is critical, but doesn't directly affect the tree API.

Given these constraints, I'd also like the client API to be clean and simple, even when the implementation does sophisticated things under the hood. I want to do a extensive unit and regression testing, and I feel that Python is a good language for these, so Python bindings are also important.

So why am I arriving at a general tree infrastructure? For one, the imaging model includes PDF 1.4 transparency, which has a very strong tree structure. Optimized rendering requires analysis of the tree. The painter's algorithm won't work as well.

And the simplest tree implementation, one in-memory object per node, is far too expensive in memory utilization when the tree is very large, or the available memory is limited. Thus, it's important to store the tree in a concise format, competitive with Ghostscript's existing "command list" at the least. I can easily envision multiple tree implementations tuned for different tasks. The simple, in-memory one will be implemented first anyway, because of its simplicity. I am not a fan of gratuitous virtualization, but when you have two different implementations of the same conceptual interfact, it's often a good idea.

At this point, I'm exploring the design space. This exploration is informed by thinking about more general problems, lots of them from my DOM experience. When a general approach is good, I'll use it. However, when generality forces unpleasant tradeoffs, within Fitz it will be a casualty. I am spending time thinking about more general trees so I know which is which.

I haven't written up Fitz in any real detail yet. In some ways, it's unfair to be blogging it, because you see the raw thinking without needed context. When I get to start work on Fitz in earnest, I will do much writing, and it will all be very accessible. But I have to concentrate on several hard problems for the upcoming Ghostscript 8.0 release first.

Change notification in trees

This is another fairly dense technical piece. If you were hoping for diversion or scandal, sorry. It is a companion to my caching in trees piece from a few days ago.

For interactive displays, the Fitz display tree will follow a Model/View pattern. The model is the tree, and the view is a rendering module that renders the tree and displays it on the screen.

What happens when you change the tree? If you wanted the simplest possible implementation and didn't care about performance, you'd re-render the tree from scratch, and display that. But I do care about performance, and I am willing to sacrifice some of that simplicity for "minimal updating". At root, this is the realization that, if you have the rendering for the old tree in hand, then to get the rendering for the new tree, you may only need to do a small incremental computation.

Let's look at minimal update in more detail. A deep optimization running through much of Fitz is culling based on bounding boxes. Changes to an object are guaranteed not to affect rendered pixels outside the bounding box (note that effects like drop shadows don't fit neatly into this category; thus, they're not directly supported in the Fitz imaging model). So, if you change an object, a reasonable approximation to the update area is the union of the old and new bounding boxes. If you delete or insert an object, just the one bounding box.

However, for some objects you may be able to do better. For example, when dragging one corner of a semitransparent rectangle, the minimal repaint area may be L-shaped. In this case, repainting the entire rectangle would be wasteful.

One more performance issue: if the tree is very large, then the cost of having one "event listener" per node could be too much. Keep in mind that the only justification for all this is performance. If you add complexity and make other aspects of performance (like memory usage) worse, then it's probably not a win.

So the broad outlines of a change notification system begin to suggest themselves. If the tree is small, then having an event listener on each node is reasonable. Changes to leaf nodes are distinguished from changes to tree structure. The former are dispatched first to a change handler for the specific leaf type. The fallback is the bounding box approach. Changes to tree structure also invoke the bounding box method.

However, as the tree grows larger, you only attach event listeners to some nodes. It's a type of cache. For nodes that have no event listeners attached to descendants, you expect a notification any time a descendant changes. Otherwise, you only care about changes to the node itself, because you know other listeners will handle the rest.

So now we have the concept of event propagation. For at least some kinds of listeners, you want to bubble the event up the tree until you hit a listener. In the extreme case, you can have just one listener at the root of the tree.

Now I face a difficult design decision. If I want to have multiple Views of the same Model, event propagation gets trickier. In particular, the propagation patterns for different Views can't interfere with each other. You can't simply stop bubbling an event upwards when you hit a listener, because maybe another View has their listener further up. Bubbling all the way to the top isn't very satisfying either, because that will tickle the listener at the root even if a deeper listener has already handled the change.

There are a bunch of other hacks that don't work so well either. You could, for example, check to see whether a deeper handler has already been called, and ignore the notification if so. But you get into trouble if you want to aggregate several changes together into one notification. Keep in mind that if you change most of the nodes in the tree, then re-rendering from scratch is almost certainly better than doing an incremental computation for each node changed.

If you did have a good solution for the one-View case, then you could extend that to the multi-View case by having an "event group" for each listener. When a listener catches an event, it sets a bit so that other listeners with the same event group farther up the tree don't get called. But the complexity is unsatisfying.

That said, let me present one reasonably simple approach supporting multiple Views. Here, the tree implementation does just bubble events all the way to the root. It's up to the client (the View) to apply the event propagation logic. At an internal node, if you have listeners for all children, and if the node itself is unchanged (meaning no insertions or deletions at that node), then you can ignore the notification.

However, there is one unappealing aspect of this approach. If you want to add more listeners (say, after they've been thrown out of the cache), then it only makes sense to add listeners to all children of a node. If you only add it to some, you can't reliably ignore the notification, so you might as well not add it to any. I expect that the fanout may be quite large, so this could be a problem.

So far, I haven't been able to come up with what feels like the Right Answer, only different tradeoffs. Right now, restricting the thing to one View seems like the best tradeoff. For Fitz, I think this will very much be the common case. But for other, more general tree applications (particularly those involving networking), it's probably not good enough. It would be nice to have a general solution that was simple enough to be practical for Fitz, but I guess I'm just too incompetent to come up with one. Ah well, such is life.

15 Jun 2002 (updated 15 Jun 2002 at 07:18 UTC) »

I finally got around to fixing the locks so that lock contention won't cause huge delays in reading pages. Writing (updating diaries and the like) can still be affected, but this is less urgent to fix.

LotR wrote:

We need a diary-writing trust metric!

Okay. I think you're right. I might well be motivated to write a generic metadata engine and apply it to the specific application of "how interesting is diary X?". Here's roughly how it will work.

When you're logged in, you'll get a chance to enter a one-to-ten score for another user's diary page. I might put this right under the "Certify <user> as:" selection at the bottom of individual person pages, but I'm also inclined to make it more accessible, for example allowing bulk updates on a customized version of the recentlog page.

This goes into the database as generalized assertions. At first, the only assertions that will be allowed are of the form "<user>'s diary is 7 on a one-to-ten scale", but the engine doesn't care what kind of assertions are present. "Roquefort is a particularly fine cheese" is also plausible. The reason for limiting the assertion space is to avoid scaling problems, which can become quite severe as the number of assertions scales up.

Then, roughly nightly, there will be a process that computes metadata scores, using the method I presented in my HOWTO. This will compute a confidence value for each user in the trust graph and each assertion. You can see where the scaling problems come from. I am sure there exist techniques for storing this data more sparsely, but I'm not interested in doing that research now.

Finally, the recentlog display will be annotated with the metadata scores. I'll probably also put in an threshold option.


I am trying my best to be patient with bytesplit. I realize he is a human being like all of us, but for whatever reason driven by demons causing him to antagonize people here. I sincerely wish that he is able to tame these demons, and interact positively with Advogato.

At the same time, I realize this is unlikely. As such, bytesplit is providing an opportunity to look at the trust metrics and the dynamics of this site more critically. The current trust metric certainly has limitations, and is definitely not a magic bullet for making this site an interesting read and a comfortable place. That's up to us.

What the trust metric does do is automatically compute membership in the community based on peer certifications. While I personally feel that bytesplit's contributions to free software are marginal at best, ten people here feel that his level of interest is high enough to rate an Apprentice cert. And, he does show interest in learning more, and his on-topic writings are perfectly reasonable for an aspiring apprentice. Given that, I don't think the trust metric should reject bytesplit's ranking.

All this is good motivation to implement the generalized metadata as proposed above. Unlike the existing trust metric, this metadata system would directly address quality and relevance of writing. I'll be very interested to see how it goes.

Cert inflation

We definitely have cert inflation here. Part of that is because the trust metric is generous, part of it is that people here are generally doing an inaccurate job of evaluating peer cert levels. This is useful information for people trying to design metadata systems: a significant fraction of the information input will simply be wrong.

I could certainly make the trust metric less generous. The easiest way to do this would be to have negative certifications as well as positive ones. But I'm not convinced that cert inflation is the most important problem in the world to solve.


David McCusker called again, and we had another nice chat, this time focussing on writing programs in asynchronous style. I think it's a hard problem. I think it's even worse for library writers, because it may not be realistic to assume that most users of your library will understand asynchronous programming very well. I told David of X as a cautionary tale. X actually has very sophisticated logic for dealing with asynchrony properly. For newcomers to X, this all seems very intimidating and complex (asynchronous grabs are a good case in point). In fact, I think there is widespread failure in levels above X to deal with race conditions and the like correctly.

Every time you do something over the network, it's asynchronous whether you like it or not. Yet, event-driven programs seem a lot more complex than their simple, synchronous cousins. David would like to recapture that simplicity in asynchronous programs. A lot of other people have tried things in this direction, without very happy results so far. I feel that CORBA is a cautionary tale in this regard. It pretends that method calls are really local, when in reality they're decomposed into two asynchronous events, and of course all kinds of things can happen in the meantime.

I haven't seen any of the details of Mithril yet, but I'm fairly skeptical that it will make asynchronous programming accessible to less-skilled programmers. On the other hand, I am perfectly willing to believe that it will be a good tool for expressing asynchrony concisely, and thus useful for people who know what they're doing.

One detail we touched on but didn't really go into was whether the fundamental message sending operation on channels should be synchronous (as in CSP) or asynchronous. In CSP, if you send a message on a channel, but there is nobody ready to readon the channel, you block. The other way to do it is to append the message to a queue. Both are reasonable primitives, in that it's quite straightforward to simulate one in terms of the other. So which do you choose?

I mentioned that the CSP way might be easier to reason about. There's another issue that came to mind after our call: the queue required for the fully asynchronous case requires unbounded resources in the general case. Obviously, in tiny embedded systems, this can be a real problem. On desktops, it's less clear. But if a system is operating under very high load, you probably want to worry about whether the queues will keep growing. Of course you can always implement flow control on top of async messages, but that's not really the point. On CSP, the default is not to grow unboundedly.

mwh: I haven't been following Stackless Python closely, but I am aware of it. Looking briefly at the site, I see they are now implementing a concurrency and channel approach directly inspired by Limbo and CSP. That could be very cool.

Farmers Market

We went to the Farmers Market in Benicia, as we do most Thursdays. This was the first week of corn season, so we got some. It was amazingly good. I like farmers markets; they're a good way to make a connection to the people who actually grow your food.


Antony Courtney called me up today, and we had a nice chat. He is interested in working on Fitz, largely so he can use it for his thesis work on functional reactive user interfaces.

His work is done in Haskell, so Haskell bindings for Fitz would be part of the deal. I think these two things could work well together. Fitz has a somewhat functional design. The rendering of a tree is a function of the tree. Caching is strongly related to memoization in functional languages, and minimal updating is a form of incremental computation.

Language bindings are to be an integral part of Fitz. We'll use the Python bindings extensively for testing. It's also, I think, the best language for experimenting. But I don't mean to exclude Perl or Ruby programmers, either. I just wish it was easier to write high quality cross-language wrappers.

What I'd really like is something like Pyrex, only with the ability to generate code for many different runtimes. Pyrex itself seems to be coming along quite nicely. As of 0.3, you can define classes in C, which seems quite useful. At some point, I might be motivated to try an experiment of making Pyrex generate, say, a Ruby extension.


David McCusker writes briefly of picoservers. These sound like fun. Basically, the problem boils down to: how do you best express asynchronous behavior in a programming language? Threads are one way, but they have lots of pitfalls, including performance and scaling issues. Event-based programming is more lightweight, but has a reputation for being very tedious and low-level. Also, event-based programming by itself can't take advantage of multiple processors.

I haven't looked at SEDA carefully yet (it's the thesis work of Matt Welsh), but it looks interesting.

People have been thinking about asynchrony for a long time. One of the more elegant approaches is Hoare's Communicating Sequential Processes (CSP). I'm a bit surprised that CSP hasn't gone further. It seems like a nice higher level abstraction compared with event-driven programming, but without all the nasty problems with race conditions and lock contention that threads bring you.

There is actually a CSP implementation in C. It seems like more typing than languages that have CSP baked-in. Occam is the most famous of these languages, but I think Limbo might be a more useful incarnation. Occam tends to be fairly static, but Limbo lets you create "threads" and channels very dynamically.

Python's generators are already sorta like coroutines. David Mertz talks about using them to implement what he calls weightless threads.

212 older entries...

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!