Linux or Mac
David McCusker wants a development machine for home. I think he would be
happy with either a Linux box or a Mac running OS X. In the former case,
he'll spend some time dealing with Linux weirdness; in the latter, he'll
have to spend some time dealing with OS X weirdness.
If the goal is raw price/performance, Wintel is a clear win. A box built
from the cheapest parts on Pricewatch is somewhere around 50% of the price
of a comparable-spec Mac. But Macs are arguably more elegant than Wintel
boxes. And, if stuff like having digital cameras Just Work when you plug
them in is important, there is no contest.
When I ordered spectre, horsepower was in fact a driving factor. I need to
do testing as quickly as possible. Otherwise, I would have been tempted to
get a Mac.
David also expresses the wish to do cross-platform GUI work.
Unfortunately, there's no good story here. Drawing is only a small part of
the overall problem, so using something like OpenGL doesn't buy you much.
I do expect the situation to improve over time. One of the existing
toolkits is probably closest, and has OS X support in development.
What is the optimum grain size?
One of the most important factors affecting the performance of a tree
access protocol is grain size. Here I'll present a simplistic analysis. A
lot of people never bother to do any.
How do you slice up a tree into grains? I propose to serialize the tree,
chop the serialization into fixed-size blocks (leaf grains), and use a
btree-like structure to tie this together. The cool new idea is to count
parentheses in the btree nodes. This lets you fly around the tree without
having to dig into the leaves.
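To make the parenthesis-counting idea concrete, here's a minimal Python sketch. The block size, names, and summary layout are my own illustration, not anyone's actual format: each block of the serialization is summarized by its net depth change and minimum prefix depth, and because those summaries combine associatively, a btree of summaries can answer structural questions without reading leaf data.

```python
# Sketch: summarize fixed-size blocks of a serialized tree so that
# structural navigation never needs to read the leaf contents.
# Block size and field names here are illustrative assumptions.

def summarize(block: str):
    """Per-block summary: (net depth change, minimum prefix depth)."""
    depth = min_depth = 0
    for ch in block:
        if ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
            min_depth = min(min_depth, depth)
    return depth, min_depth

def combine(a, b):
    """Associative combine: summary of block a followed by block b."""
    return a[0] + b[0], min(a[1], a[0] + b[1])

serialization = "((a)((b)(c)))((d))"
BLOCK = 4  # illustrative block size
blocks = [serialization[i:i+BLOCK] for i in range(0, len(serialization), BLOCK)]
summaries = [summarize(b) for b in blocks]

# Depth at each block boundary, computed from summaries alone --
# this is what lets you "fly around" without touching the leaves.
depth = 0
for s in summaries:
    depth += s[0]
print(depth)  # a well-formed serialization ends back at depth 0
```

Interior btree nodes would store the combined summaries of their children, so finding, say, the end of a subtree is a log-time descent through summaries rather than a scan of the leaves.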
There are other ways to do it, of course. You can use one grain per node.
You can use one grain for the whole tree, perfectly reasonable if the tree
is smallish. You can also try to aggregate nodes into grains. On disk, the
advantage of fixed-size blocks is good space utilization. On a network,
you don't care about utilization as much, but you still might care about a
large variance in grain size.
For simplicity, let's assume that the tree is read-only. We'll analyze two
usage patterns. The first is simply to traverse the whole tree. The second
is to navigate to random nodes in sequence.
Traversing the tree means scanning the serialization linearly. This is
cool. You only touch each block once. Assume that the time to fetch a
block is a latency value, plus the size of the block divided by bandwidth.
Total time for traversing the tree is (tree size in bytes) / (bandwidth in
bytes per second) + (tree size in bytes) * latency / (block size in
bytes). It is easy to see that large blocks win.
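Plugging illustrative numbers into that formula (the figures below are just for demonstration; 10ms and 50M/s are the disk-like values used later in this analysis):

```python
def traverse_time(tree_bytes, block_bytes, latency_s, bw_bytes_per_s):
    """Total time to scan the whole serialization, one fetch per block."""
    n_blocks = tree_bytes / block_bytes
    return tree_bytes / bw_bytes_per_s + n_blocks * latency_s

# Disk-like numbers: 10 ms latency, 50M/s bandwidth, 100M tree.
MB = 1 << 20
for block in (4 * 1024, 64 * 1024, 1024 * 1024):
    print(block, traverse_time(100 * MB, block, 0.010, 50 * MB))
    # larger blocks win: 258 s -> 18 s -> 3 s
```

Almost all of the 4K-block cost is the 25,600 latencies, not the 2 seconds of actual data transfer.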
The random node case is different. When you fetch a block, only a small
part of it is relevant. The rest is waste. The time to fetch a node is
latency + (block size in bytes) / bandwidth. This means small blocks are
more efficient.
What is the optimum grain size, then, for a mixture of both patterns? When
(block size) = latency * bandwidth, both access patterns are exactly a
factor of two slower than their limiting best-case performance (infinitely
large blocks in the case of a traversal, infinitely small blocks in the
case of random nodes). Thus, the optimum will be on the order of
latency * bandwidth.
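A quick check of that factor-of-two claim, using the same disk-like figures as elsewhere in this analysis:

```python
latency, bw = 0.010, 50e6       # 10 ms, 50M/s: roughly disk-like
block = latency * bw            # the claimed sweet spot: 500K

# Per-byte traversal cost, vs. its best case (infinitely large blocks).
traverse = 1 / bw + latency / block
print(round(traverse / (1 / bw), 6))        # 2.0

# Per-node random-access cost, vs. its best case (infinitely small blocks).
random_access = latency + block / bw
print(round(random_access / latency, 6))    # 2.0
```

At block = latency * bandwidth the two terms in each cost are equal, which is exactly why both patterns land at twice their respective ideals.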
What is latency * bandwidth for real devices? Here's a quick table. Don't
worry about small inaccuracies. We're trying to get the order of magnitude
right. This is just disk and network. Memory is important too, but the
analysis is considerably different, so I won't do that tonight.
modern disk: latency 10ms, max bw 50M/s: 500K
wireless net 802.11b: latency 2.5ms, max bw 0.5M/s: 1.25K
modem: latency 50ms, max bw 4K/s: 0.2K
100Mb lan: latency 0.3ms, max bw 10M/s: 3K
dsl down from a nearby server: 20ms, 100K/s: 2K
dsl up to a nearby server: 20ms, 10K/s: 0.2K
dsl down international: 200ms, 100K/s: 20K
dsl up international: 200ms, 10K/s: 2K
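The products in the table are easy to reproduce; here's a throwaway script with the figures copied straight from it:

```python
# (name, latency in seconds, bandwidth in bytes/s), from the table above
devices = [
    ("modern disk",      0.010,  50_000_000),
    ("802.11b wireless", 0.0025,    500_000),
    ("modem",            0.050,       4_000),
    ("100Mb lan",        0.0003, 10_000_000),
    ("dsl down, nearby", 0.020,     100_000),
    ("dsl up, nearby",   0.020,      10_000),
    ("dsl down, intl",   0.200,     100_000),
    ("dsl up, intl",     0.200,      10_000),
]
for name, lat, bw in devices:
    print(f"{name:18s} {lat * bw / 1000:7.2f}K")
```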
In Unix, the traditional block size is 4K. It's interesting that this
value is not far from the mark for networks, even over a wide range of
performance. So the traditional block size is reasonable there. Disk is
the outlier. What's more, disk latency * bandwidth scales roughly as the
square root of areal density, and that's been growing like mad. It used to
be 4K, but that was a long time ago.
But 500K is two whole orders of magnitude bigger than 4K. If we believe
the analysis, then access to a tree on disk will spend 99% of its time
seeking, and 1% accessing useful data. That would be bad.
The story is a bit more complex, though. Real disk-based systems spend a
huge amount of effort trying to increase locality, i.e. clustering items
likely to be accessed together. If this is successful, then the effective
grain size goes up.
Prefetching is a particularly effective technique. Modern OS kernels
implement prefetching, and so do drives. In fact, when you read a 4K block
from a drive, it will usually spend on the order of 5ms seeking to the
right track, then 5ms waiting for the disk to rotate to the right sector.
Given a typical raw transfer rate of 50M/s, that means 250K or so of data
will fly past the read head. In a modern disk, all that goes into a cache.
Then, when the kernel requests more blocks from that range, it gets them
immediately.
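A back-of-envelope check of those numbers, using the same 5ms and 50M/s figures quoted just above:

```python
seek_s, rot_s = 0.005, 0.005   # ~5 ms seek + ~5 ms rotational wait
rate = 50_000_000              # ~50M/s raw transfer rate
block = 4096

# The 4K transfer itself is a tiny fraction of the ~10 ms access.
transfer_s = block / rate
print(f"transfer: {transfer_s * 1e3:.3f} ms of a {(seek_s + rot_s) * 1e3:.0f} ms access")

# Data passing under the head during one rotational wait: ~250K.
print(f"{rot_s * rate / 1000:.0f}K flies past the head")
```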
So, to do a btree efficiently, you have (at least) two choices. You could
specify a really large block size, and not worry about the placement of
blocks on the disk. Another method is to use a small block size, but try
hard to cluster nearby blocks. This demands more cleverness when
allocating blocks within the file. It's also well known from filesystems
that it's hard to avoid fragmentation when the utilization is very high.
When there is ample free space, there are more candidates to choose from
when trying to optimize locality.
Of course, in the read-only case, you can allocate the blocks in traversal
order. In this case, 4K blocks are again fine. The problem is avoiding
fragmentation as you start updating the tree. David McCusker talks about
this a bit in his database design notes. There, he suggests 2K blocks, but
allocation logic that works in grains of 16K. That's still not good enough
(by more than an order of magnitude) if the grains are randomly scattered
in the file. Maybe he's doing something else to try to cluster grains;
it's not clear to me. But I do believe it is a tricky problem.
The network version is in many ways simpler (though in other important
ways it is harder). You don't have to worry about locality, as there
really is no such concept: the network performs the same no matter which
block you request. You also don't have to worry as much about utilization,
because it's possible to avoid sending unused byte ranges. Blocks can be
variable in size, unlike the fixed blocks of disks.
As I warned, this analysis is oversimplified. Yet, I think it is useful
for understanding real performance problems. It gives a good insight into
the appeal of flat files for mailboxes. At 50M/s, you can grep an entire
250M mailbox in five seconds. A dumb database, by contrast, may use a hash
table to allocate individual messages effectively at random within the
file. Thus, if you try to search through the messages in order, each read
will take 15ms or so. Five seconds will give you enough time to search
only about 300 messages, consistent with the two orders of magnitude
discrepancy between the block size of 4K and the optimum grain size of
about 500K for disks.
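The mailbox arithmetic, worked out with the same illustrative figures:

```python
MB = 1_000_000
mailbox = 250 * MB
bw = 50 * MB            # sequential scan rate
seek = 0.015            # ~15 ms per effectively-random read

grep_time = mailbox / bw             # sequential scan of the flat file
msgs_in_same_time = grep_time / seek # random reads in the same budget
print(grep_time, int(msgs_in_same_time))   # 5.0 333
```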
Thus, sophisticated file formats run the danger of creating serious
performance problems. But I consider that a quantitative problem, one that
yields well to careful analysis and design. To me, those are the fun ones.