Fitz
Tor and I have been working a bit more on the Fitz design. We have to
nail down a number of open decisions about coding style and the like.
We don't want to cut a lot of code, and then find we need to redo big
chunks. Our current draft is on the Wiki under CodingStyle.
Many of the decisions are somewhat arbitrary, but even so aesthetics
are important. We want to be able to look at the code with pride.
One of the most difficult issues is how to split up the code into
modules. What level of granularity is best? The Ghostscript codebase
tends to be fairly monolithic, and a large part of our goal is to
refactor it into independent modules.
Clearly, a full-featured PDF app will use all the modules, but it's
also easy to imagine more lightweight clients that just use some of
them. Perhaps an instructive example is a PDF validity checking tool
(known as "preflight" in the graphic arts world). Such a tool has to
parse PDF files and process the PDF streams, but need not actually
construct a display tree for rendering.
One obvious approach is to make a giant hairball that contains
everything. Clients just link in the library, and use what they need.
There's little added complexity in the build and packaging processes,
and there's no chance that the individual pieces will get out of sync
with each other. However, it's not very elegant.
Another approach is to split everything into the smallest sensible
modules. An immediate problem is that many of the modules will want to
share infrastructure, particularly having to do with the runtime. For
example, one of the things we're hammering out is a "dynamic object"
protocol incorporating strings, lists, dicts (hashtables), names
(atoms), and numbers. These kinds of objects show up all the time in
PostScript and PDF documents, and are a handy way to pass parameters
around. If we parse such an object out of a PDF file, and want to pass
it as a parameter to the filter library, it would be really nice for
the type to match.
So, in the "many small libs" scenario, I think there would be one base
library ("magma") containing shared runtime infrastructure: at first,
just memory allocation, exception handling, and dynamic objects, but
possibly also loading of dynamic plug-ins and maybe threading
support. All the other modules will allocate their memory and throw
exceptions in a magma context, and pass around magma dynamic objects
as needed.
The filter library would be the first such other module. It's small,
very well defined, and will probably be quite stable once it's done.
The Fitz tree and rendering engine would probably be the biggest, and
see intense development over a period of time. Other obvious modules
include a low-level (syntactic) PDF parser, and a higher level module
that traverses PDF pages and builds Fitz display trees. We'd also need
a module for font discovery (unfortunately, quite platform specific),
and, eventually, one for text layout as well.
The problem is that support for packaging and versioning of libraries
is generally pretty painful. There are lots of opportunities for these
libraries to get out of sync, and many more testing permutations,
especially if people are trying to use different versions of the same
libs at the same time. Also, I worry that the fine-grained
factorization might be confusing to users ('how come, in order to
display a JPEG image, I use mg_ functions to create the JPEG parameter
dictionary, pass that into an sr_ function to create the JPEG decode
filter, and plumb the result of that into an fz_ function to draw
it?'). There are also some fairly difficult decisions about where
certain logic should live. A good example is PDF functions: there's a
good argument for putting them in Fitz, but it's easy to imagine them
in the PDF semantics module as well.
A related question is whether language bindings should be shipped as
part of a library, or as a separate module. My experience has been
that separate language bindings are often very painful to use, because
of subtle version mismatches. The bindings tend to lag a bit, and
they're much pickier about which exact lib version they're linked
against than your average app.
There are other intermediate stages between the two extremes, but it's
not yet clear to me whether any of them is actually better. One such
possibility is to have a single common namespace, but a bunch of
smaller lib files so you only link the pieces you need. In the other
direction, we could keep the source highly modular, with separated
namespaces as above, but mash them all together into a single library
as part of the build process (in fact, we'd probably want to do this
anyway for Windows targets).
I know a lot of other projects struggle with the same issues. For
example, 'ldd nautilus' spits out no fewer than 57 libraries on my
system. Of these, glib corresponds fairly closely to the magma layer
above, and is used by many (but not all) of the other libs. Perhaps
coincidentally, many users find that building and installing Gnome
apps is difficult.
At the other extreme, I've noticed that media player tarballs tend to
include codecs and suchlike in the source distributions, often tweaked
and customized. Mplayer-0.90pre8 has 11 subdirectories with 'lib' in
the name. The advantage is that building mplayer is fairly easy, and
that (barring a goof-up by the producer of the tarball) versions of
the libraries always match the expectation of the clients. The
disadvantage, of course, is that mplayer's libmpeg2 is not shared with
transcode's or LiViD's. Also, it's harder to do something like install
a new codec on your system that will just work with all the players.
Perhaps, over time, it will become less painful to distribute code as
a set of interdependent libraries. In the meantime, we have to strike
the right balance between keeping our codebases and development
processes modular, and keeping life pleasant for users. I'm not sure
of the best way to do it, so I'd appreciate hearing the experiences of
others.