Tor and I have been working a bit more on the Fitz design. We have to nail down a number of open decisions about coding style and the like. We don't want to cut a lot of code, and then find we need to redo big chunks. Our current draft is on the Wiki under CodingStyle. Many of the decisions are somewhat arbitrary, but even so aesthetics are important. We want to be able to look at the code with pride.
One of the most difficult issues is how to split up the code into modules. What level of granularity is best? The Ghostscript codebase tends to be fairly monolithic, and a large part of our goal is to refactor it into independent modules.
Clearly, a full featured PDF app will use all the modules, but it's also easy to imagine more lightweight clients that just use some of them. Perhaps an instructive example is a PDF validity checking tool (known as "preflight" in the graphic arts world). Such a tool has to parse PDF files and process the PDF streams, but need not actually construct a display tree for rendering.
One obvious approach is to make a giant hairball that contains everything. Clients just link in the library, and use what they need. There's little added complexity in the build and packaging processes, and there's no chance that the individual pieces will get out of sync with each other. However, it's not very elegant.
Another approach is to split everything into the smallest sensible modules. An immediate problem is that many of the modules will want to share infrastructure, particularly having to do with the runtime. For example, one of the things we're hammering out is a "dynamic object" protocol incorporating strings, lists, dicts (hashtables), names (atoms), and numbers. These kinds of objects show up all the time in PostScript and PDF documents, and are a handy way to pass parameters around. If we parse such an object out of a PDF file, and want to pass it as a parameter to the filter library, it would be really nice for the type to match.
So, in the "many small libs" scenario, I think there would be one base library ("magma") containing shared runtime infrastructure: at first, just memory allocation, exception handling, and dynamic objects, but possibly also loading of dynamic plug-ins and maybe threading support. All the other modules will allocate their memory and throw exceptions in a magma context, and pass around magma dynamic objects as needed.
The filter library would be the first such other module. It's small, very well defined, and will probably be quite stable once it's done. The Fitz tree and rendering engine would probably be the biggest, and see intense development over a period of time. Other obvious modules include a low-level (syntactic) PDF parser, and a higher level module that traverses PDF pages and builds Fitz display trees. We'd also need a module for font discovery (unfortunately, quite platform specific), and, eventually, one for text layout as well.
The problem is that support for packaging and versioning of libraries is generally pretty painful. There are lots of opportunities for these libraries to get out of sync, and many more testing permutations, especially if people are trying to use different versions of the same libs at the same time. Also, I worry that the fine-grained factorization might be confusing to users ('how come, in order to display a JPEG image, i use mg_ functions to create the JPEG parameter dictionary, pass that into an sr_ function to create the JPEG decode filter, and plumb the result of that into an fz_ function to draw it?') There are also some fairly difficult decisions about where certain logic should live. A good example is PDF functions. There's a good argument to put them in Fitz, but it's easy to imagine it in the PDF semantics module as well.
A related question is whether language bindings should be shipped as part of a library, or as a separate module. My experience has been that separate language bindings are often very painful to use, because of subtle version mismatches. The bindings tend to lag a bit, and they're much pickier about which exact lib version they're linked against than your average app.
There are other intermediate stages between the two extremes, but it's not yet clear to me whether any are clearly better. One such possibility is to have a single common namespace, but a bunch of smaller lib files so you only link the pieces you need. In the other direction, we could keep the source highly modular, with separated namespaces as above, but mash them all together into a single library as part of the build process (in fact, we'd probably want to do this anyway for Windows targets).
I know a lot of other projects struggle with the same issues. For example, 'ldd nautilus' spits out no less than 57 libraries on my system. Of these, glib corresponds fairly closely to the magma layer above, and is used by many (but not all) of the other libs. Perhaps coincidentally, many users find that building and installing Gnome apps is difficult.
At the other extreme, I've noticed that media player tarballs tend to include codecs and suchlike in the source distributions, often tweaked and customized. Mplayer-0.90pre8 has 11 subdirectories with 'lib' in the name. The advantage is that building mplayer is fairly easy, and that (barring a goof-up by the producer of the tarball) versions of the libraries always match the expectation of the clients. The disadvantage, of course, is that mplayer's libmpeg2 is not shared with transcode's or LiViD's. Also, it's harder to do something like install a new codec on your system that will just work with all the players.
Perhaps, over time, it will become less painful to distribute code as a set of interdependent libraries. In the meantime, we have to strike the right balance between keeping our codebases and development processes modular, and keeping life pleasant for users. I'm not sure the best way to do it, so I'd appreciate hearing the experiences of others.