Older blog entries for rlk (starting at number 13)

18 Sep 2002 (updated 18 Sep 2002 at 00:29 UTC)
Inside the tornado!

Apple has picked up on CUPS and Gimp-print in a big way. Mac OS X 10.2 is using CUPS as the core of its printing system, and Gimp-print is providing a lot of the drivers.

They aren't actually bundling Gimp-print with OS X, but when Phil Schiller (top marketing guy) does a keynote at Seybold, and mentions Gimp-print for more than a few seconds, they're not exactly ignoring it. It's being made available for download on all of the OS X sites, and on VersionTracker.com it's getting (with very few exceptions, mostly related to the fact that until extremely recently Ghostscript wasn't available to handle applications generating PostScript output) rave reviews. One person even said that Epson pointed him at Gimp-print for his 2200, because they don't know when they'll have their own driver.

It's absolutely astounding how all of this happens. I certainly never imagined it would go mainstream to this degree; I expected uptake by Linux distributions, but since it's a piece of infrastructure I expected it to remain largely invisible. Reality is very different.

My sister-in-law has told me several times that she thinks I'm crazy for not having patented it and made my fortune selling it. Aside from the fact that it's rather difficult to patent a printer driver, and that I'm a free software fan in general, it's interesting to note that there are a few proprietary driver packages for Linux/UNIX: Xwtools and PrintPro. The developer of Xwtools is actually making pieces of it free/GPL (in particular the excellent Epson Stylus maintenance utility, which is far nicer than our own escputil), and PrintPro has steady, but not spectacular, sales in the corporate arena. However, neither of these packages has really taken off. I'm convinced that the free part is a direct cause of the developer interest, which is what's needed to create end user interest.

It's interesting to note how many users are really unhappy with printer vendors. From my own discussions, I believe that the problem is with the OS vendors, who historically keep changing their driver architecture, and with the printer vendors, who don't spend much on sustaining for their older models. I can't entirely blame the printer vendors; sustaining is very expensive and unpleasant. However, it's very apparent that at least some printer vendors essentially write new drivers for every printer model; the Windows GUI for the Stylus Color 800 is very different (and clearly much older) from that of the C80. A data-driven approach would help here.

This whole thing's actually really wild, and very exciting.

The functionality/architecture tradeoff

One thing that's disappointed me is that we haven't really looked all that hard at our internal architecture over the past few years. We've certainly cleaned up a lot of interfaces, but we've never moved toward the much more data-driven architecture I envisioned. This is as much my fault as anyone's; I haven't had a lot of real moments of inspiration on this front.

Most of my effort has gone into nibbling around the edges: tweaking the Epson family metadata schema to support newer printers, improving color fidelity in minor ways, playing around with dithering, and the like. There have been a lot of other sub-projects, such as a user's manual, that also aren't architectural in nature; most of the architectural work has gone toward improving the build system for both the code and documentation.

I'm becoming more and more convinced that this is actually a reason why Gimp-print has succeeded. Most of the work has gone into things with visible end-user effect: documenting how to actually use it, supporting more printers, and improving quality are things people notice; low-level architecture isn't. Also, the base is stable; there have been zero changes to the API (as determined by commits to the gimp-print.h header file) in the 4.2 series, and exactly one minor addition (which may have important ramifications for color management on photo printers later) thus far in 4.3. Is the API perfect? No; there are certainly things I'd like to change. But it's good enough, and probably a lot better than existing printing facilities can really use.

The 4.2 releases really have been stable -- 4.2.1 came out fully five months after 4.2.0; it did contain a number of bug fixes, but they were mostly corner cases. It also contained some new functionality: the IJS driver. 4.2.2 was almost five months after 4.2.1. It contained some more bug fixes (some rather important ones), but none were regressions. Most of its new functionality was support for new Epson printers. Of course, there have been prereleases and release candidates, but I'm overjoyed how stable we've managed to keep 4.2. 4.0 had five releases in the first month -- fortunately, 4.0.4 was solid -- but it's evident that we've matured as a project.

Color management

I discovered not long ago that the API can actually support color management, thanks to the 16-bit CMYK input method (the 8-bit CMYK and RGB inputs could too, but it wouldn't be as effective). We've done some experimenting, and discussion on the development list has heated up recently, so somebody (maybe even I) will come up with a prototype. There has already been one prototype (it isn't very practical in its current form for various reasons) that has demonstrated really dramatic results for certain colors that Gimp-print currently has difficulty with.

Complete control over all ink channels!

So I finally put code into Gimp-print to provide a new input mode that gives full control over all physical channels. This is like CMYK, only more so (if you have a 6 or 7 color printer). Somebody with a spectrophotometer can make really good profiles and use all six colors optimally.

I don't have time to write more about this right now, but if anyone wants to play with it, it's on the "generic_color_branch" in our CVS.

Gimp-Print 4.2.1 is out

This has been a very long update cycle; it's been about 5 months since 4.2.0! I guess that that means that 4.2.0 is pretty good. We've added an IJS driver and an OS X port; other than that, it's basically just bug fixes.

It's not going to be that easy to maintain 4.2, with some of the new printers coming out (Epson's releasing a 7-color printer; the 4.2 code has no really clean way to support a light black ink). Besides which, 4.3 has Even Tone screening. Mark Tomlinson has been hacking on it. He's finally cleaned up the bulk of the waterfall problem, and the quality is really, really impressive. The API actually hasn't changed yet, so we might actually want to consider using the mainline as the base for 4.2.2. Definitely something to chew on...

EvenTone screening

Seriously good stuff. Mark has been doing a lot of work tuning it. The combinations I've tried have all been significantly smoother than adaptive hybrid; this is actually most noticeable at high resolutions (I'd have expected the opposite).

The one serious problem we have left to solve is "waterfalling". This is a common problem with error diffusion algorithms in general; it takes a while to build up sufficient accumulated error to print anything at all. The result is that there's a strip or small region where there should be a low density of ink that has none at all. I tried eliminating it last night by seeding the error with pseudo-random values to try to perturb some points over the threshold. It worked, but it ruined the smoothness of the texture.

The next thought I have is to perturb the threshold value in a similar pseudo-random fashion (specifically, using a carefully constructed dither matrix). The amount of the perturbation can vary, depending upon the density at the point in question. Waterfalling appears to only be an issue when the ink coverage is less than about 1%; perhaps we can find a way to work with that.
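To make the waterfall effect concrete, here is a minimal Python sketch (not Gimp-print's code) of one-dimensional error diffusion over a row of constant, very low density. The function names are made up for illustration, and the pseudo-random offsets are only a stand-in for the carefully constructed dither matrix mentioned above; the perturbation amount is fixed here rather than varying with density.

```python
import random


def first_dot(density, thresholds):
    """Return the index of the first printed dot in a row of constant density.

    Plain 1-D error diffusion: until a dot is printed, the whole
    accumulated value is carried forward as error to the next pixel.
    """
    error = 0.0
    for x, threshold in enumerate(thresholds):
        value = density + error
        if value >= threshold:
            return x
        error = value  # nothing printed, so everything is carried forward
    return None


def fixed_thresholds(width):
    """Classic error diffusion: the same threshold at every pixel."""
    return [0.5] * width


def perturbed_thresholds(width, amount, rng):
    """Thresholds spread over a band of total width `amount` centred on 0.5.

    Assumption: random offsets stand in for a real dither matrix.
    """
    return [0.5 + amount * (rng.random() - 0.5) for _ in range(width)]
```

With a fixed threshold and a density of 1/256 (about 0.4% coverage), `first_dot(1 / 256, fixed_thresholds(512))` is 127 on every row, which is exactly the blank strip described above. With perturbed thresholds the first dot lands earlier and at a different position on each row, so the empty region no longer has a hard, straight edge.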

Thanks for the code drop, raph. I've forwarded it along.

Tandem screening is something we've been talking about, and taking occasional stabs at, for quite a while. We haven't managed to get it right, though, and this sounds like it might be just what we're looking for.

My thought at this point is that we should tune ETS, or better yet EBS, to be our high end, "quality at all costs" dither algorithm. Therefore, I would accept a significant speed penalty for its use, and we'll probably turn on all of the quality options. With contemporary commodity processors being in the 1+ GHz range, there shouldn't be any problem for desktop users.

For general high quality, Adaptive Hybrid already works very well, and there's no good reason to get rid of it. It has a lot of history behind it, and there's no substitute for experience. It's also reasonably fast; I think a 200-300 MHz CPU should be able to keep up with the printer speed.

Color management is something I've done some thinking about. The problem with all of the solutions I'm aware of is that they're CMYK-based. For four color, single dot size printers that's fine, but most high end printers are six color and/or variable drop size. If the drop sizes and relative densities aren't correctly tuned -- and that's what I think Raph was seeing with the inconsistency between 1440x720 and 720x720 -- no photometric correction method based on a relatively small sampling of points can correct the errors.

From an implementation standpoint, we can do at least as well as anyone else by utilizing the 16-bit raw CMYK input. This bypasses all internal color management, and simply requires the image source to supply 16 bits per channel. The logical extension from this is to have a printer input space (1 channel per printer output channel). Beyond this, we'd have to look at how to represent the dot sizes. Tuning dot sizes on variable drop printers is actually rather tricky.

6-color transitions are indeed a very important part of modern inkjet printing, and it's something I believe a lot of people have had difficulty with. It wasn't until quite late in the 4.1 development cycle that we figured out what needed to be done. Essentially, what we did was to screen the dark dots within the composite channel using the same algorithm -- and indeed the very same blue noise mask -- that we use to decide whether to print dots at all. This was not a trivial insight; for a long time I thought that using the same mask (without shifting it to decorrelate dot selection from ink selection) would result in something like a quadratic curve in the transition zone. It was when I figured out how to correct for this that it was possible to get a really smooth transition.

I initially thought that the ETS was improving the smoothness through use of less dark ink. We've been having a lot of internal discussion on this; there is some fear that we're not using enough dark ink. But when I looked at it more closely, that wasn't the case at all. I can adjust the transition zones even with adaptive hybrid, and this didn't help at all. The problem is, just as Raph hypothesized, that the blue noise screen isn't as smooth as ETS. In fact, with ETS the transition is a bit more visible due to the overall greater smoothness, but I still need a fairly strong loupe to see what's actually happening.

Anyway, it looks as though 4.2.1 will contain an IJS driver, so we can finally solve the problem of building Gimp-print into Ghostscript. I'm actually quite pleased it's taken this long for 4.2.1 to come out; it's an indication that we did a much better QA job on 4.2 than we did on 4.0. It took us five releases of 4.0 to get things stable; 4.0.4 was the first 4.0 release that didn't have any stoppers. There really haven't been any critical bugs in 4.2 that have necessitated a fast update, so we can go ahead working on development and back port things into 4.2 as desired. Of course, the downside of the relative maturity is that things are happening more slowly.

I don't know yet exactly what we'll do with EBS, but it sounds like it does a lot of the things that we've been thinking about for a while.

And things keep rolling along...

A new member of the gimp-print development team, Mark Tomlinson, is working on integrating raph's EvenTone dither algorithm into the project. It's still only on the mainline (development), but it's showing absolutely spectacular promise. I thought our Adaptive Hybrid dither algorithm was pretty good (and it certainly is very good, when compared against a lot of others that I've seen), but with the exception of a few specific problems it looks like we're going to have a new flagship dither algorithm fairly soon.

The big improvement is in smoothness. This is particularly noticeable in solid color midtones, but it's also noticeable in some line art, such as the 1 degree spaced radial lines in the CUPS test page. With adaptive hybrid, the lines look somewhat rough; with EvenTone, the lines are absolutely smooth to within the limits of the printer's resolution. The results at 1440x1440 on my Epson Stylus C80 are astounding -- at 1440x720, the 720 DPI vertical resolution is perceptible as very fine stairstepping in the almost-horizontal lines; the 1440 DPI resolution is not. Shame on Epson for underselling the true capability of that printer! They only advertise it as 2880x720. It's really capable of 2880x1440, and there's a real use for it!

There are still a few problems. There's still some waterfall effect in very pale areas (very pale regions near a dark boundary have some separation between the boundary and the printing), but it's much better than most error diffusion I've seen. There are some odd artifacts at 720 DPI on the CUPS test page; that's probably a discrete logic bug somewhere that should be easy to fix. And finally, there's some roughness in some light midtone range (I think about 5-10% density). Those issues notwithstanding -- and I have every confidence that they'll get fixed -- this is really something.

I don't want Gimp-print to just be the best quality free printer driver package. I want it to be the best, period. That means better than the OEM drivers. Quite a few people think that Gimp-print's quality is better than Epson's own drivers, especially in terms of color quality, but judging by what I'm seeing there's still room for improvement. If the way to get the very highest quality output is to use free software, then we suddenly have a much stronger position on the desktop. And that's good for everyone.

This probably isn't going to make 4.2.1; that's going to just fix some bugs and probably improve the situation with Canon printers. It's probably a few months of hard testing and such away from beta release; currently it only supports one kind of output (CMYK from CMY input; it also needs to support grayscale, CMY, and raw CMYK). However, if you want to try it, it's on our development CVS. Maybe I'll do 4.3.0 around the time we do 4.2.1, for people who want to experiment.

Look at the CMYK printing in my previous diary entry. That's something that can't be done with the OEM driver.

CMYK!

Yesterday I helped someone print a CMYK tiff to an Epson Stylus Pro 7000 printer under Linux.

That doesn't sound like a big deal, you say. Gimp-print has printed to Epson Stylus printers for a few years now, and who really cares about CMYK? Answer: the Stylus Pro printers are Epson's high end professional printers, and a lot of professionals like the extra control that CMYK gives them. Not to mention, Epson's own driver doesn't handle CMYK input.

We did this using CUPS. I had to fix a couple of bugs in CUPS. One of them may actually have been a bug in libtiff; the fix was a one-liner, to enable CUPS to handle the CMYK TIFF. The other one was more substantive -- CUPS was converting the CMYK to RGB, only to convert it right back to CMYK. Note that CMYK is four channels, and RGB is only three. That meant that information was getting lost. In this case, what the file in question did was print 100% C, M, Y, and K. The conversion to RGB made it 0% R, G, and B; and the conversion back made it 100% K, and 0% C, M, and Y. Not an insignificant difference.
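The information loss is easy to reproduce with the naive textbook conversions. The Python sketch below uses those formulas as an assumption; it is not the code CUPS used, and the function names are made up for illustration. All channel values are fractions between 0 and 1.

```python
def cmyk_to_rgb(c, m, y, k):
    """Naive CMYK -> RGB: black scales down whatever the colored inks leave."""
    return ((1 - c) * (1 - k), (1 - m) * (1 - k), (1 - y) * (1 - k))


def rgb_to_cmyk(r, g, b):
    """Naive RGB -> CMYK with maximum black replacement."""
    k = 1 - max(r, g, b)
    if k == 1:
        # Pure black: the colored inks are undetermined, so use none.
        return (0.0, 0.0, 0.0, 1.0)
    scale = 1 - k
    return ((1 - r - k) / scale, (1 - g - k) / scale, (1 - b - k) / scale, k)


def round_trip(c, m, y, k):
    """Push a CMYK value through RGB and back again."""
    return rgb_to_cmyk(*cmyk_to_rgb(c, m, y, k))
```

Here `round_trip(1, 1, 1, 1)` comes back as 100% K with no C, M, or Y, the same result as `round_trip(0, 0, 0, 1)`. Four channels cannot survive a trip through three, so the only real fix is to skip the RGB step entirely.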

This is seriously cool to professional users. There used to be an affordable RIP (raster image processor) called Adobe PressReady, but it never supported the high end printers, and Adobe has since discontinued it (speculation was that some of Adobe's partners, who sell much more expensive RIPs, didn't like the competition). Whatever the case may be, there's at least the germ of another way to do it. Ghostscript is, after all, a PostScript RIP (among many other things), and if it can do CMYK, it means that we, the free software community, can now start to compete in the graphic arts field.

That's not to say that we're anywhere near all the way there. Without color management, we still miss something critical -- the ability to map input colors to screen colors to output colors, and close the loop. So we're currently operating open loop. But we're making strides.

Gimp-Print 4.2 is released!

Gimp-Print 4.2 is finally released. Of course, it took longer than expected; there were the usual last minute bugs.

It has certainly turned out to be a much bigger release than I had planned on, though. My expectation was that we'd clean up a few of the nastier problems in 4.0, prior to moving on to rearchitecting it. What's actually happened is somewhere in between. We didn't redo the color and dither code from scratch, and redesign how metadata is handled, but we did considerably more work than I envisioned. So I suppose that's why it took 13 months rather than 6-9 months. Was it worth it? Probably. Is 4.2 a good release? I certainly think so.

Dave Winer doesn't care for Richard Stallman

There are a number of things about this article in Scripting News that rub me the wrong way:

  • The claim that Stallman doesn't push the envelope. Exactly what does Stallman have to do to be taken seriously as a developer? I mean, he's only been the principal architect behind multiple versions of Emacs, the driving force behind gcc, the GNU project in general (which may be an indirect derivative of UNIX, but the tools are much better), and who knows how much else that I can't think of. Does he have to produce some grandiose new piece of software, ab initio, every year? I read an article in Circuit Cellar (which focuses heavily on embedded development) that was an introduction to the GNU toolchain. It sure seems like a lot of people in the embedded space like a compiler that's identical across so many platforms.

  • Open source projects are essentially open debating societies. I guess this one really sticks in my craw, because I've put no small amount of effort into building Gimp-Print into a real project. People do real stuff, be it coding or documentation, it gets into the source base, and people work together to improve things. KDE, meanwhile, does something truly amazing: they release a couple of major versions of a whole desktop environment, with applications, each year. And let's not forget the likes of Debian. Meanwhile, I think we've all heard of or seen plenty of commercial projects dissolve into bickering.

  • His evident resentment that Stallman won the Takeda Award. I guess RMS did something that the Takeda Foundation liked enough.

I get the feeling that there are a fair number of people out there who think that if you aren't out to make money, you should stand aside for people who are. Certainly Microsoft is explicit enough about that, but then there are family members who think I'm insane for not patenting (!) Gimp-print and making a fortune (!!) out of it. I guess the thought of actually wanting to give something back to the community doesn't cross people's minds.

Free software isn't going to destroy programmers' livelihoods. Maybe it will make it harder for people who want to turn out shovelware and make a quick buck, but somehow that doesn't bother me. Programmers should applaud the fact that they don't have to keep reinventing the wheel, and can build on the work (reciprocally) others do. Instead, people keep angling for barriers they can raise against competition. The success of software is measured by how rich it can make a handful of proprietors.

It's been a while

Gimp-print 4.2 is nearing release. It's hard to believe that it's been more than a year since gimp-print 4.0 came out; I didn't expect it to take as long. It's going to be a good release; we have a nice mix of new printers (particularly Epsons, since Epson is so much more helpful with what we really need), improved quality, and improved architecture. Roger Leigh, who joined after 4.0 came out (I had found that he had done some work on an Epson utility), did a huge job creating a proper library from the core. And finally, Andy Stewart wrote a user's manual. That's one of the biggest improvements in 4.2. Pity docbook's so fragile, but it's worth it.

I would have liked to have done more architectural work, but 4.2 wasn't planned to be a major release, so we didn't expect to accomplish much. We accomplished more than we really planned, but we took a lot longer (I envisioned 6-9 months, and we're currently over 12). Some of it has been feature creep, but I don't think that that's the main problem; we've stayed within reasonable bounds. The big problem is that everything happens slowly, because nobody can really commit to anything, because paid work and such get in the way. The big architectural change happened early in the cycle, and all of the fallout was really over by early spring. Cleaning up the engineering of the documentation has taken some time, and a lot of that came in late, but that's as good a reason to hold a release as any.

We have 15 bugs open right now. The bug rate shot way up after beta-4, although some of that's simply pickier reporting on my part. There's one really critical one (more below), one that concerns me but that I don't yet know how to reproduce, two more easy fixes that I have out for review but that should never have gotten as far as they did before being found, and another one that just needs testing on a BSD platform. There are some other bugs that are definitely fixed, some that are low enough priority that we can release without them (we've planned to, in fact), and so forth. So it's not quite as bad as it looks, but we do have a few stoppers yet.

Translation of PPD files

The nasty bug involves translation of PPD files. Another thing that we've done this time around is that we're translating Gimp-Print into a number of languages. One of the core gimp-print applications is a driver for CUPS. CUPS uses PPD files, and these files should be in the correct language for the user.

The PPD files are created at compile time, not at run time, and therein lies at least part of the problem. The internationalization mechanism (GNU gettext, or anything compatible) likes to find its message catalogs in certain places, and of course at compile time they aren't there. The POSIX-standard stuff seems to only like working in terms of valid locales, and there's no guarantee that the name of a language (e.g. "sv") will be the name of a valid locale. If it is, everything works; if not, it usually doesn't. The GNU gettext has an escape hatch: if you set the LANGUAGE environment variable, it just uses that without worrying about anything else.
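The LANGUAGE lookup can be tried out with Python's standard `gettext` module, which imitates the GNU environment-variable search order. This is only an illustration of that lookup, not what our PPD generator does; the helper names and the "gimp-print" domain name are assumptions, and the catalog is an empty placeholder file, since `gettext.find` only checks that the file exists.

```python
import gettext
import os
import tempfile


def make_dummy_catalog(localedir, language, domain):
    """Create an empty placeholder where a compiled catalog would live."""
    directory = os.path.join(localedir, language, "LC_MESSAGES")
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, domain + ".mo")
    open(path, "wb").close()
    return path


def find_with_language(language, domain, localedir):
    """Look up a catalog with LANGUAGE temporarily set to `language`."""
    saved = os.environ.get("LANGUAGE")
    os.environ["LANGUAGE"] = language
    try:
        # With no explicit language list, gettext.find consults LANGUAGE
        # first, then LC_ALL, LC_MESSAGES and LANG.
        return gettext.find(domain, localedir)
    finally:
        if saved is None:
            del os.environ["LANGUAGE"]
        else:
            os.environ["LANGUAGE"] = saved
```

With a placeholder created by `make_dummy_catalog(localedir, "sv", "gimp-print")`, calling `find_with_language("sv", "gimp-print", localedir)` returns that path whether or not "sv" names a locale installed on the build machine, while a language with no catalog returns `None`.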

The standard configure package has a --with-included-gettext option to use the gettext bundled with the package (in this case, we're bundling 0.10.40). I don't really like this, since it forces a fairly important decision on the user based on something that's only needed once (at compile time), but this would probably be acceptable, if it worked. Problem is, at least on the FreeBSD machine at Sourceforge, it doesn't! Even with --with-included-gettext, even with static linking, the PPD files just don't get translated! This causes six identical copies of the PPD files to get generated. When there are 133 PPD files per language (totalling 1 MB per language compressed), and when every one of these shows up in a menu for CUPS users, it's really quite unpleasant.

It's probably quite obvious from this that I am not an expert in this particular area. If anyone is, and would like to help us out, the bug in question is here.

Gimp-print 4.0

We're on the 5th minor release of 4.0 (4.0.4). There have been two bugs that we've fixed in the core (one for the HP 1200C, and one in the color conversion code); the rest of the bugs have been in the Ghostscript or CUPS drivers. That's about what I expected; the core has had a lot of work, while the drivers haven't been exercised as much. What's annoying is that we could have caught a lot of these through better testing. There were a few fairly nasty ones.

One issue that we're considering with 4.0 is how much to back port from the development mainline. There's one particular improvement (support for C-RET mode on higher end HP inkjets) that quite a few people want, but it needs more work on the mainline. Another issue is the color quality. There are some fairly serious color issues with 4.0: reds are rather orange, blues are rather violet, and greens are somewhat straw-colored. The improvement on the mainline is somewhat ad-hoc: a hue map. This does corrections in HSB space rather than RGB or CMY, but it does a remarkably good job of correcting the gross color errors. It won't correct other errors, though.
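For readers who want to see what a hue map amounts to, here is a minimal Python sketch, assuming a piecewise-linear table of hue corrections. It is not the mainline code, and the control points are invented for illustration rather than tuned against any printer. HSB is the same model as HSV, so the standard `colorsys` module does the conversions; hue is a fraction of a full turn.

```python
import colorsys

# Hypothetical control points: (input hue, corrected hue).
# The endpoints map to themselves so the table stays continuous across red.
HUE_MAP = [(0.0, 0.0), (1 / 3, 0.36), (2 / 3, 0.64), (1.0, 1.0)]


def map_hue(hue, table=HUE_MAP):
    """Linearly interpolate the corrected hue between control points."""
    for (x0, y0), (x1, y1) in zip(table, table[1:]):
        if x0 <= hue <= x1:
            t = (hue - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)
    raise ValueError("hue must lie in [0, 1]")


def correct_rgb(r, g, b, table=HUE_MAP):
    """Adjust only the hue of an RGB color; saturation and brightness stay put."""
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return colorsys.hsv_to_rgb(map_hue(h, table), s, v)
```

Because only the hue is touched, grays pass through unchanged and a color's saturation and brightness are preserved. That is why this kind of map can fix gross errors such as orange reds but cannot correct anything else.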

Gimp-print development

I consider the hue mapping described above to be purely experimental color correction. We really need a better color model. This will presumably happen over time. Nonetheless, reaction to even this has been very positive. I've really come remarkably close to matching my display.

Andy Thaller has been putting quite a bit of work into the Canon driver. He's borrowed the ink model from the Epson driver. As long as we can find enough testers, we should be able to greatly improve that driver. It still won't match the Epson driver very soon, because Canon won't give out the kind of information we need to put the printer in the very highest quality modes. Enough people have commented on the printer vendor situation, so I'll simply reiterate: printer vendors who hand out information will find that the free software community will supply high quality drivers for their printers, while printer vendors who don't will find the world passing them by.

We're also talking with the Omni folks from IBM about their approach to the metadata problem. There's a huge amount of information that goes into printing something, and we only expose a very small part of that to users. This is unfortunate; it makes it much harder for users to tune their printout, using all of the settings that the dither and color code export. The Omni team has put a lot of effort into the architecture of their system, and not as much into quality; we've put tremendous effort into quality, but our architecture is weak. There should be a lot of possibilities for synergy here.

We're also trying to decide what 4.2 should look like. I had envisioned it as a fairly large release, converting to an internal CMY(K) model, with an external metadata representation, and the like. I'm having a bit of a change of heart about this. Perhaps it might be better to make it a fairly conservative release, doing something (not necessarily the ultimate) about the color quality, fleshing out the drivers, and such, and hold the really big stuff for another round.

gtaylor is working on the Foomatic stuff so that we can version it. We've already diverged enough from 4.0 so that the 4.0 Foomatic database won't work with the mainline. What I really want, though, is to be able to generate the foomatic data files programmatically, rather than having to download the tarball from linuxprinting.org. In addition to being slow, and bulking up the release, it's not very flexible.
