Format(ting?) of Forever
Mark Pilgrim had a great post1 a little while ago where he talked about Docbook as ‘The Format of Forever’, but HTML as the ‘Format of Now.’ He also argued that (since technical books were constantly outdated) generating technical books in the Format of Now instead of the Format of Forever made a lot of sense.
I’m working on a project that I’d like to see as a long-term, Format of (nearly) Forever kind of work. Specifically, it is my grandfather’s autobiography, which I’d like to see as a long-term enough work that I can give it to my own grandkids some day. As a result, I’ve been wrestling on and off with two questions: (1) what is the right ‘Format of Forever’ and (2) once you’ve chosen that source format, what is the best ‘Output Format of Now’? Thoughts welcome in comments; my own mumblings below.
Grandpa, of course, wrote in the ultimate in formats of forever: typewriter. I scanned and OCRed it shortly after he passed away using the excellent gscan2pdf2, and have been slowly collecting other materials to use to supplement what he wrote – mostly pictures and scans of his Apollo memorabilia, but also family photos, like Grandpa’s Grandpa, Lewis Hannum, pictured above.
I’ve converted that to what I think may be the right ‘Format of Forever’: pandoc markdown, plus printed, easily re-scannable hard-copy. I’m thinking that markdown is the right source for a couple of reasons. Primarily: plain, simple ASCII text is hard to beat for future-proofing. Markdown is also easier to edit than HTML3.
The downside with markdown is that, while markdown is terrific for a very simple document (like grandpa’s writing is) I’d like to experiment with some slightly non-traditional media inclusion. For example, it would be nice to include an audio recording of my brother at the 1982 Columbia Shuttle launch, or a scan of Grandpa’s patent. Markdown has some facilities for including other files, but they appear to be non-standard (i.e., each post-processor handles them differently). Even image inclusion and basic formatting often feels wonky. HTML would make me happier in that direction, I suspect. And of course styling the output is a pain, though I think I have various ideas on how to do that.
- vanished since I originally drafted this, but link kept for reference
- Which, for the record, was roughly 1,000 times better than Canon’s bundled scanning crapware.
- which is sort of pathetic; how come we still don’t have a decent simple HTML editor?