HOWTO: Unchain Yourself from Proprietary Formats
Today being Document Freedom Day, I’m taking stock of how unencumbered my digital lifestyle is, on both the consumption and the production side. I’ll try to explore alternatives for each category. But before that, one must first explore why proprietary and patent-encumbered formats are bad:
- Patents: if some entity holds patents that apply to a format, your ability to distribute your files might be compromised by the need to pay patent royalties. Even if the patent holder covenants not to enforce the patents, the holder or the patents could end up being bought, at which point who knows what could happen. Even Microsoft got into trouble with Alcatel-Lucent; the case was later thrown out of court, but only after a headline-grabbing $1.52 billion in damages was initially awarded. And Alcatel is not even a patent troll! The best protection is to use software licensed with a retroactive patent retaliation clause (e.g. Apache License, Eclipse Public License, GNU General Public License v3) and whose copyright holders and distributors are in a defensive patent pool such as the Open Invention Network.
- Format obsolescence: even NASA has had trouble reading precious sensor data from old punch cards and magnetic tapes generated by previous missions, because the documentation for the file formats has been lost!
- DRM: Digital Rights, or Restrictions, Management, depending on which side of the coin you’re looking at. It’s not impossible to create a DRM policy flexible enough to guarantee you the fair-use rights you enjoyed with older technologies (the printed book, the audio CD), but it’s not in the interest of (most) publishers and distributors to do so unless they are forced to by regulation, or unless you vote with your wallet.
- Walled gardens: remember the pre-Internet days of AOL, CompuServe, etc.? We still have walled networks; they are just built on top of the Internet instead.
Reports, presentations, … you get the drift. I try to stick to LaTeX for the first two where possible (generating read-only PDFs for dissemination), and to OpenDocument Format, authored in LibreOffice, when collaborating with someone who does not use LaTeX, and for spreadsheets.
Why, you might ask? Well, LaTeX just typesets much more beautifully than the alternatives I’ve seen. It’s a solid, well-understood format, and has had very few compatibility problems over the years. Compare that to the Microsoft Office formats, Word’s .DOC being the most notorious, with its worms and with newer versions unable to render old files perfectly! Microsoft’s “Office Open XML”, their new file format, only became a standard after a process as dubious as Japan’s sponsoring of landlocked countries to join the whaling commission to supplement its voting bloc. And the standard is not even implemented by Microsoft itself.
Most podcasts are published in MP3; some are available in the patent-free Ogg Vorbis (.ogg / .oga) format, though sadly these are mostly limited to podcasts about free / libre / open source software (FLOSS). A rare few are available in MPEG-4 Audio / AAC (.m4a).
I try to subscribe to the Ogg feeds whenever possible. MP3 is patent-encumbered, and the last patent won’t expire until 2017; while users of open-source MP3 decoders have not been sued for infringement yet, the situation is legally uncertain enough that, if you look at your favorite large Linux distribution (be it Debian, Fedora, openSUSE or Ubuntu), none of them carry MP3 decoders, let alone encoders, in their main repositories. Even Microsoft has been hit by frivolous MP3 patent infringement lawsuits!
Consuming MP3 even when an Ogg feed exists for the same podcast would perpetuate the use of MP3 “because that’s what the customers want”. Buying a portable music player that cannot play any patent-free, open-standard compressed format (yes, I am talking about Apple’s iPod here) is even worse. It does not play Ogg Vorbis, it does not play FLAC; instead, apart from MP3 you get non-free formats that Apple is heavily involved with: .m4a, the DRM-locked variant .m4p, and Apple Lossless, which Apple invented for unfathomable reasons instead of using FLAC. It’s not even because they can then engineer DRM support; you can graft it on top of FLAC as well, if you want.
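One way to find out whether a podcast actually offers Ogg episodes is to look at the MIME types of the enclosures in its feed. The snippet below is only a rough sketch using Python’s standard library: the function name, the set of MIME types treated as Ogg, and the sample feed (with its example.org URLs) are all made up for illustration, and a real feed would have to be downloaded first.

```python
import xml.etree.ElementTree as ET
from typing import List

# Assumption: MIME types a feed might use to label Ogg audio enclosures.
OGG_AUDIO_TYPES = {"audio/ogg", "application/ogg"}


def ogg_enclosure_urls(rss_xml: str) -> List[str]:
    """Return the URLs of all enclosures in an RSS feed that are labelled as Ogg."""
    root = ET.fromstring(rss_xml)
    urls = []
    # RSS 2.0 attaches media files to items via <enclosure url=... type=...>.
    for enclosure in root.iter("enclosure"):
        mime_type = enclosure.get("type", "").lower()
        url = enclosure.get("url")
        if mime_type in OGG_AUDIO_TYPES and url:
            urls.append(url)
    return urls


# Hypothetical feed, for illustration only.
SAMPLE_FEED = """<rss version="2.0"><channel><title>Example podcast</title>
<item><title>Episode 1</title>
<enclosure url="http://example.org/ep1.ogg" length="1" type="audio/ogg"/></item>
<item><title>Episode 2</title>
<enclosure url="http://example.org/ep2.mp3" length="1" type="audio/mpeg"/></item>
</channel></rss>"""

if __name__ == "__main__":
    for url in ogg_enclosure_urls(SAMPLE_FEED):
        print(url)
```

If the list comes back empty for a feed, it is worth checking whether the publisher offers a separate Ogg feed before settling for the MP3 one.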
The MP4 situation is similar to that of MP3, except that, being a newer technology, its patents will take even longer to expire.
I try to buy electronic, rather than paper, books whenever possible. With a semi-nomadic lifestyle, buying more physical books just makes moving more costly! And there’s the environmental aspect as well. It’s bad enough that we need to clear forests to grow enough food (and eating excessive amounts of meat makes matters worse, because of the energy lost in adding another intermediary between the sun and ourselves); it’s even worse when one unnecessarily gets documents in printed form. Now, used books are another matter altogether.
For eBooks, tech publishers like (in alphabetical order) No Starch Press, O’Reilly and The Pragmatic Bookshelf get the nod for providing DRM-free products, with errata updates, in the major formats (ePub, the eBook standard; Mobi, an older format supported because Amazon’s Kindle uses a DRM-encumbered version of it, .azw, and does not read ePub; and PDF, for faithful reproduction of the original layout).
Outside of programming references, alas, most publishers are not as enlightened. I must confess to being a heavy Amazon Kindle user, despite its limitations: not being able to lend my books out without restraints is one, not being able to hand over ownership is another. But at least I get to read the books on all my devices, unlike Apple’s iBooks with its five-devices-per-book limit. Kind of nice having Amazon backing up the purchased books in case I lose a device, too; they’re starting to do that for music as well, though only in the US. Hear that, Apple?
For many years, the only patent-unencumbered format available was Ogg Theora (which started its life as On2’s VP3 codec). On2 has since been bought by Google, and its latest codec, VP8, has become the basis of WebM, which is roughly equivalent in quality to MPEG-4.
I try to get my videos from sites that allow videos to be downloaded (if the uploader allows it), e.g. blip.tv and Vimeo. Revver, another service featuring this, sadly was a commercial flop and is no longer available. These sites have allowed downloads since the days when YouTube not only disallowed them, as it still does, but also limited videos to 10 minutes in length! There are workarounds for downloading YouTube videos, but officially you’re not allowed to do that.
i.e. Twitter, Facebook, etc. I have switched to posting mainly to identi.ca, a Twitter alternative running software from Status.net. Unlike Twitter, it supports federated social networking: you can talk to people running independent Status.net installations, much as, in instant messaging, users of different XMPP networks (various Jabber networks, Google Talk, LiveJournal, Facebook) can communicate with each other. identi.ca lets you push your updates to Twitter as well (even retweets and favorites) and to Facebook, so I really only post directly to Twitter if replying to a Twitter-only user (or if the topic is too mundane and I want to keep identi.ca’s SNR high).
Facebook alternatives are not as mature yet; see EFF Deeplinks’ post on this topic for more information.
Online file storage
Here’s a mea culpa: I’m a happy Dropbox user (thank you, Dropbox, for supporting Linux clients, at least on the x86 architecture). This is the exception that proves the rule, however: apart from Dropbox, I try to stick to online file storage solutions that at least have open-source clients for communicating with the servers and accessing your data. For example, Google Apps (including Google Docs) are accessible using googlecl, which uses the Google Data APIs; both are open source. If you move to a platform that is unsupported, you can port the software yourself.
With googlecl, if one is paranoid, one could encrypt all one’s files (with, say, GnuPG) before storing them on Google’s cloud. Hmm, that would be an interesting project to attempt…
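As a starting point for that project, the encryption step could look roughly like the sketch below. It simply shells out to the `gpg` command-line tool; the function names, the `.gpg` suffix convention and the example recipient are my own assumptions, and the upload itself (with googlecl or anything else) is deliberately left out.

```python
import subprocess
from pathlib import Path
from typing import List


def gpg_encrypt_command(source: Path, recipient: str) -> List[str]:
    """Build the gpg command line that encrypts `source` to `recipient`'s public key."""
    # Assumption: the encrypted copy sits next to the original, with ".gpg" appended.
    encrypted = source.with_name(source.name + ".gpg")
    return [
        "gpg",
        "--batch",   # never prompt interactively
        "--yes",     # overwrite an existing output file
        "--recipient", recipient,
        "--output", str(encrypted),
        "--encrypt", str(source),
    ]


def encrypt_for_upload(source: Path, recipient: str) -> Path:
    """Encrypt `source` with GnuPG and return the path of the encrypted copy."""
    # Requires gpg to be installed and the recipient's public key to be in the keyring.
    subprocess.run(gpg_encrypt_command(source, recipient), check=True)
    return source.with_name(source.name + ".gpg")


if __name__ == "__main__":
    # Hypothetical file name and key ID; prints the command instead of running it.
    print(" ".join(gpg_encrypt_command(Path("notes.txt"), "me@example.org")))
```

Only the resulting `.gpg` file would then be handed to the upload tool, so the storage provider never sees the plaintext. The obvious trade-off is that encrypted documents can no longer be viewed or edited in the web interface.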