unexpected Switzerland
DIY crowdfunding and bitcoin
Well, my git-annex crowdfunding campaign is halfway to its August 15th conclusion. So far it's raised more than five times what I hoped it would. I wish I could say I'm like some canny NASA engineer who intentionally sets low expectations for their Mars rover, but in both the previous Kickstarter and this campaign I've really had no idea how far it'd go. I'm glad that I'll be working on git-annex for another year.
I was particularly unsure if it'd be successful to move off Kickstarter. During the git-annex assistant Kickstarter campaign, I saw many small contributions from people who learned of it due to it being a successfully funded project, a staff pick, etc. Losing that easy network effect is a gamble.
So far I've had only half the number of contributors that I got on Kickstarter. I've basically missed out entirely on the $5 level casual contributors. On the other hand, my backers have generally been more generous (and some have been exceedingly generous). And I've avoided rewards that will cost much money, so I may end up at a similar funding level in the end!
Incidentally, I'm really enjoying getting in touch to let people know when I make their sponsored commits. There's still time to sponsor one of your own ;)
I also was curious to experiment with Bitcoin in this campaign. Partly because Paypal isn't available everywhere internationally, and takes really obnoxious percentages of transactions (though probably not as bad as Kickstarter taking its percentage followed by Amazon payments taking its percentage..) and partly because there seem to be interesting possibilities for supporting free software with Bitcoin. (Especially if any of the microtransactions on top of Bitcoin take off.)
So far 5% of backers have used Bitcoin. It's been quite strange to actually have significant amounts of bitcoins in my wallet. Wordpress has had 94 bitcoin payments in the 9 months since it started accepting them. I've had 47 payments in the two weeks my campaign has run so far. Wow!
Most of the bitcoin payments have come in via Coinbase (a few people have found my direct payment address), but of those very few were using bitcoin purchased on Coinbase. Most are probably transfers of bitcoin they already had, or perhaps bitcoin purchased on other sites.
The one technical issue I've had with using bitcoin is that Coinbase has not provided details about who sent most of the donations. Probably some of them are intentionally anonymous, but I suspect Coinbase's interface to claim incoming bitcoin transactions failed for some of them. (If you donated bitcoin and want to actually get a reward, please email me.)
By the way, I'm converting most of the bitcoins back to USD pretty quickly. I'm not interested in speculating on currency exchange rates with money that has been donated so I can accomplish a particular task..
I put up the campaign website without any means in place to handle updating it. This is because I never automate anything until I've done it at least 10 times by hand. ;) After the first trickle of donations became a flood, I quickly realized I needed at least something to handle keeping the numbers straight.
What I whipped up in an hour of coding is a system where I enter incoming payments into a hledger file and a small haskell program parses that and writes out various files that are included into the website. Amusingly the percentage calculation and display code was copied from git-annex, so part of git-annex is helping run its own fundraising campaign. The campaign video is itself hosted in a public git-annex repository, come to think of it.
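The glue described above could be sketched something like this. This is a hypothetical reconstruction, not the actual campaign code: it tallies donation lines of the form "DATE METHOD AMOUNT" and computes the percentage of a goal raised.

```haskell
-- A minimal sketch (hypothetical, not the actual campaign code):
-- parse donation entries and compute the percentage of a goal raised.
import Data.Maybe (mapMaybe)

-- Extract the amount from a "DATE METHOD AMOUNT" line, if well-formed.
parseAmount :: String -> Maybe Double
parseAmount line = case words line of
    [_date, _method, amt] -> case reads amt of
        [(n, "")] -> Just n
        _         -> Nothing
    _ -> Nothing

-- Percentage of the goal covered by all parseable entries.
percentRaised :: Double -> [String] -> Int
percentRaised goal entries =
    round (100 * sum (mapMaybe parseAmount entries) / goal)

main :: IO ()
main = print (percentRaised 300
    ["2013-07-23 paypal 25.00", "2013-07-24 bitcoin 50.00"])
```

A real version would of course parse proper hledger syntax rather than this toy line format.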
The rest of the site is built using ikiwiki. Given that it's hosted at Branchable, this is a high level of dogfooding and DIY. There are certainly better crowdfunding platforms, but all I miss in this one is automated transaction entry. And I have total flexibility, double entry accounting, and a powerful static website generator that handled being on the top of Hacker News without breaking a sweat. Oh, and some money. What's not to like?
git-annex as a podcatcher
As a Sunday diversion, I wrote 150 lines of code and turned git-annex into a podcatcher!
I've been using hpodder, a podcatcher written in Haskell. But John Goerzen hasn't had time to maintain it, and it fell out of Debian a while ago. John suggested I maintain it, but I have not found the time, and it'd be another mass of code for me to learn and worry about.
Also, hpodder has some misfeatures common to the "podcatcher" genre: it doesn't use git annex addurl to register the url where a file came from, so when I check files in with git-annex after the fact, they're missing that useful metadata, and I can't just git annex get them to re-download them from the podcast.

So, here's a rethink of the podcatcher genre:
cd annex; git annex importfeed http://url/to/podcast http://another/podcast
There is no database of feeds at all. Although of course you can check a list of them right into the same git repository, next to the files it adds. git-annex already keeps track of urls associated with content, so it reuses that to know which urls it's already downloaded. So when you're done with a podcast file and delete it, it won't download it again.
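The dedup idea could be sketched like so. This is a toy illustration with made-up names, not git-annex's actual code: given the set of urls already on record, only the unseen enclosure urls from a feed need fetching.

```haskell
-- Toy sketch (hypothetical names) of the url dedup idea: urls that
-- git-annex already has on record are skipped, so deleted files are
-- not downloaded again.
import qualified Data.Set as S

urlsToFetch :: S.Set String -> [String] -> [String]
urlsToFetch known = filter (`S.notMember` known)

main :: IO ()
main = print (urlsToFetch (S.fromList ["http://pod/ep1.ogg"])
    ["http://pod/ep1.ogg", "http://pod/ep2.ogg"])
```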
This is a podcatcher that doesn't need to actually download podcast files!
With --fast, it only records the existence of files in git, so git annex get will download them from the web (or perhaps from a nearer location that git-annex knows about).
Took just 3 hours to write, and that's including full control over the filenames it uses (--template='${feedtitle}/${itemtitle}${extension}'), and automatic resuming of interrupted downloads. Most of what I needed was already available in git-annex's utility libraries or Hackage.
Technically, the only part of this that was hard at all was efficiently querying the git repository for a list of all known urls. I found a pretty fast way to do it, but might add a local cache file later on.
guesting on GitMinutes
Last Friday, I spent an hour and a half clamping a landline phone to the side of my head, while also wearing a headset. I was recording an interview on the GitMinutes podcast about git-annex.
I've been listening to GitMinutes for a while, ever since I heard git-annex mentioned on it. Actually, I think it's come up in 4 or 5 interviews on the podcast, most notably with core git developer Jeff King, who had some interesting things to say, for sure. I responded to that in my interview, and we covered a wide range of material in reasonable depth in just over an hour. Thomas is quite a good host and great at drawing stuff out, and it's nice to not need to worry about going into too much technical depth. (Although I didn't get a chance to explain the automatic union merging used to maintain the git-annex branch.)
This is the first podcast I've been in, and I've always worried about audio quality if I was in one. That is, I'd want it to be really good, and probably end up annoying the host. ;) For this one, we settled on using my land line call, which went through some Skype thing to get to Europe, and mixing in a local recording I made with a not too great headset. I think the result is pretty good, considering.
You can listen to the whole thing here, if you dare! (1 hour 8 minutes) http://episodes.gitminutes.com/2013/07/gitminutes-16-joey-hess-on-git-annex.html
(Special bonus guest: The songbird that lives on my porch.)
git-annex fundraising campaign update: Initial goal reached in a mere seven hours. I will be developing git-annex fulltime for at least the next three months! On to the stretch goal.
Also it made the top of Hacker News: thread
New git-annex crowdfunding campaign
Having reached the end of my Kickstarter funded year working on the git-annex assistant, I've decided to try one more crowdfunding campaign, to see if I can get funded to work on it a little while longer. I went back and forth on this for a while. The Kickstarter funded development was extremely successful (one of the most productive years of my life). I certainly want to work on git-annex more, and have lots more stuff to do, particularly around security and encryption. On the other hand, it's hard to frame ongoing development as a normal Kickstarter campaign to start something new.
Anyway, I've decided to go ahead and try it, and not do it through Kickstarter this time. So I have my own website set up and accepting donations, and hopefully I'll make enough to spend a few more months working on git-annex.
... And if not, I'll probably spend a few months working on git-annex part time, while looking for paying work with the rest of my time.
By the way, I'm taking payments in both US Dollars (via Paypal) and Bitcoin (via Coinbase). Can't wait to see how this works out!
git annex and my mom
[I'm encouraging git-annex users to post their success stories, and this one is my own.]
I set up git-annex on my mom's and sisters' computers a couple of months ago. I noticed this was the first software I've written that I didn't have to really explain how to use. All I told them was, put files in this folder, and the rest of us will be able to see them. Don't put anything too big in there, or anything you don't want others to see.
I paired the computers using XMPP, and set up an encrypted transfer repository using a free account rsync.net gave me for beta testing. I also added a repository on my server, which made things more robust. (XMPP has since improved, but it's still a good idea to have a git repository to supplement XMPP.) I also have two removable drives that are used to back up our files.
This was all set up using the webapp. And adding a computer takes just a couple of minutes that way. I set it up at my sister's in a spare moment during a visit, and it all just worked.
Our shared git annex contains a couple of hundred files, and is a couple of gigabytes in size. And growing pretty fast as we find things we want to share. Mostly photos and videos so far but I won't be surprised to find poems and books pop up in there from the family's poets and authors. And it'll grow further as I add people who've so far been left out.
Coming home from a week at the beach with my grand nephew and niece was the first time I really used git-annex without thinking about it. Collapsed on a hotel bed, I plugged in my camera and loaded in the trip's photos, only to see the hotel wifi cost extra. Urk, no! Later, in the lobby, I found an open wifi network, and watched it automatically sync up.
By the time I was home, the video of cute kids playing weathermen and reporting on our near miss by a tropical storm had been enjoyed by the folks who didn't make that family gathering.
little disasters
Interesting times.. While the big disasters are ongoing, little ones have been spicing up my life lately.
A pleasant week by the beach ended with a tropical storm passing over the beach house. I've never experienced this before, and though Andrea was diminished by passing over land, it was still more wind than I've ever seen. I love wind, and this was thrilling, right on the edge of danger but not quite there. At least, if you have the sense to stay out of the water. Leaving the beach, I heard of someone who tried to go surfing that day, and drowned.
The night before last, I was startled to find nearly an inch of water seeping up from underneath the tile floor of the kitchen. Probably it has something to do with the pressure tank pumping system, which was repaired while I was away, and means I actually have indoor running water here. (Overrated.) This saw me scrambling to close every water valve, and out with a flashlight at 2 am closing the cutoff at the 1000 gallon water reservoir before it all drained into the house. While sopping up dozens of gallons of water from the floor at 3 am probably doesn't sound like fun, I found myself going through the motions elatedly.. Because this means I finally am coming to understand the source of the damp that infests the most earth-sheltered corner of this house. It's not condensation. It's bad plumbing!
Then yesterday, I went out to try a dip in the river, stopped by the neighborhood eatery and bait shop, and ended up sitting out on the back deck eating ribs and listening to a band with "possum playboys" in their name (which makes the full name fairly irrelevant), while looking out over the river and the old-timey green metal bridge. Which was unexpected fun, and the kind of thing you have to take in when it happens, but getting stuck in a newly installed hole in my driveway was not. My car was spinning, and I gave up and called it a night.
Here's the thing. I could feel my brain working on this stupid "underpowered car is stuck in a small rut" issue all night long. Same mental pathways activating that chew over bugs and design issues. Got up this morning with a set of plans and contingency plans all ready to go. The first one, of jacking it up and putting something under the tire, was stymied; it seems I am missing a jack. But the second, of digging out all around the tire, and then filling in with gravel and cat litter (a tip from some offroading website I blearily surfed last night), and then riding the gas while releasing the brake, worked great.
All of which is to say, bring 'em on! But I still prefer my disasters in the form of software bugs.
faster dh
With wheezy released, the floodgates are opened on a lot of debhelper changes that have been piling up. Most of these should be pretty minor, but I released one yesterday that will affect all users of dh. Hopefully in a good way.
I made dh smarter about selecting which debhelper commands it runs. It can tell when a package does not use the stuff done by a particular command, and skips running the command entirely. So the debian/rules binary of a package using dh will now often look like this:
dh binary
   dh_testroot
   dh_prep
   dh_auto_install
   dh_installdocs
   dh_installchangelogs
   dh_perl
   dh_link
   dh_compress
   dh_fixperms
   dh_installdeb
   dh_gencontrol
   dh_md5sums
   dh_builddeb
Which is pretty close to the optimal hand-crafted debian/rules file (and just about as fast, too). But with the benefit that if you later add, say, cron job files, dh_installcron will automatically start being run too.
Hopefully this will not result in any behavior changes, other than packages building faster and with less noise. If there is a bug it'll probably be something missing in the specification of when a command needs to be run.
Beyond speed, I hope that this will help to lower the bar to adding new commands to debhelper, and to the default dh sequences. Before, every such new command slowed things down and was annoying. Now more special-purpose commands won't get in the way of packages that don't need them.
The way this works is that debhelper commands can include a "PROMISE" directive. An example from dh_installexamples:

# PROMISE: DH NOOP WITHOUT examples
Mostly this specifies the files in debian/ that are used by the command, and whose presence triggers the command to run. There is also a syntax to specify items that can be present in the package build directory to trigger the command to run.
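The core of the idea can be sketched in a few lines. This is a simplified toy model (the real implementation is Perl inside debhelper, and handles more syntax than this; all names here are made up): parse the trigger names after "DH NOOP WITHOUT", and skip the command when none of the listed debian/ files are present.

```haskell
-- Simplified sketch of PROMISE handling (hypothetical names; the real
-- code is Perl and richer): a command can be skipped when none of its
-- declared trigger files exist for the package.
import Data.List (stripPrefix)

-- Pull the trigger names out of a PROMISE directive line, if present.
promiseTriggers :: String -> Maybe [String]
promiseTriggers line =
    words <$> stripPrefix "# PROMISE: DH NOOP WITHOUT " line

-- Skip only when there is a PROMISE and none of its triggers are present.
canSkip :: [FilePath] -> String -> Bool
canSkip present line = case promiseTriggers line of
    Just triggers -> not (any (`elem` present) triggers)
    Nothing       -> False  -- no PROMISE, so always run the command

main :: IO ()
main = print (canSkip ["docs", "changelogs"]
    "# PROMISE: DH NOOP WITHOUT examples")
```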
(Unfortunately, dh_perl can't use this. There's no good way to specify when dh_perl needs to run, short of doing nearly as much work as dh_perl would do when run. Oh well.)
Note that third-party dh_ commands can include these directives too, if that makes sense.
I'm happy how this turned out, but I could be happier about the implementation. The PROMISE directives need to be maintained along with the code of the command. If another config file is added, they obviously must be updated. Other changes to a command can invalidate the PROMISE directive, and cause unexpected bugs.
What would be ideal is to not repeat the inputs of the command in these directives, but instead write the command such that its inputs can be automatically extracted. I played around with some code like this:
$behavior = main_behavior("docs tmp(usr/share/doc/)", sub {
	my $package=shift;
	my $docs=shift;
	my $docdir=shift;
	install($docs, $docdir);
});
$behavior->($package);
But refactoring all debhelper commands to be written in this style would be a big job. And I was not happy enough with the flexibility and expressiveness of this to continue with it.
I can, however, dream about what this would look like if debhelper were written in Haskell. Then I would have a Debhelper a monad, within which each command executes.
main = runDebhelperIO installDocs

installDocs :: Monad a => Debhelper a
installDocs = do
	docs <- configFile "docs"
	docdir <- tmpDir "usr/share/doc"
	lift $ install docs docdir
To run the command, runDebhelperIO would loop over all the packages and run the action, in the Debhelper IO monad.
But, this also allows making an examineDebhelper that takes an action like installDocs, and runs it in a Debhelper Writer monad. That would accumulate a list of all the inputs used by the action, and return it, without performing any side-effecting IO actions.
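A crude, runnable approximation of that two-interpreter idea (all names here are hypothetical, and this flattens the monadic design into a plain record): describe a command as both its list of config-file inputs and its action, so a driver can either run it, or just ask what it reads without performing any IO.

```haskell
-- Toy approximation of "run it, or examine it": a command carries both
-- its declared inputs and its side-effecting action. Hypothetical names,
-- much less expressive than the monadic version sketched above.
data Command = Command
    { inputsOf :: [FilePath]         -- config files the command depends on
    , runIt    :: FilePath -> IO ()  -- the side-effecting part
    }

installDocsCmd :: Command
installDocsCmd = Command
    { inputsOf = ["docs"]
    , runIt = \tmp ->
        putStrLn ("install docs into " ++ tmp ++ "/usr/share/doc")
    }

main :: IO ()
main = print (inputsOf installDocsCmd)  -- examine without running
```

The weakness of the record version is exactly why the monadic one is appealing: in the record, the inputs list and the action can drift apart, just like PROMISE directives and command code can.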
It's been 15 years since I last changed the language debhelper was written in. I did that for smaller gains than this, really. (The issue back then was that shell getopt sucked.) IIRC it was not very hard, and only took a few days. Still, I don't really anticipate reimplementing debhelper in Haskell any time soon.
For one thing, individual Haskell binaries are quite large, statically linking all Haskell libraries they use, and so the installed size of debhelper would go up quite a bit. I hope that forthcoming changes will move things toward dynamically linked Haskell libraries, and make it more appealing for projects that involve a lot of small commands.
So, just a thought experiment for now..
the #newinwheezy game: STM
Debian wheezy includes a bunch of excellent new Haskell libraries. I'm going to highlight one that should be interesting to non-Haskell developers, who may have struggled with writing non-buggy threaded programs in other languages: libghc-stm-dev
I had given up on most threaded programs before learning about Software Transactional Memory. Writing a correct threaded program, when multiple threads need to modify the same state, requires careful use of locking. In my experience, locking is almost never gotten right the first time.
A real life example I encountered is an app that displays a queue of files to be downloaded, and a list of files currently downloading. Starting a new download would go something like this:
startDownload = do
	file <- getQueuedFile
	push file currentDownLoads
	startDownloadThread file
But there's a point in time in which another thread, that refreshes the display, could then see an inconsistent state, where the file is in neither place. To fix this, you'd need to add lock checking around all accesses to the download queue and current downloads list, and lock them both here. (And be sure to always take the locks in the same order!)
But, it's worse than that, because how is getQueuedFile implemented? If the queue is empty, it needs to wait on a file being added. But how can a file be added to the queue if we've locked it in order to perform this larger startDownload operation? What should be really simple code has become really complex juggling of locks.
STM deals with this in a much nicer way:
startDownload = atomically $ do
	file <- getQueuedFile
	push file currentDownLoads
	startDownloadThread file
Now the two operations are performed as one atomic transaction. It's not possible for any other thread to see an inconsistent state. No explicit locking is needed.
And, getQueuedFile can do whatever waiting it needs to, also using STM. This becomes part of the same larger transaction, in a way that cannot deadlock. It might be implemented like this:
getQueuedFile =
	if empty downloadQueue
		then retry
		else pop downloadQueue
When the queue is empty and this calls "retry", STM automatically waits for the queue to change before restarting the transaction. So this blocks until a file becomes available. It does it without any locking, and without you needing to explicitly tell STM what you're waiting on.
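That retry pattern can be seen working in a small, runnable sketch using the real stm library (Control.Concurrent.STM). Here the queue is just a TVar [FilePath], a simplification of the example above: a worker thread blocks on the empty queue via retry, and wakes up when the main thread adds a file.

```haskell
-- Runnable sketch of STM retry: a worker blocks on an empty queue and
-- is woken, with no explicit locks or condition variables, when the
-- main thread writes to the TVar it read.
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.STM

-- Pop a file from the queue, blocking (via retry) while it is empty.
popQueued :: TVar [FilePath] -> STM FilePath
popQueued q = do
    fs <- readTVar q
    case fs of
        []       -> retry  -- transaction restarts when q changes
        (f:rest) -> writeTVar q rest >> return f

main :: IO ()
main = do
    q <- newTVarIO []
    done <- newTVarIO Nothing
    _ <- forkIO $ do
        f <- atomically (popQueued q)      -- blocks: queue is empty
        atomically (writeTVar done (Just f))
    threadDelay 100000                     -- let the worker block first
    atomically (writeTVar q ["episode1.ogg"])
    f <- atomically (readTVar done >>= maybe retry return)
    putStrLn f
```

Note that the delay is only there to make the blocking visible; the result is the same whichever thread runs first, which is rather the point of STM.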
I find this beautiful, and am happier with it the more I use it in my code. Functions like getQueuedFile that run entirely in STM are building blocks that can be snapped together without worries to build more and more complex things.
For non-Haskell developers, STM is also available in Clojure, and work is underway to add it to gcc. There is also Hardware Transactional Memory coming, to speed it up. Although in my experience it's quite acceptably fast already.
However, as far as I know, all these other implementations of STM leave developers with a problem nearly as thorny as the original problem with locking. STM inherently works by detecting when a change is made that conflicts with another transaction, throwing away the change, and retrying. This means that code inside a STM transaction may run more than once.
Wait a second.. Doesn't that mean this code has a problem?
startDownload = atomically $ do
	file <- getQueuedFile
	push file currentDownLoads
	startDownloadThread file
Yes, this code is buggy! If the download thread is started, but then STM restarts the transaction, the same file will be downloaded repeatedly.
The C, Clojure, etc, STM implementations all let you write this buggy code.
Haskell, however, does not. The buggy code I showed won't even compile. The way it prevents this involves, well, monads. But essentially, it is able to use type checking to automatically determine that startDownloadThread is not safe to put in the middle of an STM transaction. You're left with no choice but to change things so the thread is only spawned once the transaction succeeds:
startDownload = do
	file <- atomically $ do
		f <- getQueuedFile
		push f currentDownLoads
		return f
	startDownloadThread file
If you appreciate that, you may want to check out some other #newinwheezy stuff like libghc-yesod-dev, a web framework that uses type checking to avoid broken urls, and also makes heavy use of threading, so is a great fit for using with STM. And libghc-quickcheck2-dev, which leverages the type system to automatically test properties about your program.