Older blog entries for joey (starting at number 494)

I work for The Internet now

I have an interesting problem: How do I shoehorn "hired by The Internet for a full year to work on Free Software" into my resume?

Yes, the git-annex Kickstarter went well. :) I had asked for enough to get by for three months. Multiple people thought I should instead work on it for a full year and really do the idea justice. Jason Scott was especially enthusiastic about this idea. So I added new goals and eventually it got there.

graph: up and to the right, to 800% funded

Don Marti thinks the success of my Kickstarter validates crowdfunding for Free Software. Hard to say; this is not the first Free Software to be funded on Kickstarter. Remember Diaspora?

Here's what I think worked to make this a success:

  • I have a pretty good reach with this blog. I reached my original goal in the first 24 hours, and during that time, probably 75% of contributions were from people I know, people who use git-annex already, or people who probably read this blog. Oh, and these contributors were amazingly generous.

  • I had a small, realistic, easily acheivable goal. This ensured my project was quickly visible in "successful projects" on Kickstarter, and stayed visible. If I had asked for a year up front, I might not have fared as well. It also led to my project being a "Staff Pick" for a week on Kickstarter, which exposed it to a much wider audience. In the end, nearly half my funding came from people who stumbled over the project on Kickstarter.

  • The git-annex assistant is an easy idea to grasp, at varying levels of technical expertise. It can be explained by analogy to DropBox, or as something on top of git, or as an approach to avoid to cloud vendor lockin. Most of my previous work would be much harder to explain to a broad audience in a Kickstarter. But this still appeals to very technical audiences too. I hit a sweet spot here.

  • I'm enhancing software I've already written. This made my Kickstarter a lot less vaporware than some other software projects on Kickstarter. I even had a branch in git where I'd made sure I could pull off the basic idea of tying git-annex and inotify together.

  • I put in a lot of time on the Kickstarter side. My 3 minute video, amuaturish as it is, took 3 solid days work to put together. (And 14 thousand people watched it... eep!) I added new and better rewards, posted fairly frequent updates, answered numerous questions, etc.

  • I managed to have at least some Kickstarter rewards that are connected to the project is relevant ways. This is a hard part of Kickstarter for Free Software; just giving backers a copy of the software is not an incentive for most of them. A credits file mention pulled in a massive 50% of all backers, but they were mostly causual backers. On the other end, 30% of funds came from USB keychains, which will be a high-quality reward and has a real use case with git-annex.

    The surprising, but gratifying part of the rewards was that 30% of funds came from rewards that were essentially "participate in this free software project" -- ie, "help set my goals" and "beta tester". It's cool to see people see real value in participating in Free Software.

  • I was flexible when users asked for more. I only hope I can deliver on the Android port. Its gonna be a real challange. I even eventually agreed to spend a month trying to port it to Windows. (I refused to develop an IOS port even though it could have probably increased my funding; Steve's controlling ghost and I don't get along.)

  • It seemed to help the slope of the graph when I started actually working on the project, while the Kickstarter was still ongoing. I'd reached my original goal, so why not?

I've already put two weeks work into developing the git-annex assistant. I'm blogging about my progress every day on its development blog. The first new feature, a git annex watch command that automatically commits changes to git, will be in the next release of git-annex.

I can already tell this is going to be a year of hard work, difficult problems, and great results. Thank you for helping make it happen.

Syndicated 2012-06-19 16:14:36 from see shy jo

going to DebConf 12

going to Debconf button

Nicaragua here I come! I plan to be at DebCamp for a few days too, taking advantage of some time on internet-better-than-dialup to work on fixing the tasks on the CDs. Also will be available for one-on-one sessions on debhelper, debconf, debian-installer development, pristine-tar, or git-annex. Give me a yell if you'd like to spend some time learning about any of these.

Also, my git-annex Kickstarter ends in 3 days. It has reached heights that will fund me, at a modest rate, for a full year of development!

(I'm also about to start renting my house for the first time. Whee!)

Syndicated 2012-06-08 19:18:57 from see shy jo

a year of olduse.net

My olduse.net exhibit has now finished replaying the first year of historical Usenet posts from 30 years ago, in real time.

That was only twelve thousand messages, probably less than many people get on Facebook in a year.. but if you read along this year, you probably have a much better feel for what early Usenet was like. If you didn't, it's not too early to start, 9 years of Usenet's flowering lie ahead.

I don't know how many people followed along. I read .. not every message, but a large fraction of them. I see 40 or so unique NNTP connections per day, and some seem to stick around and read for quite a while, so I'm guessing there might be a few hundred people following on a weekly basis.

If you're not one of them, and don't read our olduse.net blog, here are a few of the year's highlights:

  • After announcing the exhibit it hit all the big tech sites like Hacker News and Slashdot, and Boing Boing. A quarter million people loaded the front page. For a while every one of those involved a login to my server to run tin. I have some really amusing who listings. Every post available initially was read an average of 100 times, which was probably more than they were read the first go around.
  • Beautiful ascii art usenet maps were posted as the network mushroomed; I collected them all. In a time-delayed collaboration with Mark, I produced a more modern version.
  • We watched sites struggle with the growth of usenet, and followed along as B-news was developed and deployed.
  • In 2011, Dennis Richie passed. We still enjoy his wit and insight on olduse.net.
  • We enjoyed the first posts of things like The Hacker's Dictionary and DEC Wars.
  • Some source code posted to usenet still compiles. I haven't found anything to add to Debian, yet. :)

I will probably be expiring the first year's messages before too long, to have a more uncluttered view. So if you wanted to read them, hurry up!

Syndicated 2012-06-03 18:10:53 from see shy jo

the rocket

The best thing about rockets is when they go up successfully, and don't come back down.


My git-annex Kickstarter reached escape velocity 24 hours after launch, and is now past 200% funded.

This is great news, because it gives me more time to spend hacking on git-annex, and will let me add more features and polish it better. I had set the goal at a minimal amount because I'd have hated not to get funded at all if this hadn't taken off. But the current funding is much more comfortable, and the further up it goes, the more scope the project can have.


At the point shown above, the project had started to be highlighted as popular, based largely on kind and generous readers of this blog and git-annex users who chipped in. For a while, I personally knew around 1/3rd of contributors.

That was enough to get it noticed by Kickstarter staff, which in turn has led to more growth. I love the juxtaposition of geeky tech project with other stuff here.


I have 17 days remaining until the Kickstarter is finished. I really had worried it might take that long to get funded. As it is, I can't wait to see where the rocket's trajectory takes it in that time!

Syndicated 2012-05-26 03:29:40 from see shy jo

kickstarter for git-annex assistant

I'm doing a Kickstarter! The plan is to take git-annex and use it as the foundation to build a DropBox-like application, that can be used without any knowledge of git. The other part of the plan is for me to get through the summer with my finances intact. :)


My Kickstarter page explains in more detail, but the basic idea is to make git-annex watch a directory, with inotify, and automatically add and sync files placed there. Then build a local web app to control and configure it. I have a working prototype of the inotify part already; getting the distributed syncronisation to work well will be a major challenge; and overall it's going to be a challanging but I think very achievable project.

One thing I'm really excited about is the potential of Kickstarter to connect me with a diverse group of potential users, who are interested enough in a distributed and autonomous equivilant to DropBox to fund my project. Because, with luck, their involvement won't stop at giving some money, but will extend to actually using what I'm developing, and giving me feedback on it.

I hope this will widen the sphere of git-annex users, beyond the current very technically inclined crowd. It's awesome to have users doing things like taking git-annex on the transiberian railroad to help manage all their photos, and others using it to store scientific data ... but getting more regular folks using git, even if they don't know about it, is something I'm very interested in.

xkcd-esque drawing

BTW, I was surprised how much I got into doing the video for the Kickstarter. Using xkcd-style drawings used to explain technical concepts is a medium I enjoy working in. My own git-annex is stuffed with 17 gigabytes of video clips -- 99 separate takes! -- that boiled down to a 2 minute video.

Anyway, please consider backing me if you can, and more importantly, do anything you can to help spread the word. Thanks!

Syndicated 2012-05-23 16:17:14 from see shy jo

popcon graphs for tasks

Last year I was able to switch tasksel to using metapackages, instead of the weird non-package task things that had been used before Debian supported Recommends fields well.

An unanticipated result of the new task packages is that I have this nice popcon data available for them, so can get graphs like these.

For new installs of testing, KDE and Xfce are neck and neck. With Gnome being the default, it's hard to say which desktop users really prefer. My feeling is that it's probably nearly evenly split now.

(I installed Xfce on my sister's laptop last week, and anticipate moving all my family to it, rather than Gnome 3.)

The above graph also shows a surprisingly large number of ssh server task installs. In fact, it's the most often manually installed task. Probably many of those are server machines, and so I'm considering having tasksel automatically select ssh on systems where it doesn't automatically select a desktop.

Language data is also available. Taskel uses language tasks internally, without exposing an interface, so this will be almost entirely users who did an install of testing localised to their language.

Interesting data can be teased out of this too. For example there seem more installs in Catalan than Chinese ... and at least 10 Esperanto users. (As with any popcon number, this is a lower bound, to be multiplied by the scaling guesstimate of your choice.)

By the way, I've got a new vanity domain for my blog and wiki: http://joeyh.name/

The old http://kitenet.net/~joey/ will continue to work, like it has since 1997. But the new is easier to type. And it let me move my site to Branchable, at last.

Syndicated 2012-05-12 22:20:08 from see shy jo

moving my email archives and packages to git-annex

I've recently been moving some important data into git-annex, and finding it simplifies things while also increasing my flexability.

email archives

I've kept my email archives in git for years. This works ok, just choose the right file format (compressed mbox) and number of files (one archive per mailbox per month or so) and git can handle this well enough, as email is not really large.

But, email is not really small either. Keeping my email repository checked out on my netbook consumes 2 gigabytes of its 30 gigabyte SSD, half of which is duplication in .git. Also, I have only kept it at 2 gigabytes through careful selection of what classes of mail I archive. That made sense when archival disk was more expensive, but what makes sense these days is to archive everything.

For a while I've wanted to have a "raw" archive, that includes all email I receive. (Even spam.) This protects against various disasters in mail filtering or reading. Setting that up was my impetus for switching my mail archives to git-annex today.

The new system I've settled on is to first copy all incoming mail into a "raw" maildir folder. Then mailfilter sorts it into the folders I sync (with offlineimap) and read. Each day, the "raw" folder is moved into a mbox archive, and that's added to the git annex. Each month, the mail I've read is moved into a monthly archives, and added to the git annex. A simple script does the work.

I counted the number of copies that existed of my mail when it was stored in git, and found 7 copies spread amoung just 3 drives. I decided to slim that back, and configured git-annex to require only 5 copies. But those 5 copies will spread amoung more drives, including several offline archival drives, so it will be more robust overall.

My netbook will have an incomplete checkout of my mail, omitting the "raw" archive. If I need to peek inside a spam folder for a lost mail, I can quickly pull it down; if I need to free up space I can quickly drop older archives. This is the flexability that git-annex fans love. :)

By the way, this also makes it easier to permanantly delete mail, when you really need to (ie, for contractual reasons). Before, I'd have to do a painful git-filter-branch if I needed to get rid of eg, mail for old jobs. Now I can git annex drop --force.

Pro Tip: If you're doing this kind of migration to git-annex, you can save bandwidth by not re-transferring files to machines that already have a copy. I ran this command on my netbook to inject the archives it had in the old repository into the new repository, verifying checksums as it goes:

  cd ~/mail/archive; find -type l -exec git annex reinject ~/mail.old/archive/{} {} \;

debian packages

I'd evolved a complex and fragile chain of personal apt repositories to hold Debian packages I've released. I recently got rid of the mess, which looked like this: dput → local mini-dinstall repo → dput→mini-dinstallrepo on my server →dput` → Debian

The point of all that was that I could "upload" a package locally while offline and batch transfer it later. And I had a local and a public apt repository of just the packages I've uploaded. But these days, packages uploaded to Debian are available nearly immediately, so there's not much reason to do that.

My old system also had a problem: It only kept the most recent single copy of each package. Again, disk is cheap, so I'd rather have archives of everything I have uploaded. Again I switched to git-annex.

My new system is simplicity itself. I release a package by checking it into a "toupload" directory in my git annex repository on my netbook. Items in that directory are dput to Debian and moved to "released". I have various other clones of that repository, which I git annex move packages to periodically to free up SSD space. In the rare cases when I build a package on a server, I check it into the clone on the server, and again rely on git-annex to copy it around.

Now, does anyone know a good way to download a copy of every package you're ever released from archive.debian.org? (Ideally as a list of urls I can feed to git annex addurl.)


My email and Debian packages were the last large files I was not storing in git-annex. Even backups of my backups end up checked into git-annex and archived away.

Now that I'm using git-annex in every place I can, my goal with it is to make it as easy as possible for as many of you to use it as possible, too. I have some inotify tricks up my sleeve that seem promising. Kickstarter may be involved. Watch this space!

Syndicated 2012-04-22 20:11:38 from see shy jo

Stand by the grey stone when the thrush knocks

Today, map in hand, I explored the "long valley, narrower than the great dale in the South where the Gates of the river stood, and walled with lower spurs of the Mountain".

"The dangerous search on the western slopes for the secret door"
"It seemed as if darkness flowed out like a vapour from the hole in the mountain-side" "They spoke low and never called or sang, for danger brooded in every rock."
"It is almost dark so that its vastness can only be dimly guessed, but rising from the near side of the rocky floor there is a great glow. The glow of Smaug!"

Syndicated 2012-04-09 19:38:20 from see shy jo

ls: the missing options

I'm honored and pleased to be the person who gets to complete ls. This project, begun around when I was born, was slow to turn into anything more than a simple for loop over a dirent. It really took off in the mid and late 80's, when Richard Stallman added numerous features, and the growth has been steady ever since. But, a glance at the man page shows that ls has never quite been complete. It fell to me to finish the job, and I have produced several handy patches to this end:

The only obvious lack now is a -z option, which should make output filenames be NULL terminated for consuption by other programs. I think this would be easy to write, but I've been extermely busy IRL (moving lots of furniture) and didn't get to it. Any takers to write it?

Due to the nature of these patches, they conflict with each other. Here's a combined patch suitable to be applied and tested.

diff -ur orig/coreutils-8.13/src/ls.c coreutils-8.13/src/ls.c
--- orig/coreutils-8.13/src/ls.c    2011-07-28 06:38:27.000000000 -0400
+++ coreutils-8.13/src/ls.c 2012-04-01 12:41:56.835106346 -0400
@@ -270,6 +270,7 @@
 static int format_group_width (gid_t g);
 static void print_long_format (const struct fileinfo *f);
 static void print_many_per_line (void);
+static void print_jam (void);
 static size_t print_name_with_quoting (const struct fileinfo *f,
                                        bool symlink_target,
                                        struct obstack *stack,
@@ -382,6 +383,7 @@
    many_per_line for just names, many per line, sorted vertically.
    horizontal for just names, many per line, sorted horizontally.
    with_commas for just names, many per line, separated by commas.
+   jam to fit in the most information possible.
    -l (and other options that imply -l), -1, -C, -x and -m control
    this parameter.  */
@@ -392,7 +394,8 @@
     one_per_line,      /* -1 */
     many_per_line,     /* -C */
     horizontal,            /* -x */
-    with_commas            /* -m */
+    with_commas,       /* -m */
+    jam            /* -j */
 static enum format format;
@@ -630,6 +633,11 @@
 static bool immediate_dirs;
+/* True means when multiple directories are being displayed, combine
+ * their contents as if all in one directory. -e */
+static bool entangle_dirs;
 /* True means that directories are grouped before files. */
 static bool directories_first;
@@ -705,6 +713,10 @@
 static bool format_needs_type;
+/* Answer "yes" to all prompts. */
+static bool yes;
 /* An arbitrary limit on the number of bytes in a printed time stamp.
    This is set to a relatively small value to avoid the need to worry
    about denial-of-service attacks on servers that run "ls" on behalf
@@ -804,6 +816,7 @@
   {"escape", no_argument, NULL, 'b'},
   {"directory", no_argument, NULL, 'd'},
   {"dired", no_argument, NULL, 'D'},
+  {"entangle", no_argument, NULL, 'e'},
   {"full-time", no_argument, NULL, FULL_TIME_OPTION},
   {"group-directories-first", no_argument, NULL,
@@ -849,12 +862,12 @@
 static char const *const format_args[] =
   "verbose", "long", "commas", "horizontal", "across",
-  "vertical", "single-column", NULL
+  "vertical", "single-column", "jam", NULL
 static enum format const format_types[] =
   long_format, long_format, with_commas, horizontal, horizontal,
-  many_per_line, one_per_line
+  many_per_line, one_per_line, jam
 ARGMATCH_VERIFY (format_args, format_types);
@@ -1448,6 +1461,9 @@
       print_dir_name = true;
+  if (entangle_dirs)
+      print_current_files ();
   if (print_with_color)
       int j;
@@ -1559,6 +1575,7 @@
   print_block_size = false;
   indicator_style = none;
   print_inode = false;
+  yes = false;
   dereference = DEREF_UNDEFINED;
   recursive = false;
   immediate_dirs = false;
@@ -1644,7 +1661,7 @@
       int oi = -1;
       int c = getopt_long (argc, argv,
-                           "abcdfghiklmnopqrstuvw:xABCDFGHI:LNQRST:UXZ1",
+                           "abcdefghijklmnopqrstuvw:xyABCDFGHI:LNQRST:UXZ1",
                            long_options, &oi);
       if (c == -1)
@@ -1667,6 +1684,10 @@
           immediate_dirs = true;
+   case 'e':
+          entangle_dirs = true;
+     break;
         case 'f':
           /* Same as enabling -a -U and disabling -l -s.  */
           ignore_mode = IGNORE_MINIMAL;
@@ -1697,6 +1718,10 @@
           print_inode = true;
+   case 'j':
+     format = jam;
+     break;
         case 'k':
           human_output_opts = 0;
           file_output_block_size = output_block_size = 1024;
@@ -1765,6 +1790,10 @@
           format = horizontal;
+   case 'y':
+     yes = true;
+     break;
         case 'A':
           if (ignore_mode == IGNORE_DEFAULT)
             ignore_mode = IGNORE_DOT_AND_DOTDOT;
@@ -2510,7 +2539,7 @@
       DEV_INO_PUSH (dir_stat.st_dev, dir_stat.st_ino);
-  if (recursive || print_dir_name)
+  if ((recursive || print_dir_name) && ! entangle_dirs)
       if (!first)
         DIRED_PUTCHAR ('\n');
@@ -2526,7 +2555,8 @@
   /* Read the directory entries, and insert the subfiles into the `cwd_file'
      table.  */
-  clear_files ();
+  if (! entangle_dirs)
+     clear_files ();
   while (1)
@@ -2615,7 +2645,7 @@
       DIRED_PUTCHAR ('\n');
-  if (cwd_n_used)
+  if (cwd_n_used && ! entangle_dirs)
     print_current_files ();
@@ -3464,6 +3494,10 @@
       print_with_commas ();
+    case jam:
+      print_jam ();
+      break;
     case long_format:
       for (i = 0; i < cwd_n_used; i++)
@@ -4418,6 +4452,24 @@
   putchar ('\n');
+static void
+print_jam (void)
+  size_t filesno;
+  size_t pos = 0;
+  for (filesno = 0; filesno < cwd_n_used; filesno++)
+    {
+      struct fileinfo const *f = sorted_file[filesno];
+      size_t len = length_of_file_name_and_frills (f);
+      print_file_name_and_frills (f, pos);
+      pos += len;
+    }
+  putchar ('\n');
 /* Assuming cursor is at position FROM, indent up to position TO.
    Use a TAB character instead of two or more spaces whenever possible.  */
@@ -4627,11 +4679,13 @@
   -D, --dired                generate output designed for Emacs' dired mode\n\
 "), stdout);
       fputs (_("\
+  -e, --entangle             display multiple directory contents as one\n\
   -f                         do not sort, enable -aU, disable -ls --color\n\
   -F, --classify             append indicator (one of */=>@|) to entries\n\
       --file-type            likewise, except do not append `*'\n\
       --format=WORD          across -x, commas -m, horizontal -x, long -l,\n\
                                single-column -1, verbose -l, vertical -C\n\
+                               jam -j\n\
       --full-time            like -l --time-style=full-iso\n\
 "), stdout);
       fputs (_("\
@@ -4667,6 +4721,8 @@
   -i, --inode                print the index number of each file\n\
   -I, --ignore=PATTERN       do not list implied entries matching shell PATTERN\
+  -j                         jam output together, makes the most of limited\n\
+                             space on modern systems (cell phones, twitter)\n\
   -k                         like --block-size=1K\n\
 "), stdout);
       fputs (_("\
@@ -4733,6 +4789,7 @@
   -w, --width=COLS           assume screen width instead of current value\n\
   -x                         list entries by lines instead of by columns\n\
   -X                         sort alphabetically by entry extension\n\
+  -y                         answer all questions with \"yes\"\n\
   -Z, --context              print any SELinux security context of each file\n\
   -1                         list one file per line\n\
 "), stdout);

It remains to be seen if multi-option enabled coreutils will be accepted into Debian in time for the next release. Due to some disagreements with the coreutils maintainer, the matter has been referred to the Technical Committee (Flattr me)

Traditionally new ls contributors stop once enough options have been added that they can spell their name, in the best traditions of yellow snow. Once ls -richard -stallman worked, I'm sure RMS moved on other other more pressing concerns. The current maintainer, David MacKenzie, was clearly not done yet, since only ls -david -mack worked. But he was being slow to add these last few features, and ls was very deficient in the realm of spelling my name (ls -o -hss .. srsly?), so I took matter into my own hands in the best tradition of free software.

Syndicated 2012-04-01 16:51:11 from see shy jo

podcasts that don't suck

My public radio station is engaged in a most obnoxious spring pledge drive. Good time to listen to podcasts. Here are the ones I'm currently liking.

  • Free As In Freedom: The best informed podcast on software licensing issues, and highly idealistic. What keeps me coming back, though is that Karen and Bradley never quite agree on things, and always end up in some lawyerly minutia culdesac that is somehow interesting to listen to. They once did a whole show about a particular IRS tax form, and I listened to it all. (Granted, I often listen to this while cleaning house, but as Bradley would say, at least I'm not listening to it while driving.)

  • This Developer's Life: At least the early episodes before it got popular are a unashamed imitation of This American Life, and I have quite enjoyed them. Although I often roll my eyes at the proprietary developer mindsets on display in the show. For example, often they'll have a bug and not root cause it, because well, they don't have the source code for the Windows layers. Still, beneath that it's mostly about the parts of software development that are common to all our lives. A particular episode I can recommend is #10 "Disconnecting" -- the first 20 minutes is a perfect story.

  • Off the Hook: This is actually a live radio show, quite well done, with call-ins and everything. So much more polished than your typical podcast. It's hosted by Emmanuel Goldstein! And it's been going on for over 20 years, so why did I never hear about it before? Probably I'm not quite in the right hacker circles. Since it's out of NYC and very anti-authoritarian, I've mostly been enjoying it as a view into the Occupy protests.

  • StarShipSofa: The best science fiction podcast around. Probably not news to anyone who ever looked for such a podcast. Long, and tends to be frontloaded with a lot of administrivia, which I fast-forward to get to the stories.

  • Spider on the Web: The best music and science fiction podcast around. Mostly on hiatus since Jeanne died, but I hope Spider picks it back up. A good examplar is "Bianca's Hands"

  • Long Now Seminars: Consistently interesting. I visited their space last time I was in SF only to learn they'd had a talk the night before, which would have been a bummer, except they ran the bits of the Clock for us.

  • Linux Outlaws: After 18 years using Linux, I find the level of discourse in most Linux podcasts typically rather annoying. Including this one, but when Fab gets on a rant, it's all worth it. Sometimes some interesting guests.

  • This Week In Debian: Sadly no new episodes lately, and I've been too lame to respond to repeated interview requests. Probably it needs to move away from being an interview show if it is to continue; there are only so many DD's who can give excellent interviews like liw did.

Syndicated 2012-03-30 21:28:22 from see shy jo

485 older entries...

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!