6 May 2011 dkg   » (Master)

Please use unambiguous tag names in your DVCS

One of the nice features of a distributed version control system (DVCS) like git is the ability to tag specific states of your project, and to cryptographically sign your tags.

Many projects use simple tag names with the version string like "0.35". This is a plea to ensure that the tags you make explicitly reference the project you're working on. For example, if you are releasing version 0.35 of project Foo, please make your tag "foo_0.35" for security and for future disambiguation.

There is more than one reason to care about unambiguous tags. I'll give two reasons below, but they come from the same fundamental observation: All git repositories are, in some sense, the same git repository; some just store different commits than others.

it's entirely possible to merge two disjoint git repositories, and to have two unassociated threads of development in the same repo. you can also merge threads of developement later if the projects converge.

Avoid tag replay attacks
Let's assume Alice works on two projects, Foo and Bar. She wraps up work on a new version of Foo, and creates and signs a simple tag "0.32". She publishes this tag to the Foo project's public git repo.

Bob is trying to attack the Bar project, which is currently at version 0.31. Bob can actually merge Alice's work on Foo into the Bar repository, including Alice's new tag.

Now looks like there is a tag for version 0.32 of project Bar, and it has been cryptographically-signed by Alice, a known active developer!

If she had named her tag "foo_0.32" (and if all Bar tags were of the form "bar_X.XX"), it would be clear that this tag did not belong to the Bar project.

Be able to merge projects and keep full history
Consider two related projects with separate development histories that later decide to merge (e.g. a library and its most important downstream application). If they merge their git repos, but both projects have a tag "0.1", then one tag must be removed to make room for the other.

If all tags were unambiguously named, the two repos could be merged cleanly without discarding or rewriting history.

I noticed this because of a general principle i try to keep in mind: when making a cryptographic signature, ensure that the thing you are signing is context-independent -- that is, that it cannot be easily misinterpreted when placed in a different context. For example, do not sign e-mails that just say "I approve" -- say specifically what you are approving of in the body of the signed mail. Otherwise, someone could re-send your "I approve" e-mail In-Reply-To a topic that you do not really approve of.

By extension, signing a simple tag is like saying "the source tree in this specific state (and with this specific history) is version 0.3". A close inspection of the state and the history by a sensitive/intelligent human skilled in the art of looking at source code can probably figure out what the project is from its contents. But it's much less ambiguous to say "the source tree in this specific state (and with this specific history) is version 0.3 of project Foo".

Once you start using unambiguous tag names, you make it safe for people to set up simple tools that do automated scans of your repository and can take action when a new signed tag appears. And you respect (and help preserve) the history of any future project that gets merged with yours.

Tags: git, security

Syndicated 2011-05-06 21:05:00 from Weblogs for dkg

Latest blog entries     Older blog entries

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!