Older blog entries for johnw (starting at number 52)

Building a better pre-commit hook for Git

Recently a friend turned me on to an interesting article about a problem I had just discovered with Git and its pre-commit hook:

Committing in git with only some changes added to the staging area still results in an “atomic” revision that may never have existed as a working copy and may not work.

As an example of this, I often find myself doing a whole flurry of changes all at once. This is no problem with Git, because I have the wonderful tool magit.el to help sift out the many commits implied by those changes. So I turn one big set of changes into many smaller commits, leaving only pending work in my working tree. Then I push.

My users pull those commits, only to find that lo! and behold, they will not build. “What?”, I think to myself, “how can that be? I just ran the unit tests and everything was fine.” However, I never ran the unit tests against that particular commit, because the commits I just pushed never existed as independent working trees on my system. In fact, they never existed at all; they were mere figments within the Git index, which Git happily made into immutable commits for me.

What makes this all worse is that the pre-commit hook was something of a lie. I thought that by adding make check to my pre-commit hook, I’d know for sure that every commit I checked in was safe and sound. However, that make check was running against my working tree, not the proposed commit. I still knew nothing about the correctness of what I just checked in, unless it happened to also represent the state of my working tree, which is a rare occurrence indeed given the Git culture’s preference for frequent, smaller commits.

The answer turned out to be a little complex. What I needed was a pre-commit hook that would test the contents of my Git index before each commit, not my working tree. And there happens to be no simple command in Git for “checking out your index”. Even if you do use git checkout-index, it resets the timestamps for every file that it creates, forcing make check to rebuild the entire app each time, not just its most recent changes. Assuming you have a Makefile system that works, such duplication of effort is wholly unnecessary.

I came up with a solution that uses a secondary source tree, to hold the checked out index, and a temporary build tree, which gets updated with any changes since the last time the pre-commit hook was run. The end result is that small changes pass or fail quickly, while large-scale changes sometimes require a full rebuild to confirm.

The script itself can be viewed in my git-script project on GitHub. If you plan to use it, you will need to tailor it for your own project, copy it to .git/hooks/pre-commit, and set its executable bit.

Syndicated 2009-02-14 03:23:00 (Updated 2009-02-14 07:36:09) from Lost in Technopolis

Simple class for walking char arrays as istreams

This has probably been written countless times before, but I found myself needing it today and it was quick to write. It lets you read characters from a char array in C++ via the istream interface:

#include <iostream>
#include "ptrstream.h"

int main() {
  ptristream in("Hello, world!\n");
  char buf[32];
  // pass the real buffer size so getline cannot write past the end
  in.getline(buf, sizeof(buf));
  std::cout << buf << std::endl;
}

Handy if you don’t want std::istringstream needlessly copying character strings.

Below is the implementation, which is in the public domain:

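#include <cassert>
#include <cstddef>
#include <cstring>
#include <istream>
#include <streambuf>

class ptristream : public std::istream
{
  class ptrinbuf : public std::streambuf
  {
  protected:
    char *      ptr;
    std::size_t len;

  public:
    // The buffer is only ever read, so accepting const data is safe;
    // the cast exists because setg() takes non-const pointers.
    ptrinbuf(const char * _ptr, std::size_t _len)
      : ptr(const_cast<char *>(_ptr)), len(_len) {
      assert(ptr);
      // a length of zero means "measure the NUL-terminated string"
      if (*ptr && len == 0)
        len = std::strlen(ptr);

      setg(ptr,               // beginning of putback area
           ptr,               // read position
           ptr+len);          // end position
    }

  protected:
    virtual int_type underflow() {
      // is read position before end of buffer?
      if (gptr() < egptr())
        return traits_type::to_int_type(*gptr());
      else
        return traits_type::eof();
    }

    virtual pos_type seekoff(off_type off, std::ios_base::seekdir way,
                             std::ios_base::openmode mode =
                             std::ios_base::in | std::ios_base::out)
    {
      // compute the target as an offset from the start of the array
      off_type target;
      switch (way) {
      case std::ios_base::cur: target = (gptr() - ptr) + off; break;
      case std::ios_base::beg: target = off;                  break;
      case std::ios_base::end: target = off_type(len) + off;  break;

      default:
        assert(false);
        return pos_type(off_type(-1));
      }
      // refuse to move the read position outside the array
      if (target < 0 || target > off_type(len))
        return pos_type(off_type(-1));

      setg(ptr, ptr+target, ptr+len);
      return pos_type(target);
    }
  };

protected:
  ptrinbuf buf;

public:
  ptristream(const char * ptr, std::size_t len = 0)
    : std::istream(0), buf(ptr, len) {
    rdbuf(&buf);
  }
};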

Syndicated 2009-02-04 05:46:12 (Updated 2009-02-04 07:53:28) from Lost in Technopolis

Ready Lisp version 20090125 now available

There is a new version of Ready Lisp for Mac OS X available. This version is based on SBCL 1.0.24, and requires OS X Leopard 10.5. The only changes in this version are upgrades of many of the dependent packages.

What is Ready Lisp? It’s a binding together of several popular Lisp packages for OS X, including: Aquamacs, SBCL and SLIME. Once downloaded, you’ll have a single application bundle which you can double-click — and find yourself in a fully configured Common Lisp REPL. It’s ideal for OS X users who want to try out Lisp with a minimum of hassle. The download is approximately 124 megabytes.

There is a GnuPG signature for this file in the same directory; append .asc to the above filename to download it. To install my public key onto your keyring, use this command:

  $ gpg --keyserver pgp.mit.edu --recv 0x824715A0

Once installed, you can verify the download using the following command:

  $ gpg --verify ReadyLisp.dmg.asc

For more information, see the Ready Lisp project page.

Syndicated 2009-01-26 03:52:26 (Updated 2009-01-26 04:12:24) from Lost in Technopolis

Unicode support on the cheap

I’d been avoiding adding full Unicode support to Ledger for some time, since both times I tried, it ended up in a veritable spaghetti of changes throughout the code, which it seemed would take forever to “prove”. One branch I started used libICU to handle Unicode strings throughout, while an earlier attempt used regular wide-string support in C++. Both were left on the cutting-room floor.

But last night an idea struck me: Ledger doesn’t care if UTF8 encoded data is passed around. If a user has Cyrillic characters in their data file, and Ledger leaves its encoding alone, then when those same bytes are printed out the user will see exactly what they input. In this case, the best approach is “hands off”: just pass the user’s data through transparently.

Where this fails is when Ledger tries to output elided columnar data, such as in the register report. The problem is, there is no way to know the length of a string without determining exactly how many code-points exist in that UTF8 string. And without knowing the length, it’s impossible to get columns to line up, or to know exactly where a string can be cut in two without breaking a multibyte UTF8 character apart.

Anyway, I discovered a cheap solution today which did the job: Convert strings from UTF8 to UTF32 only when individual character lengths matter, and convert them back after that work is done. This took about one hour to implement, but now Ledger is able to justify columns correctly, even when other alphabets are used! It still doesn’t work for right-to-left alphabets, though.
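To make the idea concrete, here is a minimal sketch of that round trip. It is not Ledger's actual code: the function names (utf8_to_utf32, utf32_to_utf8, fit_column) are made up for illustration, it assumes a C++11 compiler and well-formed UTF8 input, and it counts one column per code point, so combining marks and double-width characters would still misalign.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Decode UTF-8 into one 32-bit code point per element.
// Assumption: the input is well-formed UTF-8; nothing is validated.
std::u32string utf8_to_utf32(const std::string& in) {
  std::u32string out;
  std::size_t i = 0;
  while (i < in.size()) {
    unsigned char lead = static_cast<unsigned char>(in[i]);
    std::size_t extra = 0;            // continuation bytes still to read
    char32_t cp = lead;               // plain ASCII needs no masking
    if      (lead >= 0xF0) { extra = 3; cp = lead & 0x07; }
    else if (lead >= 0xE0) { extra = 2; cp = lead & 0x0F; }
    else if (lead >= 0xC0) { extra = 1; cp = lead & 0x1F; }
    ++i;
    for (; extra > 0 && i < in.size(); --extra, ++i)
      cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
    out.push_back(cp);
  }
  return out;
}

// Encode code points back into UTF-8.
std::string utf32_to_utf8(const std::u32string& in) {
  std::string out;
  for (char32_t cp : in) {
    assert(cp <= 0x10FFFF);           // largest valid Unicode code point
    if (cp < 0x80) {
      out += static_cast<char>(cp);
    } else if (cp < 0x800) {
      out += static_cast<char>(0xC0 | (cp >> 6));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
      out += static_cast<char>(0xE0 | (cp >> 12));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
      out += static_cast<char>(0xF0 | (cp >> 18));
      out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
  }
  return out;
}

// Fit a UTF-8 string into exactly `width` columns: long strings are cut
// on a code-point boundary, short ones are padded with spaces.
std::string fit_column(const std::string& text, std::size_t width) {
  std::u32string wide = utf8_to_utf32(text);
  if (wide.size() > width)
    wide.resize(width);
  else
    wide.append(width - wide.size(), U' ');
  return utf32_to_utf8(wide);
}
```

The rest of the program keeps passing UTF8 bytes around untouched; only the code that lays out a column calls something like fit_column, which is why the change stays small.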

Syndicated 2009-01-23 23:57:54 (Updated 2009-01-23 23:58:10) from Lost in Technopolis

The feature I avoided for half a year

The other day I finally implemented a feature in Ledger which I’d avoided doing for a full half-year. The reason? Every time I thought about it, my brain kept shutting down. It seems my brain doesn’t care for math much, or for mathy problems, so it always seemed as if something better needed doing…

The problem turned out to be a fairly straightforward one; it just required sitting down and mapping it out for a couple of hours before the coding began. Here’s the synopsis:

You have a network of N nodes, each of which can be connected to N-1 other nodes. There can be multiple connections between any two nodes, where each connection has a date — but no two connections between the same nodes can have the same date.

Given a start node, a query date, and a set of target nodes (which may be zero, one or many), find the shortest and youngest path that is not newer than the query date, from the start node to each of the target nodes.

Ledger uses this algorithm to record price conversions between commodities, and to later render each commodity into a market value relative to another known commodity. Sometimes such renderings are not possible, or sometimes they require multiple conversion steps before a value can be found.

For example, if I bought 10 shares of AAPL for $30.00, and later exchanged $10.00 for 9.83 CAD, and at one point exchanged 80 EUR for 100 CAD, then how many EUR are my shares of AAPL worth?

Previously Ledger could only render AAPL in terms of dollars, but now it can finally report any commodity in terms of any other, provided there exists a path of traversal between the two nodes whose connections are all dated on or before the query date.
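As a rough illustration of the search (not the code that went into Ledger), the sketch below runs a breadth-first search over commodities, so the first path found uses the fewest conversions, and on each hop it takes the youngest price dated on or before the query date. The types and function names (Date, PriceGraph, add_price, convert) are invented for this example, dates are plain integers, a C++11 compiler is assumed, and ties between equally short paths are not resolved by age.

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <string>

// Hypothetical types for illustration only; these are not Ledger's.
typedef int Date;                              // e.g. days since some epoch
typedef std::map<Date, double> History;        // date -> units of `to` per unit of `from`
typedef std::map<std::string, std::map<std::string, History> > PriceGraph;

// Record that on `date`, `from_qty` of `from` was exchanged for `to_qty`
// of `to`. The connection is stored in both directions.
void add_price(PriceGraph& g, Date date,
               double from_qty, const std::string& from,
               double to_qty, const std::string& to) {
  assert(from_qty > 0 && to_qty > 0);
  g[from][to][date] = to_qty / from_qty;
  g[to][from][date] = from_qty / to_qty;
}

// Returns how many units of `target` one unit of `start` is worth as of
// `query`, or -1.0 if no usable path exists.
double convert(const PriceGraph& g, const std::string& start,
               const std::string& target, Date query) {
  std::map<std::string, double> rate;          // visited node -> rate from start
  std::queue<std::string> pending;
  rate[start] = 1.0;
  pending.push(start);

  while (!pending.empty()) {
    std::string node = pending.front();
    pending.pop();
    if (node == target)
      return rate[node];

    auto edges = g.find(node);
    if (edges == g.end())
      continue;                                // node has no connections
    for (const auto& e : edges->second) {
      if (rate.count(e.first))
        continue;                              // already reached by a path at least as short
      // youngest connection dated on or before the query date
      auto p = e.second.upper_bound(query);
      if (p == e.second.begin())
        continue;                              // every connection is newer than the query
      --p;
      rate[e.first] = rate[node] * p->second;
      pending.push(e.first);
    }
  }
  return -1.0;
}
```

Feeding in the example above with made-up dates 1, 2 and 3 for the three exchanges, convert(g, "AAPL", "EUR", 3) walks AAPL to $ to CAD to EUR and gives about 2.36 EUR per share, so the ten shares come to roughly 23.59 EUR. Asking the same question as of date 2 returns -1.0, because the only EUR connection is newer than the query.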

Syndicated 2009-01-22 00:15:19 (Updated 2009-01-22 00:15:33) from Lost in Technopolis

Moving to Movable Type

The blog has now fully moved over to Movable Type, including all past articles and their comments. It took a bit of Perl, Python and mucking with SQL, but now the transfer is complete.

The reason for the move is that the app I was using, RapidWeaver, was beginning to introduce a bit too much inertia to the blogging process. And one thing I know about myself: if something isn’t dead simple, even after months of being away from it, I’ll avoid it forever.

I write these blog posts using ecto now, which couldn’t be easier. There’s no separate publishing step; it’s like writing and sending an e-mail.

I actually liked the way WordPress looks a bit more, but Movable Type supports PostgreSQL, which is what every other service on this server uses. And for some reason MT’s XML-RPC script doesn’t work with FastCGI and Apache, which is something I guess I can live with.

Syndicated 2009-01-21 00:22:43 (Updated 2009-01-21 00:22:50) from Lost in Technopolis

A day for nostalgia

After tracking it down on a public domain mirror, and installing an emulator on my MacBook Pro, I was able to download and run the first full computer program I ever wrote: “Sector Inspector” for the Apple //e.

I wrote this program in 1989, and it took eleven months (seven to code, four to debug). At the time, it was one of the more complete disk editing utilities I’d seen.

It was released as Shareware (for $20), and I made a total of $60 over the course of eight years. This is the experience that turned me to freeware, actually; because I realized that coding for possible, yet unrealized profit was an unlikely aim. It’s better to know that little will come of it ahead of time, which makes it all about the coding.

Sector Inspector was written using the Merlin Prodos Assembler. It took thirteen minutes to assemble on my Apple //e, four using a friend’s hardware accelerator card. In those days I owned a 1 MB expansion card, and would do all of my development there (for the sake of speed), frequently saving to 5.25” floppies.

When finished, Sector Inspector printed out to 255 pages of assembly code, which was registered with the US copyright office. I tried selling it to three different software companies at the time, but only one responded positively: the authors of Merlin, who said they couldn’t publish another title, but offered me a job instead. I didn’t take the job (I don’t know why), and instead released the program as Shareware.

I remember having dreams that SI would make around $10k, and with that money I would buy a color Macintosh IIfx, all the rage at the time. Those dreams never materialized, of course, and shortly afterward the //e was cancelled, Prodos 8 was cancelled, and I started working on UNIX machines. The next year I worked for Network Solutions and used that money to buy a NeXT workstation, and said goodbye to the Apple world for a very long time (until just a few months ago, when I bought a PowerBook G4).

Here is a screenshot of the splash screen on startup:

screenshot of the startup screen

And a screenshot of the main window:

screenshot of the main window

You can download an Apple //e disk image of Sector Inspector, and play around with it, or read the FEATURES document I wrote as a young 17-year-old programmer.

Syndicated 2009-01-20 03:36:57 (Updated 2009-01-20 07:39:30) from Lost in Technopolis

Linux DHCP and Windows DNS

I feel a need to blog about this today because it took several days to figure out, but the solution was trivial.

The scenario: my company has a Windows 2003 Domain Controller running DHCP, DNS and Active Directory services. We use an Untangle box as our gateway to the Internet. All of this works just great for Windows machines on the network, where everyone can use names like “host” to refer to each other’s machines.

However, the Linux boxes until now have been second-class citizens. They are able to get an IP address via DHCP, but Windows knows nothing about their hostnames. Nearly all of our Linux boxen run CentOS 5, which is to say, Red Hat Enterprise Linux (RHEL) 5.

The solution, it turns out, is two-fold, with one part on the Windows side (steps 1 and 2) and one on the Linux side (step 3):

  1. On the Windows Domain Controller, go to the admin page for “Active Directory Users and Computers”. Under “Users” for your domain, create a new user named something like “dhcp4dns”. Pick a random password.

  2. On the same machine, go to the admin tool for DHCP and right-click on your domain and select Properties. Click on the DNS tab and check everything, while also selecting “Always dynamically update DNS A and PTR records”. Then click on the Advanced tab and its Credentials… button. Here enter the details for the user you created in step 1.

  3. This step is for CentOS: For every Linux box, edit the file /etc/sysconfig/networking/devices/ifcfg-eth0 (or whichever interface faces your local network).

Add the following line to that file, replacing hostname with your unqualified hostname:

  DHCP_HOSTNAME=hostname

Now just reset networking on the Linux box:

  # ifdown eth0 ; ifup eth0

Voila, your Windows server should now see the Linux box’s name just like everyone else on the network.

Syndicated 2009-01-16 19:42:45 (Updated 2009-01-18 01:27:13) from Lost in Technopolis

Ledger 2.6.1 released

Ledger 2.6.1 is released today. This is a bug fix release only, which fixes some blocking issues relating to the -p and -e options. It is a recommended upgrade for all Ledger users. It may be downloaded here.

Work now turns fully to the upcoming 3.0 release, which represents a substantial code cleanup and rationalization of the user interface and several internal facilities. It will also focus on Python integration and better value expression handling.

Syndicated 2008-09-17 11:31:26 (Updated 2009-01-17 20:58:35) from Lost in Technopolis
