13 Jan 2010 mbrubeck   » (Journeyer)

Weekend hack: outline grep

I keep almost all of my notes and to-do lists in plain text files, so I can edit and search them with Vim, grep, and other standard Unix tools. I often indent lines in these files to create a simple outline structure, and use the autoindent and foldmethod=indent options to make Vim into a simple outliner.

To get useful output when searching through these outline-structured files, I wrote a simple grep replacement. Given a text file with a Python-style indentation structure, ogrep searches the file for a regular expression. It prints matching lines, with their "parent" lines as context. For example, if input.txt looks like this:

2009-01-01
  New Year's Day!
    No work today.
    Visit with family.
2009-01-02
  Grocery store and library.
2009-01-03
  Stay home.
2009-01-04
  Back to work.
    Remember to set an alarm.

then ogrep work input.txt will produce the following output:

2009-01-01
  New Year's Day!
    No work today.
2009-01-04
  Back to work.
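For readers who don't read Haskell, the behavior described above can be sketched in a few lines of Python. This is not the ogrep source, only a minimal illustration under some assumptions of mine: indentation is measured by counting leading whitespace characters, blank lines are skipped, and each parent line is printed at most once. The names ogrep, indent_of, and SAMPLE are chosen for this sketch.

```python
import re


def indent_of(line):
    """Count leading whitespace characters (a tab counts as one; an assumption)."""
    return len(line) - len(line.lstrip())


def ogrep(pattern, lines):
    """Yield each line matching `pattern`, preceded by its not-yet-shown parents."""
    regex = re.compile(pattern)
    stack = []  # [indent, text, already_emitted] for the current chain of parents
    for raw in lines:
        text = raw.rstrip("\n")
        if not text.strip():
            continue  # assumption: blank lines carry no outline structure
        depth = indent_of(text)
        # Lines indented at least as deeply are siblings or deeper, not parents.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        matched = regex.search(text) is not None
        if matched:
            for entry in stack:
                if not entry[2]:
                    entry[2] = True
                    yield entry[1]
            yield text
        stack.append([depth, text, matched])


SAMPLE = [
    "2009-01-01",
    "  New Year's Day!",
    "    No work today.",
    "    Visit with family.",
    "2009-01-02",
    "  Grocery store and library.",
    "2009-01-03",
    "  Stay home.",
    "2009-01-04",
    "  Back to work.",
    "    Remember to set an alarm.",
]

for line in ogrep("work", SAMPLE):
    print(line)
# 2009-01-01
#   New Year's Day!
#     No work today.
# 2009-01-04
#   Back to work.
```

The stack holds only the current line's ancestors, so memory use depends on nesting depth rather than file size, and the input can be processed as a stream.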

You can download ogrep from the outline-grep repository on GitHub, or just read the literate Haskell file. The code is almost trivial (40 lines of code, plus imports and comments); I'm publishing it just in case anyone else has a use for it, and because some of my friends were curious about how I'm using Haskell. I've now written a few "real-world" Haskell programs (compleat was the first). I'm finding Haskell very well suited to such programs, though this particular one would be equally easy in a language like Perl, Python, or Ruby.

This is a one-off tool to fill a gap in my workflow; there are no configuration options or useful error messages. It would be fairly easy to extend it, though. For example, it might be handy to have an option to include children (as well as parents) of matching lines. I recently realized that ogrep often works for searching through source code too, which might generate some more unexpected use cases.
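As a rough idea of what the "include children" extension could look like, here is a hypothetical Python sketch (again, not the Haskell code, and not a feature ogrep has). It assumes the same conventions as before: leading whitespace defines depth and blank lines are ignored. The function name ogrep_with_children is made up for this example.

```python
import re


def ogrep_with_children(pattern, lines):
    """Yield matching lines with their parents and everything nested below them."""
    regex = re.compile(pattern)
    stack = []      # [indent, text, already_emitted] for the current parents
    subtree = None  # indent of the matched line whose children are being shown
    for raw in lines:
        text = raw.rstrip("\n")
        if not text.strip():
            continue  # assumption: blank lines carry no outline structure
        depth = len(text) - len(text.lstrip())
        while stack and stack[-1][0] >= depth:
            stack.pop()
        if subtree is not None and depth <= subtree:
            subtree = None  # we have left the matched line's subtree
        in_subtree = subtree is not None
        show = in_subtree or regex.search(text) is not None
        if show:
            for entry in stack:
                if not entry[2]:
                    entry[2] = True
                    yield entry[1]
            yield text
            if not in_subtree:
                subtree = depth  # start showing this line's children
        stack.append([depth, text, show])


notes = [
    "2009-01-03",
    "  Stay home.",
    "2009-01-04",
    "  Back to work.",
    "    Remember to set an alarm.",
]
for line in ogrep_with_children("work", notes):
    print(line)
# 2009-01-04
#   Back to work.
#     Remember to set an alarm.
```

Compared with the parents-only behavior, the only extra state is the indentation of the matched line whose subtree is currently being printed.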

Syndicated 2010-01-12 08:00:00 from Matt Brubeck
