26 Jul 2012 ralsina   » (Master)

Driving a Nail With a Shoe I: Do-Sheet

I had proposed a talk for PyCon Argentina called "Driving 3 Nails with a Shoe". I know, the title is silly, but the idea was showing how to do things using the wrong tool, intentionally. Why? Because:

  1. It makes you think different
  2. It's fun

The bad side is, of course, that this talk's contents have to be a secret, or else the fun is spoiled for everyone. Since the review process for PyConAr talks is public, there was no way to explain what this was about.

And since that means the reviewers basically have to take my word for this being a good thing to have at a conference, which is unfair, I deleted the proposal. The good (maybe) news is that now everyone will see what those ideas I had were about. And here is nail number 1: Writing a spreadsheet using doit.

This is not my first "spreadsheet". It all started a long, long time ago with a famous recipe by Raymond Hettinger which I used again and again and again (I may even be missing some post there).

Since I have been using doit for Nikola I am impressed by the power it gives you. In short, doit lets you create tasks, and those tasks can depend on other tasks, and operate on data, and provide results for other tasks, etc.

See where this is going?

So, here's the code, with explanations:

cells is our spreadsheet. You can put anything there, just always use "cellname=formula" format, and the formula must be valid Python, ok?

from tokenize import generate_tokens

cells = ["A1=A3+A2", "A2=2", "A3=4"]
values = {}

task_calculate creates a task for each cell, called calculate:CELLNAME. The "action" to be performed by that task is evaluating the formula. But in order to do that successfully, we need to know what other cells have to be evaluated first!

This is implemented using doit's calculated dependencies by asking doit to run the task "get_dep:FORMULA" for this cell's formula.

def evaluate(name, formula):
    value = eval(formula, values)
    values[name] = value
    print "%s = %s" % (name, value)

def task_calculate():
    for cell in cells:
        name, formula = cell.split('=')
        yield {
            'name':name,
            'calc_dep': ['get_dep:%s' % formula],
            'actions': [(evaluate, (name, formula))],
            }

For example, in our test sheet, A1 depends on A3 and A2 but those depend on no other cells. To figure this out, I will use the tokenize module, and just remember what things are "names". More sophisticated approaches exist.

The task_get_dep function is a doit task that will create a task called "get_dep:CELLNAME" for every cell name in cells.

What get_dep returns is a list of doit tasks. For our A1 cell, that would be ["calculate:A2", "calculate:A3"] meaning that to calculate A1 you need to perform those tasks first.

def get_dep(formula):
    """Given a formula, return the names of the cells referenced."""
    deps = {}
    try:
        for token in generate_tokens([formula].pop):
            if token[0] == 1:  # A variable
                deps[token[1]] = None
    except IndexError:
        # It's ok
        pass
    return {
        'result_dep': ['calculate:%s' % key for key in deps.keys()]
        }

def task_get_dep():
    for cell in cells:
        name, formula = cell.split('=')
        yield {
            'name': formula,
            'actions': [(get_dep, (formula,))],
            }

And that's it. Let's see it in action. You can get your own copy here and try it out by installing doit, editing cells and then running it like this:

ralsina@perdido:~/dosheet$ doit -v2 calculate:A3
.  get_dep:4
{}
.  calculate:A3
A3 = 4
ralsina@perdido:~/dosheet$ doit -v2 calculate:A2
.  get_dep:2
{}
.  calculate:A2
A2 = 2
ralsina@perdido:~/dosheet$ doit -v2 calculate:A1
.  get_dep:A3+A2
{'A3': None, 'A2': None}
.  get_dep:4
{}
.  calculate:A3
A3 = 4
.  get_dep:2
{}
.  calculate:A2
A2 = 2
.  calculate:A1
A1 = 6

As you can see, it always does the minimum amount of effort to calculate the desired result. If you are so inclined, there are some things that could be improved, and I am leaving as exercise for the reader, for example:

  1. Use uptodate to avoid recalculating dependencies.
  2. Get rid of the global values and use doit's computed values instead.

Here is the full listing, enjoy!


Syndicated 2012-07-25 22:25:22 from Lateral Opinion

Latest blog entries     Older blog entries

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!