Older blog entries for movement (starting at number 278)

A horrible little ElementTree gotcha

What does this print:


from lxml import etree
doc = etree.fromstring('<a><b><c/></b></a>')
newdoc = etree.ElementTree(doc.find('b'))
print(newdoc.xpath('/b/c')[0].xpath('/a'))


The answer is: [<Element a at 817548c>]. The first point to note is that xpath() against an element is relative to that element only for relative paths: any absolute XPath is evaluated from the top of the containing tree. The second point is that, because etree only makes a shallow copy here, _Element::xpath, unlike _ElementTree::xpath, evaluates absolute paths from the top of the original underlying tree! So even though there's no <a> in newdoc, an absolute XPath on a child element can still reach it.
Yuck.
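
One possible way around this, sketched here rather than taken from the lxml documentation, is to deep-copy the element before wrapping it, so that the new tree really has its own underlying document. The variable names detached and c are just for illustration.

```python
import copy

from lxml import etree

doc = etree.fromstring('<a><b><c/></b></a>')

# Deep-copying <b> gives it a document of its own, so there is nothing
# above it for an absolute path to reach.
detached = etree.ElementTree(copy.deepcopy(doc.find('b')))
c = detached.xpath('/b/c')[0]

print(c.xpath('/a'))                    # the copied tree has no <a>
print([e.tag for e in c.xpath('/b')])   # the copied <b> is now the root
```

The price is a full copy of the subtree, and changes made to the copy no longer show up in the original document.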

Syndicated 2009-10-20 15:42:00 (Updated 2009-10-20 15:50:34) from John Levon

YouTube annoyance

How much time would it really take to order multi-part videos, so the suggestion at the end of the video is the next part? Please!

Syndicated 2009-10-19 16:29:00 (Updated 2009-10-19 16:29:51) from John Levon

An annoying Python gotcha

Imagine you have this in mod.py:


import foo

class bar(object):
    ...

    def __del__(self):
        foo.cleanup(self.myhandle)

Seems fine, right? In fact, there's a nasty bug here. If I try to use this module in client.py like so:

import mod
mybar = mod.bar()


Then you're likely to get an exception when the program exits. This is because Python, for some bizarre reason, Nones out the globals in mod.py when taking down the interpreter. The actual __del__ method can be called sometime after this, and it ends up trying None.cleanup(), with the resultant AttributeError. It seems extremely bizarre that it happens in this order, but it does (a real example).
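
One way to sidestep the problem, as a minimal sketch assuming Python 3.4 or later, is to hand the cleanup to weakref.finalize instead of __del__. The finalizer keeps its own references to the callback and its arguments, so nothing is looked up through module globals at call time, and by default any finalizer still pending is run when the interpreter exits. Here cleanup() is a hypothetical stand-in for foo.cleanup(), and the log argument exists only so the effect can be seen.

```python
import weakref


def cleanup(handle, log):
    """Hypothetical stand-in for foo.cleanup(): record the released handle."""
    log.append(handle)


class Bar(object):
    def __init__(self, handle, log):
        self.myhandle = handle
        # The finalizer stores cleanup, handle and log itself. The arguments
        # must not refer back to self, or the object would never be collected.
        self._finalizer = weakref.finalize(self, cleanup, handle, log)

    def close(self):
        # Explicit release; a finalizer runs at most once, so calling
        # close() again is harmless.
        self._finalizer()


released = []
mybar = Bar(42, released)
mybar.close()
print(released)  # prints [42]
```

If the object is never closed explicitly, the same callback fires when it is garbage collected or, failing that, at exit.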

Syndicated 2009-10-10 16:05:00 (Updated 2009-10-10 16:12:43) from John Levon

Kernel solipsism

Thomas Gleixner:


Exactly that's the point. Adding dom0 makes life easier for a group of users who decided to use Xen some time ago, but what Ingo wants is technical improvement of the kernel... The kernel policy always was and still is to accept only those features which have a technical benefit to the code base.


It boggles the mind that someone could get things so backwards. The kernel exists to provide services to the outside world, not the other way around. By all means criticise the details of the Xen dom0 code, but this argument makes zero sense. How precisely did x86_64 support provide a technical benefit to the code base?

Syndicated 2009-06-04 12:11:00 (Updated 2009-06-04 12:18:44) from John Levon

BNP

Charlie Brooker on the BNP party political broadcast:

Nick Griffin's first line is "Don't turn it off!", which in terms of opening gambits is about as enticing as hearing someone shout "Try not to be sick!" immediately prior to intercourse.

Syndicated 2009-05-18 12:24:00 (Updated 2009-05-18 12:25:35) from John Levon

26 Mar 2009 (updated 26 Mar 2009 at 04:08 UTC)

Outputting XML in standard Python

Is it really this ugly? I expected something like this:


doc = xmldoc()
doc.start('foo', { 'id': 'blah' })
doc.start('sub')
doc.text('subtext')
doc.close('sub')
doc.close('foo')
print(doc)


and I thought I had it in SimpleXMLWriter. However, I have to jump through hoops to get it to output to a string, and it doesn't have any pretty-printing. I tried using ElementTree, but that also doesn't pretty-print! libxml2 is horribly low-level. lxml seems to do pretty-printing, but it's still just as ugly as the best option I've found so far, xml.dom.minidom:


from xml.dom.minidom import Document
doc = Document()
foo = doc.createElement('foo')
foo.setAttribute('id', 'blah')
doc.appendChild(foo)
sub = doc.createElement('sub')
sub.appendChild(doc.createTextNode('subtext'))
foo.appendChild(sub)


Yuck! If I'm building up a document, I almost always want to append directly at the last point: why do I have to keep track of all these elements by hand? I presume I'm missing some small standard helper module, but #python didn't know about it. Anyone?
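
For what it's worth, the interface sketched above is not much code on top of minidom. This is only a rough sketch of such a helper, not an existing module: the class name XmlDoc and its methods are made up to match the wished-for API, and a stack keeps track of the current insertion point.

```python
from xml.dom.minidom import Document


class XmlDoc(object):
    """Hypothetical helper that always appends at the innermost open element."""

    def __init__(self):
        self._doc = Document()
        self._stack = [self._doc]  # the last entry is the current parent

    def start(self, name, attrs=None):
        elem = self._doc.createElement(name)
        for key, value in (attrs or {}).items():
            elem.setAttribute(key, value)
        self._stack[-1].appendChild(elem)
        self._stack.append(elem)

    def text(self, data):
        self._stack[-1].appendChild(self._doc.createTextNode(data))

    def close(self, name):
        # Refuse to close anything but the innermost open element.
        if len(self._stack) == 1 or self._stack[-1].nodeName != name:
            raise ValueError('close(%r) does not match the open element' % name)
        self._stack.pop()

    def __str__(self):
        return self._doc.toprettyxml(indent='  ')


doc = XmlDoc()
doc.start('foo', {'id': 'blah'})
doc.start('sub')
doc.text('subtext')
doc.close('sub')
doc.close('foo')
print(doc)
```

Passing the name to close() is redundant, but it catches mismatched nesting early, which is the kind of bookkeeping the helper is supposed to take off my hands.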

Syndicated 2009-03-26 02:43:00 (Updated 2009-03-26 03:20:31) from John Levon

Scoble sets a new record

I really hate the word “friend.” It has no meaning anymore. No one can define what a friend is. Believe me, I’ve asked dozens of people to define it for me. My wife is my most “true” friend, for instance but if you trust her with picking a great wine (she doesn’t drink much) or picking a great sushi restaurant (she hates the stuff) you’ll be very disappointed. You’d be better off asking @garyvee about the wine even though you’ve never met him and he probably wouldn’t be listed among your “true” friends.

- Scoble

Might I gently suggest friendship isn't about wine recommendations?

Syndicated 2009-03-22 22:55:00 (Updated 2009-03-22 22:58:17) from John Levon

16 Mar 2009 (updated 16 Mar 2009 at 22:10 UTC)

Sheesh

Apparently applications should be prepared to lose 60 minutes of data to work around the file system now.

Of course the notion that applications should have explicit load/save operations is a nonsense already. Now we should "fix" one of the few places that never had this (ever seen a browser where you have to save your bookmarks explicitly when you quit?) to expose this implementation detail in a data-losing way again.

Syndicated 2009-03-16 20:31:00 (Updated 2009-03-16 21:26:42) from John Levon

Amazon

It's a shame that it's basically impossible to compete with Amazon when it comes to online book selling, because their website is so horribly awful to use. Not fair.

Syndicated 2009-03-16 01:25:00 (Updated 2009-03-16 01:27:26) from John Levon

It's not just atol(), Nicholas

Nicholas Nethercote warns us against atol(). Sadly, he recommends using strtol() instead. This interface is almost as bad. If atol() is impossible to get right, strtol() has to be classified under "the obvious use is wrong".

As a perfect example of how horrible strtol() is, let's look at his example code:


int i1 = strtol(s, &endptr, 0); if (*endptr != ',') goto bad;
int i2 = strtol(endptr+1, &endptr, 0); if (*endptr != ',') goto bad;
int i3 = strtol(endptr+1, &endptr, 0); if (*endptr != '\0') goto bad;
...
bad: /* error case */

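Can you spot the bug? What about an input like ",2,3"? When strtol() finds no digits to convert, it returns 0 and leaves endptr pointing at the start of the string, so the first check sees the comma and passes, and i1 silently becomes 0. Nicholas does mention that this code is broken for underflow or overflow (you must wrap every single call like this: "errno = 0; strtol(...); if (errno...)") but either missed this or considered it irrelevant. It's just too hard to get right.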

Just use the *scanf() family (yes, that's hard to use too). Be suspicious of any code using either strtol() or atol().

Syndicated 2009-03-14 12:03:00 (Updated 2009-03-14 12:16:29) from John Levon
