Advogato: Blog for movement

A horrible little ElementTree gotcha

What does this print:


from lxml import etree
doc = etree.fromstring('<a><b><c/></b></a>')
newdoc = etree.ElementTree(doc.find('b'))
print newdoc.xpath('/b/c')[0].xpath('/a')

The answer is: [<Element a at 817548c>]. The first point to note is that xpath() against an element is only relative to that element: any absolute XPaths enumerate from the top of the containing tree. The second point is that the shallow copying of etree means that _Element::xpath, unlike _ElementTree::xpath, evaluates absolute paths from the top of the original underlying tree! So even though there's no <a> in newdoc, an absolute XPath on a child element can still reach it.
Yuck.

Syndicated 2009-10-20 15:42:00 (Updated 2009-10-20 15:50:34) from John Levon

20 Oct 2009 movement » (Master)