I have many rows in a database that contain XML, and I'm trying to write a Python script to count instances of a particular node attribute.
My tree looks like:
<foo>
   <bar>
      <type foobar="1"/>
      <type foobar="2"/>
   </bar>
</foo>
How can I access the attribute values "1" and "2" in the XML using Python?
I suggest ElementTree. There are other compatible implementations of the same API, such as lxml, and cElementTree in the Python standard library itself (deprecated since Python 3.3 and removed in 3.9, because xml.etree.ElementTree now uses the C accelerator automatically when it is available); but, in this context, what they chiefly add is even more speed. The ease of programming depends on the API, which ElementTree defines.
First build an Element instance root from the XML, e.g. with the XML function, or by parsing a file with something like:
import xml.etree.ElementTree as ET
root = ET.parse('thefile.xml').getroot()
Or use any of the many other ways shown in the ElementTree documentation. Then do something like:
for type_tag in root.findall('bar/type'):
    value = type_tag.get('foobar')
    print(value)
Similar code patterns, usually pretty simple, cover most other cases.
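Since the question is about XML stored in database rows and about counting, here is a minimal sketch that parses from a string instead of a file and tallies the attribute values. The names SAMPLE_XML and count_foobar_values are made up for illustration, and the database access itself is left out; the sketch assumes each row yields the XML as a string.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Sample document from the question. In practice this string would come
# from a database row (assumption: the column holds the XML as text).
SAMPLE_XML = """<foo>
   <bar>
      <type foobar="1"/>
      <type foobar="2"/>
   </bar>
</foo>"""


def count_foobar_values(xml_text):
    """Count the foobar attribute values on bar/type elements (hypothetical helper)."""
    # ET.fromstring parses a string and returns the root Element directly.
    root = ET.fromstring(xml_text)
    return Counter(
        type_tag.get('foobar')
        for type_tag in root.findall('bar/type')
        # get() returns None when the attribute is missing; skip those.
        if type_tag.get('foobar') is not None
    )


counts = count_foobar_values(SAMPLE_XML)
print(counts)
```

To total across many rows, the per-row counters can be merged with Counter.update, e.g. calling total.update(count_foobar_values(row_xml)) for each row.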