I retrieve an XML document this way:
import urllib2
import xml.etree.ElementTree as ET
root = ET.parse(urllib2.urlopen(url))
for child in root.findall("item"):
    a1 = child[0].text # ok
    a2 = child[1].text # ok
    a3 = child[2].text # ok
    a4 = child[3].text # BOOM
    # ...
The XML looks like this:
<item>
    <a1>value1</a1>
    <a2>value2</a2>
    <a3>value3</a3>
    <a4>
        <a11>value222</a11>
        <a22>value22</a22>
    </a4>
</item>
How do I check whether a4 (in this particular case, though it could be any other element) has children?
You could try the list function on the element:
>>> import xml.etree.ElementTree as ET
>>> xml = """<item>
...     <a1>value1</a1>
...     <a2>value2</a2>
...     <a3>value3</a3>
...     <a4>
...         <a11>value222</a11>
...         <a22>value22</a22>
...     </a4>
... </item>"""
>>> root = ET.fromstring(xml)
>>> list(root[0])
[]
>>> list(root[3])
[<Element 'a11' at 0x2321e10>, <Element 'a22' at 0x2321e48>]
>>> len(list(root[3]))
2
>>> print "has children" if len(list(root[3])) else "no child"
has children
>>> print "has children" if len(list(root[2])) else "no child"
no child
>>> # Or, more simply, len works directly on the element, without calling list first:
>>> print "has children" if len(root[3]) else "no child"
has children
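The same check can be wrapped in a small helper. What follows is a minimal sketch written for Python 3 (so print is a function, unlike in the session above). The name has_children is hypothetical and not part of ElementTree. It relies on len(element), which counts only direct subelements, rather than on the element's truth value.

```python
import xml.etree.ElementTree as ET

XML = """<item>
    <a1>value1</a1>
    <a2>value2</a2>
    <a3>value3</a3>
    <a4>
        <a11>value222</a11>
        <a22>value22</a22>
    </a4>
</item>"""


def has_children(element):
    """Return True if the element has at least one direct subelement."""
    # has_children is an illustrative name, not an ElementTree API.
    # len() on an Element counts its direct subelements only.
    return len(element) > 0


root = ET.fromstring(XML)
# Map each child tag of <item> to whether that child has subelements.
report = {child.tag: has_children(child) for child in root}
print(report)
# → {'a1': False, 'a2': False, 'a3': False, 'a4': True}
```

In the loop from the question, the helper could guard the access to .text, so that elements such as a4 are handled separately from plain text elements.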
I modified your sample because calling findall("item") on the item root did not work: findall searches the direct children of the element it is called on, not that element itself. If you want to access the text of the subchildren afterward in your working program, you could do:
for child in root.findall("item"):
    # if there are children, get their text content as well.
    if len(child):
        for subchild in child:
            subchild.text
    # else just get the current child text.
    else:
        child.text
This would be a good fit for a recursive function, though.
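One way such a recursive function might look is sketched below, again for Python 3. The name collect_text and the items wrapper element are assumptions made for illustration. The wrapper is there only so that findall("item") has direct children to match.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <items> wraps the <item> from the question so that
# findall("item") has direct children to match.
XML = """<items>
    <item>
        <a1>value1</a1>
        <a2>value2</a2>
        <a3>value3</a3>
        <a4>
            <a11>value222</a11>
            <a22>value22</a22>
        </a4>
    </item>
</items>"""


def collect_text(element):
    """Return the text of every leaf at or below element, in document order."""
    # A leaf has no subelements, so its own text is the value we want.
    if len(element) == 0:
        return [element.text]
    # Otherwise descend into each child and concatenate the results.
    texts = []
    for child in element:
        texts.extend(collect_text(child))
    return texts


root = ET.fromstring(XML)
for item in root.findall("item"):
    print(collect_text(item))
# → ['value1', 'value2', 'value3', 'value222', 'value22']
```

Because the recursion stops only at elements without subelements, it handles any nesting depth, not just the two levels in the sample. A leaf with no text contributes None to the list, so callers may want to filter those out.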