I am trying to parse an XML document that uses namespaces. The XML looks like this:
<DATA xmlns="http://example.com/nspace/DATA/1.0" xmlns:UP="http://example.com/nspace/UP/1.1" col_time_us="14245034321452862">
<UP:IN>...</UP:IN>
<UP:ROW>
<sampleField>...</sampleField>
</UP:ROW>
<UP:ROW>
<sampleField>...</sampleField>
</UP:ROW>
.
.
.
</DATA>
When I use the following code to parse the XML:
tree=ET.parse(fileToParse);
root=tree.getRoot();
namespaces = {'UP':'http://example.com/nspace/DATA/1.0'}
for data in root.findAll('UP:ROW',namespaces):
    hour+=1
I get the following error:
AttributeError: 'Element' object has no attribute 'findAll'
When I try to iterate through the children of root and print the tags, I get {http://example.com/nspace/DATA/1.0}ROW as the tags instead of just ROW.
I want to find the ROW elements and extract the value for the sampleField tags. Could anybody please guide me as to what I could be doing wrong?
ElementTree Element objects indeed have no findAll() method. The correct method to use is Element.findall(), all lowercase.
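As a quick check of the method names, here is a minimal sketch using a made-up two-child document (the `root` and `child` tag names are placeholders, not taken from your file). It also suggests that the `tree.getRoot()` call in your code would likely need to be `tree.getroot()`, since the standard library uses lowercase names throughout:

```python
import xml.etree.ElementTree as ET

# Placeholder document, only used to show which method names exist.
tree = ET.ElementTree(ET.fromstring('<root><child/><child/></root>'))

root = tree.getroot()             # lowercase: getroot(), not getRoot()
children = root.findall('child')  # lowercase: findall(), not findAll()

has_find_all = hasattr(root, 'findAll')  # camelCase name is not defined
has_get_root = hasattr(tree, 'getRoot')  # camelCase name is not defined

print(len(children), has_find_all, has_get_root)
```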
You are also using the wrong namespace URI for the UP prefix. The root element defines two namespaces, and you need to pick the second one:
<DATA xmlns="http://example.com/nspace/DATA/1.0"
xmlns:UP="http://example.com/nspace/UP/1.1" ...>
Note the xmlns:UP declaration, so use that URI:
>>> namespaces = {'UP': 'http://example.com/nspace/UP/1.1'}
>>> root.findall('UP:ROW', namespaces)
[<Element {http://example.com/nspace/UP/1.1}ROW at 0x102eea248>, <Element {http://example.com/nspace/UP/1.1}ROW at 0x102eead88>]
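To get from the ROW elements to the sampleField values, keep in mind that sampleField has no prefix, so it belongs to the default namespace declared with xmlns="..." on the root. ElementTree needs a prefix for that namespace as well. The sketch below assumes a cut-down copy of your document with invented field values (`first`, `second`), uses `ET.fromstring()` instead of `ET.parse(fileToParse)` so that it is self-contained, and picks `d` as an arbitrary prefix for the default namespace:

```python
import xml.etree.ElementTree as ET

# Cut-down sample modelled on the question; the field values are invented.
xml_text = """\
<DATA xmlns="http://example.com/nspace/DATA/1.0"
      xmlns:UP="http://example.com/nspace/UP/1.1"
      col_time_us="14245034321452862">
  <UP:IN>ignored</UP:IN>
  <UP:ROW>
    <sampleField>first</sampleField>
  </UP:ROW>
  <UP:ROW>
    <sampleField>second</sampleField>
  </UP:ROW>
</DATA>
"""

root = ET.fromstring(xml_text)

namespaces = {
    'UP': 'http://example.com/nspace/UP/1.1',
    # 'd' is an arbitrary prefix chosen here for the default namespace.
    'd': 'http://example.com/nspace/DATA/1.0',
}

values = []
for row in root.findall('UP:ROW', namespaces):
    # sampleField is unprefixed in the XML, so it lives in the default namespace.
    for field in row.findall('d:sampleField', namespaces):
        values.append(field.text)

print(values)  # ['first', 'second']
```

The prefixes in the dictionary only have to match the ones used in your search paths, not the ones used in the file; it is the URIs that must agree with the document.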