As mentioned, I need to get the list of XML tags in a file, using the library xml.etree.ElementTree.
I am aware that there are properties and methods like ETVar.child, ETVar.getroot(), ETVar.tag, ETVar.attrib.
But to use them and get at least the names of the tags on level 2, I had to use nested for loops.
At the moment I have something like this:
    for xmlChild in xmlRootTag:
        if xmlChild.tag:
            print(xmlChild.tag)
The goal is to get a list of ALL XML tags in the file, even deeply nested ones, with duplicates eliminated.
To give a better idea, here is a possible example of the XML:
    <root>
      <firstLevel>
        <secondlevel level="2">
          <thirdlevel>
            <fourth>text</fourth>
            <fourth2>text</fourth2>
          </thirdlevel>
        </secondlevel>
      </firstLevel>
    </root>
I've done more research on the subject and found a suitable solution. Since this could be a common task, I'll answer it myself, as I believe it could help others.
What I was looking for was the ElementTree method iter().
    import xml.etree.ElementTree as ET

    # load and parse the file
    xmlTree = ET.parse('myXMLFile.xml')

    elemList = []
    for elem in xmlTree.iter():
        elemList.append(elem.tag)

    # remove duplicates by converting to a set and back to a list
    elemList = list(set(elemList))

    # print the result
    print(elemList)
xml.etree.ElementTree is a standard Python library; the sample was written in Python v3.2.3.
Duplicates are removed by converting the list to a set, which allows only unique values, and then converting back to a list.
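One side effect of going through a set is that the resulting list is in no particular order. If the order in which tags first appear matters, a possible variant is sketched below. It assumes Python 3.7 or newer (where dict keeps insertion order), which is newer than the version the sample above was written in. The helper name unique_tags and the inline SAMPLE_XML string are made up for illustration; the string is the example from the question with its closing tags made to match.

```python
import xml.etree.ElementTree as ET

# Example document from the question, with closing tags made to match
SAMPLE_XML = """
<root>
  <firstLevel>
    <secondlevel level="2">
      <thirdlevel>
        <fourth>text</fourth>
        <fourth2>text</fourth2>
      </thirdlevel>
    </secondlevel>
  </firstLevel>
</root>
"""

def unique_tags(root):
    """Return each tag name once, in the order it first appears."""
    # iter() walks the element and all its descendants in document order;
    # dict keys are unique and keep insertion order (assumes Python 3.7+)
    return list(dict.fromkeys(elem.tag for elem in root.iter()))

root = ET.fromstring(SAMPLE_XML)
print(unique_tags(root))
# ['root', 'firstLevel', 'secondlevel', 'thirdlevel', 'fourth', 'fourth2']
```

If alphabetical order is enough, sorted(set(elemList)) on the original list would also give a stable result without needing the newer Python version.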