I want to split an XML file into multiple files. My workstation is limited to Eclipse Mars with Xalan 2.7.1.
I can also use Python, but I have never used it before.
<?xml version="1.0" encoding="UTF-8"?>
<root>
  <row>
    <NAME>Doe</NAME>
    <FIRSTNAME>Jon</FIRSTNAME>
    <GENDER>M</GENDER>
  </row>
  <row>
    <NAME>Mustermann</NAME>
    <FIRSTNAME>Max</FIRSTNAME>
    <GENDER>M</GENDER>
  </row>
</root>
How can I transform it so that each output file looks like this?
<?xml version="1.0" encoding="UTF-8"?>
<root>
  <row>
    <NAME>Doe</NAME>
    <FIRSTNAME>Jon</FIRSTNAME>
    <GENDER>M</GENDER>
  </row>
</root>
I need each "row" in its own file, including the header. The data above is just an example. Most "row" entries have 16 attributes, but the number varies.
Use Python's ElementTree module.
Create a file, e.g. xmlsplitter.py, and add the code below (where file.xml is your XML file, and assuming every row has a unique NAME element).
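import xml.etree.ElementTree as ET

# Stream through the file; 'end' fires once an element is fully parsed
context = ET.iterparse('file.xml', events=('end', ))
for event, elem in context:
    if elem.tag == 'row':
        # Name the output file after the row's NAME element
        title = elem.find('NAME').text
        filename = title + ".xml"
        with open(filename, 'wb') as f:
            # The file is opened in binary mode, so write bytes
            f.write(b"<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n")
            f.write(ET.tostring(elem))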
Run it with
python xmlsplitter.py
Or if the names are not unique:
import xml.etree.ElementTree as ET

context = ET.iterparse('file.xml', events=('end', ))
index = 0
for event, elem in context:
    if elem.tag == 'row':
        # Number the output files instead of using NAME
        index += 1
        filename = str(index) + ".xml"
        with open(filename, 'wb') as f:
            # Binary mode, so the header must be bytes
            f.write(b"<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n")
            f.write(ET.tostring(elem))
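Note that the scripts above write only the row element after the header, while the desired output in the question also wraps each row in a root element. A minimal sketch of one way to get that wrapper is shown here. The function name split_rows and the out_dir parameter are made up for illustration, and letting ElementTree.write emit the header is a swapped-in technique rather than part of the original answer.

```python
import os
import xml.etree.ElementTree as ET


def split_rows(source, out_dir="."):
    """Write each <row> of `source` to its own numbered file, wrapped in <root>.

    Hypothetical helper for illustration; returns the list of paths written.
    """
    written = []
    index = 0
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "row":
            continue
        index += 1
        # Wrap the row in a fresh <root> so each file has the requested layout.
        root = ET.Element("root")
        root.append(elem)
        path = os.path.join(out_dir, "{}.xml".format(index))
        # xml_declaration=True makes ElementTree write the <?xml ...?> header.
        ET.ElementTree(root).write(path, encoding="UTF-8", xml_declaration=True)
        written.append(path)
    return written
```

Calling split_rows('file.xml') from xmlsplitter.py would write 1.xml, 2.xml and so on into the current directory. The files are numbered here so that the NAME values do not have to be unique.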