Updating XML elements and attribute values using Python etree

Nick H · Feb 7, 2012 · Viewed 22.9k times

I'm trying to use Python 2.7's ElementTree library to parse an XML file, replace specific attribute values with test data, and save the result as a unique XML file.

My idea for a solution was to (1) source new data from a CSV file by reading a file to a string, (2) slice the string at certain delimiter marks, (3) append to a list, and then (4) use ElementTree to update/delete/replace the attribute with a specific value from the list.

I've looked through the ElementTree documentation and saw the clear() and remove() methods, but I don't know the syntax needed to use them properly.
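
For reference, here is a minimal sketch of how those two methods behave, next to set(), which is usually what is wanted for replacing a value. The scratch element and its tag names are made up for illustration and are not part of the file in question.

```python
from xml.etree import ElementTree as etree

root = etree.fromstring(
    '<TrdCaptRpt RptID="10000001" TransTyp="0">'
    '<RptSide Side="1" Txt1="XXXXX"><Pty ID="XXXXX" R="1"/></RptSide>'
    '</TrdCaptRpt>'
)
rpt_side = root.find('RptSide')
pty = rpt_side.find('Pty')

# set() overwrites one attribute and leaves everything else untouched.
pty.set('ID', 'ABCDE')

# A single attribute can be deleted through the attrib dictionary.
del rpt_side.attrib['Txt1']

# Hypothetical scratch element, used only to show remove() and clear().
scratch = etree.fromstring('<a x="1"><b/><c/></a>')

# remove() is called on the parent and detaches the given child element.
scratch.remove(scratch.find('b'))
remaining = [child.tag for child in scratch]

# clear() drops all attributes, text, and children of the element itself.
scratch.clear()
```

Neither clear() nor remove() is needed just to change an attribute value; set() replaces it in place.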

An example of the XML to modify is below - attributes with XXXXX are to be replaced/updated:

<TrdCaptRpt RptID="10000001" TransTyp="0">
    <RptSide Side="1" Txt1="XXXXX">
        <Pty ID="XXXXX" R="1"/>
    </RptSide>
</TrdCaptRpt>

The intended result will be, for example:

<TrdCaptRpt RptID="10000001" TransTyp="0">
    <RptSide Side="1" Txt1="12345">
        <Pty ID="ABCDE" R="1"/>
    </RptSide>
</TrdCaptRpt>

How do I use the etree functions to update the base XML with an item from the list?

Answer

jcollado · Feb 7, 2012

For this kind of work, I always recommend BeautifulSoup because it has a really easy-to-learn API:

from BeautifulSoup import BeautifulStoneSoup as Soup

xml = """
<TrdCaptRpt RptID="10000001" TransTyp="0">
    <RptSide Side="1" Txt1="XXXXX">
        <Pty ID="XXXXX" R="1"/>
    </RptSide>
</TrdCaptRpt>
"""

soup = Soup(xml)
rpt_side = soup.trdcaptrpt.rptside
rpt_side['txt1'] = 'Updated'
rpt_side.pty['id'] = 'Updated'

print soup

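Example output (note that the tag and attribute names come out lowercased, which matters because XML is case-sensitive):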

<trdcaptrpt rptid="10000001" transtyp="0">
<rptside side="1" txt1="Updated">
<pty id="Updated" r="1">
</pty></rptside>
</trdcaptrpt>

Edit: With xml.etree.ElementTree you could use the following script:

from xml.etree import ElementTree as etree

xml = """
<TrdCaptRpt RptID="10000001" TransTyp="0">
    <RptSide Side="1" Txt1="XXXXX">
        <Pty ID="XXXXX" R="1"/>
    </RptSide>
</TrdCaptRpt>
"""

root = etree.fromstring(xml)
rpt_side = root.find('RptSide')
rpt_side.set('Txt1', 'Updated')
pty = rpt_side.find('Pty')
pty.set('ID', 'Updated')
print etree.tostring(root)

Example output:

<TrdCaptRpt RptID="10000001" TransTyp="0">
    <RptSide Side="1" Txt1="Updated">
        <Pty ID="Updated" R="1" />
    </RptSide>
</TrdCaptRpt>
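
To connect this with the CSV part of the question, here is a rough sketch of one way to drive the same set() calls from CSV rows and save each result under its own file name. The column names (txt1, pty_id), the helper functions, and the report_NNN.xml naming scheme are all assumptions made for illustration; the sketch targets Python 3, and under Python 2.7 the CSV reading part would need adjusting.

```python
import csv
import io
import os
import tempfile
from xml.etree import ElementTree as etree

TEMPLATE = """<TrdCaptRpt RptID="10000001" TransTyp="0">
    <RptSide Side="1" Txt1="XXXXX">
        <Pty ID="XXXXX" R="1"/>
    </RptSide>
</TrdCaptRpt>"""

# Hypothetical CSV layout: one row per output file.
CSV_TEXT = "txt1,pty_id\n12345,ABCDE\n67890,FGHIJ\n"


def build_report(template, txt1, pty_id):
    """Parse the template and return its root with both attributes replaced."""
    root = etree.fromstring(template)
    rpt_side = root.find('RptSide')
    rpt_side.set('Txt1', txt1)
    rpt_side.find('Pty').set('ID', pty_id)
    return root


def build_reports(template, csv_text):
    """Return one updated root element per CSV row."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [build_report(template, row['txt1'], row['pty_id']) for row in rows]


def save_reports(reports, directory):
    """Write each report to its own numbered file and return the paths."""
    paths = []
    for index, root in enumerate(reports, 1):
        path = os.path.join(directory, 'report_%03d.xml' % index)
        etree.ElementTree(root).write(path)
        paths.append(path)
    return paths


# A temporary directory keeps the example from touching the working directory.
out_dir = tempfile.mkdtemp()
paths = save_reports(build_reports(TEMPLATE, CSV_TEXT), out_dir)
```

Parsing the template fresh for every row keeps the files independent of each other, and the csv module takes care of splitting on delimiters, so the string does not have to be sliced by hand.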