Python version 2.7: XML ElementTree: How to iterate through certain elements of a child element in order to find a match

Sarah-Ann · Mar 26, 2013

I'm a programming novice and only rarely use Python, so please bear with me as I try to explain what I am trying to do :)

I have the following XML:

<?xml version = "1.0" encoding = "utf-8"?>
<Patients>
    <Patient>
               <PatientCharacteristics>
                   <patientCode>3</patientCode>
               </PatientCharacteristics>
               <Visits>
                   <Visit>
                          <DAS>
                               <CRP>14</CRP>
                               <ESR/>
                               <Joints>
                                       <DAS_PROFILE>28/28</DAS_PROFILE>
                                       <SWOL28>20</SWOL28>
                                       <TEN28>20</TEN28>
                               </Joints>
                          </DAS>
                          <VisitDate>2010-02-17</VisitDate>
                   </Visit>
                   <Visit>
                          <DAS>
                               <CRP>10</CRP>
                               <ESR/>
                               <Joints>
                                       <DAS_PROFILE>28/28</DAS_PROFILE>
                                       <SWOL28>15</SWOL28>
                                       <TEN28>20</TEN28>
                               </Joints>
                          </DAS>
                          <VisitDate>2010-02-10</VisitDate>
                   </Visit>
               </Visits>
    </Patient>
    <Patient>
        <PatientCharacteristics>
                   <patientCode>3</patientCode>
        </PatientCharacteristics>
               <Visits>
                   <Visit>
                          <DAS>
                               <CRP>14</CRP>
                               <ESR/>
                               <Joints>
                                       <DAS_PROFILE>28/28</DAS_PROFILE>
                                       <SWOL28>34</SWOL28>
                                       <TEN28>0</TEN28>
                               </Joints>
                          </DAS>
                          <VisitDate>2010-08-17</VisitDate>
                   </Visit>
                   <Visit>
                          <DAS>
                               <CRP>10</CRP>
                               <ESR/>
                               <Joints>
                                       <DAS_PROFILE>28/28</DAS_PROFILE>
                                       <SWOL28></SWOL28>
                                       <TEN28>2</TEN28>
                               </Joints>
                          </DAS>
                          <VisitDate>2010-07-10</VisitDate>
                   </Visit>
                   <Visit>
                          <DAS>
                               <CRP>9</CRP>
                               <ESR/>
                               <Joints>
                                       <DAS_PROFILE>28/28</DAS_PROFILE>
                                       <SWOL28>56</SWOL28>
                                       <TEN28>6</TEN28>
                               </Joints>
                          </DAS>
                          <VisitDate>2009-07-10</VisitDate>
                   </Visit>
               </Visits>

    </Patient>
</Patients>

All I want to do here is update certain 'SWOL28' values if they match the patientCode and VisitDate that I have stored in a text file. As I understand it, ElementTree does not keep a reference from an element to its parent; if it did, I could just use findall() from the root and work backwards from there. As it stands, here is my pseudocode:

  1. For each line in the text file:
  2.     Put Visit_Date Patient_Code New_SWOL28 into variables
  3.     For each patient element:
  4.         If patientCode = Patient_Code
  5.             For each Visit element:
  6.                 If VisitDate = Visit_Date
  7.                     If SWOL28 element exists for this visit
  8.                         Update SWOL28 to New_SWOL28

But I am stuck at step number 5. How do I get a list of visits to iterate through? Apologies if this is a very dumb question, but I have searched high and low for an answer, I assure you! I have stripped my code down to a bare example of the part I need to fix:

import xml.etree.ElementTree as ET
tree = ET.parse('DB3.xml')
root = tree.getroot()
for child in root: # THIS GETS ME ALL THE PATIENT ATTRIBUTES
    print child.tag 
    for x in child/Visit: # THIS IS WHAT I CANNOT FIND THE CORRECT SYNTAX FOR
        # I WOULD THEN PERFORM STEPS 6, 7 AND 8 HERE

I would be deeply appreciative of any ideas any of you may have on this. I am not a programming natural, that's for sure!

Thanks in advance, Sarah

Edit 1:

On the advice of svk below, I tried the following:

import xml.etree.ElementTree as ET
tree = ET.parse('Untitled.xml')
root = tree.getroot()
for child in root:
    print child.tag 
    child.find( "visits" )
    for x in child.iter("visit"):
        print x.tag, x.text

But the only output I get is "Patient", printed twice, with none of the lower-level tags. Any ideas?

Answer

svk · Mar 26, 2013

You can iterate over all the "Visit" elements anywhere beneath an element "element" (iter() searches the whole subtree, not only direct children) like this:

for x in element.iter("Visit"):

You can find the first direct child of element matching a certain tag with:

element.find("Visits")
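
The sketch below runs these calls against a cut-down copy of the XML from the question, including a lowercase lookup to show what happens when the case of the tag name does not match. The inline XML string and the variable names are made up for illustration; only the ElementTree calls themselves come from the standard library.

```python
import xml.etree.ElementTree as ET

# Cut-down, hypothetical copy of the question's XML (one patient, two visits).
XML = """<Patients>
    <Patient>
        <Visits>
            <Visit><VisitDate>2010-02-17</VisitDate></Visit>
            <Visit><VisitDate>2010-02-10</VisitDate></Visit>
        </Visits>
    </Patient>
</Patients>"""

root = ET.fromstring(XML)
patient = root.find("Patient")

# Tag names are case-sensitive: the lowercase names match nothing.
lowercase_hits = list(patient.iter("visit"))    # empty list
missing = patient.find("visits")                # None

# iter() searches all descendants, so it reaches the Visit grandchildren.
visits_via_iter = list(patient.iter("Visit"))   # two elements

# find()/findall() with a bare tag name only look at direct children.
visits = patient.find("Visits")                 # the <Visits> element
direct_visits = visits.findall("Visit")         # two elements
not_direct = patient.findall("Visit")           # empty: Visit is a grandchild

dates = [v.findtext("VisitDate") for v in direct_visits]
print(dates)
```

Because find() returns None when nothing matches, it is worth checking its result before calling anything on it.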

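Tag names are case-sensitive, so they must match the XML exactly ("Visits" and "Visit"); the lowercase names match nothing, which is why the attempt in Edit 1 of the question prints only the Patient tags. You first have to locate the "Visits" element, which is the parent of the "Visit" elements, and then iterate through its "Visit" children. Putting those together you'd have something like this:

for patient_element in root:
    print patient_element.tag
    visits_element = patient_element.find("Visits")
    if visits_element is None:
        continue  # this patient has no <Visits> element
    for visit_element in visits_element.iter("Visit"):
        # <Visit> itself holds only whitespace text, so print a child's value
        print visit_element.tag, visit_element.findtext("VisitDate")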
        # ... further processing of each visit element here
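
The two-step lookup above can also be written as a single path expression. This is only a sketch, assuming the tag names from the question's XML; the inline sample data (including the second patient with code 4 and no visits) and the variable names are invented for the example. findall() returns an empty list when nothing matches, so no None check is needed here.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the shape of the question's XML.
XML = """<Patients>
    <Patient>
        <PatientCharacteristics><patientCode>3</patientCode></PatientCharacteristics>
        <Visits>
            <Visit><VisitDate>2010-02-17</VisitDate></Visit>
            <Visit><VisitDate>2010-02-10</VisitDate></Visit>
        </Visits>
    </Patient>
    <Patient>
        <PatientCharacteristics><patientCode>4</patientCode></PatientCharacteristics>
    </Patient>
</Patients>"""

root = ET.fromstring(XML)

visit_dates = {}
for patient in root.findall("Patient"):
    code = patient.findtext("PatientCharacteristics/patientCode")
    # "Visits/Visit" means: every Visit child of a Visits child of this patient.
    visits = patient.findall("Visits/Visit")
    visit_dates[code] = [v.findtext("VisitDate") for v in visits]

print(visit_dates)
```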

In general look at the section "Finding interesting elements" in the documentation for xml.etree.ElementTree: http://docs.python.org/2/library/xml.etree.elementtree.html#finding-interesting-elements
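
For completeness, the whole task from the question (steps 1 to 8 of the pseudocode) might look roughly like the sketch below. The layout of the text file is not shown in the question, so this assumes one whitespace-separated record per line in the order Visit_Date Patient_Code New_SWOL28. The function names parse_updates and apply_updates and the inline sample data are invented for the example.

```python
import xml.etree.ElementTree as ET


def parse_updates(lines):
    """Steps 1-2: turn 'Visit_Date Patient_Code New_SWOL28' lines into tuples.

    The whitespace-separated layout is an assumption, not taken from the question.
    """
    updates = []
    for line in lines:
        fields = line.split()
        if len(fields) == 3:          # skip blank or malformed lines
            updates.append(tuple(fields))
    return updates


def apply_updates(root, updates):
    """Steps 3-8: overwrite SWOL28 where patient code and visit date match.

    Returns the number of SWOL28 elements changed.
    """
    changed = 0
    for visit_date, patient_code, new_swol28 in updates:
        for patient in root.findall("Patient"):
            code = patient.findtext("PatientCharacteristics/patientCode")
            if code != patient_code:
                continue
            for visit in patient.findall("Visits/Visit"):
                if visit.findtext("VisitDate") != visit_date:
                    continue
                swol28 = visit.find("DAS/Joints/SWOL28")
                if swol28 is not None:    # step 7: the element exists
                    swol28.text = new_swol28
                    changed += 1
    return changed


# Small made-up sample in the shape of the question's XML.
SAMPLE = """<Patients>
  <Patient>
    <PatientCharacteristics><patientCode>3</patientCode></PatientCharacteristics>
    <Visits>
      <Visit>
        <DAS><Joints><SWOL28>20</SWOL28></Joints></DAS>
        <VisitDate>2010-02-17</VisitDate>
      </Visit>
      <Visit>
        <DAS><Joints><SWOL28></SWOL28></Joints></DAS>
        <VisitDate>2010-02-10</VisitDate>
      </Visit>
    </Visits>
  </Patient>
</Patients>"""

root = ET.fromstring(SAMPLE)
updates = parse_updates(["2010-02-10 3 15", "", "2011-01-01 3 99"])
changed = apply_updates(root, updates)
new_values = [e.text for e in root.iter("SWOL28")]
print(changed)
print(new_values)
```

With a real file you would load it with ET.parse('DB3.xml'), pass tree.getroot() to apply_updates, read the update lines from the open text file, and save the result with tree.write() to an output path of your choice.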