XML to CSV Python

Nipun khanna · Apr 18, 2018 · Viewed 9.7k times

The XML data (file.xml) looks like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Activity_Logs xsi:schemaLocation="http://www.cisco.com/PowerKEYDVB/Auditing 
DailyActivityLog.xsd" To="2018-04-01" From="2018-04-01" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="http://www.cisco.com/PowerKEYDVB/Auditing">
    <ActivityRecord>
       <time>2015-09-16T04:13:20Z</time>
       <oper>Create_Product</oper>
       <pkgEid>10</pkgEid>
       <pkgName>BBCWRL</pkgName>
    </ActivityRecord>
    <ActivityRecord>
       <time>2015-09-16T04:13:20Z</time>
       <oper>Create_Product</oper>
       <pkgEid>18</pkgEid>
       <pkgName>CNNINT</pkgName>
    </ActivityRecord>

I am parsing this XML file and converting it to CSV with the following Python code:

import csv
import xml.etree.cElementTree as ET


tree =  ET.parse('file.xml')
root = tree.getroot()


data_to_csv= open('output.csv','w')

list_head=[]

Csv_writer=csv.writer(data_to_csv)

count=0
for elements in root.findall('ActivityRecord'):
    List_node = []
    if count == 0 :

        time = elements.find('time').tag
        list_head.append(time)

        oper = elements.find('oper').tag
        list_head.append(oper)

        pkgEid = elements.find('pkgEid').tag
        list_head.append(pkgEid)


        pkgName = elements.find('pkgName').tag
        list_head.append(pkgName)

        Csv_writer.writerow(list_head)
        count = +1

    time = elements.find('time').text
    List_node.append(time)

    oper = elements.find('oper').text
    List_node.append(oper)

    pkgEid = elements.find('pkgEid').text
    List_node.append(pkgEid)

    pkgName = elements.find('pkgName').text
    List_node.append(pkgName)    

    Csv_writer.writerow(List_node)

data_to_csv.close()

The code I am using does not write any data to the CSV. Could someone tell me exactly where I am going wrong?

Answer

Willian Vieira · Apr 24, 2018

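Using pandas, you can parse every field of each record without naming the tags. Note that the file declares a default namespace (xmlns="http://www.cisco.com/PowerKEYDVB/Auditing"), so root.findall('ActivityRecord') in your code matches nothing and the loop body never runs; that is why the CSV is empty. Iterating over the children of root directly avoids the lookup by name.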

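import xml.etree.ElementTree as ET
import pandas as pd

tree = ET.parse("file.xml")
root = tree.getroot()

# Each child of the root is one record; each child of a record is one field.
# Because of the default namespace, tags look like "{uri}time";
# keep only the part after "}" to get clean column names.
rows = [
    {field.tag.split('}')[-1]: field.text for field in record}
    for record in root
]

df = pd.DataFrame(rows)
# index=False leaves the pandas row index out of the CSV
df.to_csv('file.csv', index=False)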
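
If you would rather stay with the standard library, the approach from the question can probably be kept by qualifying each tag name with the namespace. The following is only a sketch under a few assumptions: the namespace URI is copied from the xmlns attribute in the sample, the four column names are taken from the sample records, and the names NS, FIELDS, SAMPLE and records_to_csv are made up for illustration. It uses a small inline document instead of file.xml so that it runs on its own.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Namespace URI copied from the xmlns="..." attribute of the sample file.
# The prefix "a" is arbitrary and only used inside this script.
NS = {"a": "http://www.cisco.com/PowerKEYDVB/Auditing"}

# Column names assumed from the sample records in the question.
FIELDS = ["time", "oper", "pkgEid", "pkgName"]


def records_to_csv(root, out):
    """Write a header plus one CSV row per ActivityRecord to the file-like object out."""
    writer = csv.writer(out)
    writer.writerow(FIELDS)
    for record in root.findall("a:ActivityRecord", NS):
        # findtext returns the default ("") when a child element is missing
        writer.writerow(
            [record.findtext(f"a:{name}", default="", namespaces=NS) for name in FIELDS]
        )


# Shortened inline stand-in for file.xml (illustrative only).
SAMPLE = """<Activity_Logs xmlns="http://www.cisco.com/PowerKEYDVB/Auditing">
    <ActivityRecord>
        <time>2015-09-16T04:13:20Z</time>
        <oper>Create_Product</oper>
        <pkgEid>10</pkgEid>
        <pkgName>BBCWRL</pkgName>
    </ActivityRecord>
    <ActivityRecord>
        <time>2015-09-16T04:13:20Z</time>
        <oper>Create_Product</oper>
        <pkgEid>18</pkgEid>
        <pkgName>CNNINT</pkgName>
    </ActivityRecord>
</Activity_Logs>"""

root = ET.fromstring(SAMPLE)

# The unqualified name from the question finds nothing in a namespaced document.
print(len(root.findall("ActivityRecord")))

buf = io.StringIO()
records_to_csv(root, buf)
print(buf.getvalue())
```

For a real file, pass ET.parse('file.xml').getroot() as root and a file opened with open('output.csv', 'w', newline='') as out; the newline='' argument is what the csv module documentation recommends to avoid blank lines between rows on Windows.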