Perl script to parse XML using XML::LibXML;

lozwell picture lozwell · May 1, 2012 · Viewed 38k times · Source

I think this is a very simple issue, but I cannot figure it out despite many searches.

I am trying to parse the following XML to print something similar to TAG=VALUE, so that I can write this to a CSV file. The problem is the tags are not always the same for each sample. I cannot seem to figure out how to get the actual tag names. Any help appreciated!!!

XML File -

<Statistics>
  <Stats>
    <Sample>
        <Name>System1</Name>
        <Type>IBM</Type>
        <Memory>2GB</Memory>
        <StartTime>2012-04-26T14:30:01Z</StartTime>
        <EndTime>2012-04-26T14:45:01Z</EndTime>
    </Sample>

    <Sample>
        <Name>System2</Name>
        <Type>Intel</Type>
        <Disks>2</Disks>
        <StartTime>2012-04-26T15:30:01Z</StartTime>
        <EndTime>2012-04-26T15:45:01Z</EndTime>
        <Video>1</Video>
    </Sample>
  </Stats>
</Statistics>

Script -

#!/usr/bin/perl
use XML::LibXML;

$filename = "data.xml";

my $parser = XML::LibXML->new();
my $xmldoc = $parser->parse_file($filename);

for my $sample ($xmldoc->findnodes('/Statistics/Stats/Sample')) {

print $sample->nodeName(), ": ", $sample->textContent(), "\n";

}

Answer

Grant McLean picture Grant McLean · May 1, 2012

You have the right method for getting the tag names, you just need an extra loop to run through the tags inside each <sample>:

#!/usr/bin/perl

use strict;
use warnings;

use XML::LibXML;

my $filename = "data.xml";

my $parser = XML::LibXML->new();
my $xmldoc = $parser->parse_file($filename);

for my $sample ($xmldoc->findnodes('/Statistics/Stats/Sample')) {
    for my $property ($sample->findnodes('./*')) {
        print $property->nodeName(), ": ", $property->textContent(), "\n";
    }
    print "\n";
}

Edit: I have now created a tutorial site called Perl XML::LibXML by Example which answers exactly this type of question.