I'm trying to parse an XML-like file with the following structure:
Edit: I tried to omit most of the huge XML file to simplify things, but I copied and pasted it incorrectly. Here's the full file (900 KB) that actually has this issue: https://docs.google.com/file/d/0B3ustNI1qZh1UURrYWZJQk0wVlU/edit?usp=sharing
<CIM CIMVERSION="2.0" DTDVERSION="2.0">
<DECLARATION>
<DECLGROUP>
<LOCALNAMESPACEPATH>
<NAMESPACE NAME="signalingsystem"/>
</LOCALNAMESPACEPATH>
<VALUE.OBJECT>
<INSTANCE CLASSNAME="SharedGtTranslator">
<PROPERTY NAME="Name" TYPE="string">
<VALUE>AUC$4,1,6,4,26202*-->AUC RemoteSPC: 300 SSN: 10</VALUE>
</PROPERTY>
<PROPERTY NAME="NatureOfAddress" TYPE="sint32">
<VALUE>4</VALUE>
</PROPERTY>
</INSTANCE>
</VALUE.OBJECT>
<VALUE.OBJECT>
<INSTANCE CLASSNAME="SharedGtTranslator">
<PROPERTY NAME="Name" TYPE="string">
<VALUE>AUC$4,2,6,4,26202*-->AUC AUC LocalSPC: 410 SSN: 10</VALUE>
</PROPERTY>
<PROPERTY NAME="NatureOfAddress" TYPE="sint32">
<VALUE>4</VALUE>
</PROPERTY>
<VALUE>2</VALUE>
</PROPERTY>
</INSTANCE>
</VALUE.OBJECT>
</DECLGROUP>
</DECLARATION>
</CIM>
I'm using XML::Simple to parse that structure.
I need to get all the values of the PROPERTY elements with NAME="Name" where the INSTANCE has CLASSNAME="SharedGtTranslator".
This is what I'm trying to do:
#!/usr/bin/perl
use strict;
use warnings;
# use module
use XML::Simple;
use Data::Dumper;
my $file1 = $ARGV[0];
# create object
my $xml = new XML::Simple;
# read XML file
my $data = $xml->XMLin($file1);
foreach my $object (@{$data->{DECLARATION}->{DECLGROUP}->{'VALUE.OBJECT'}}) {
    if ($object->{INSTANCE}->{CLASSNAME} eq 'SharedGtTranslator') {
        foreach my $property (@{$object->{INSTANCE}->{PROPERTY}}) {
            if ($property->{NAME} eq 'Name') {
                print $property->{VALUE} . "\n";
            }
        }
    }
}
I'm getting "Pseudo-hashes are deprecated" and nothing is printed.
Help is highly appreciated!
Your code works fine for me as it stands. Is that the full program? There is no use of pseudo-hashes in that code.
The only problem I can see is that your XML data isn't well-formed. There is a spurious
<VALUE>2</VALUE>
</PROPERTY>
at the end of the last INSTANCE element. Once this is fixed, your program runs fine.
XML::Simple seems to be working for you, so it's probably appropriate to stick with it. But I don't generally recommend this module: it can be far from simple to get working, and the structure it builds doesn't fully reflect the XML data, so something like XML::Twig or XML::LibXML is often much better.
Update
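Working with your real data, the structure generated by XML::Simple looks quite unlike what is generated for the short example. There are arrays intermingled with the hashes that weren't there before. Dereferencing one of those arrays as if it were a hash is most likely what triggers the "Pseudo-hashes are deprecated" message on older Perls.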
This program seems to generate what you need. It produces 170 lines of output.
use strict;
use warnings;
use XML::Simple;
my $file1 = 'active_7v19.om.cim';
my $xml = new XML::Simple;
my $data = $xml->XMLin($file1);
for my $declgroup (@{ $data->{DECLARATION}{DECLGROUP} }) {
    foreach my $object (@{ $declgroup->{'VALUE.OBJECT'} }) {
        my $instance = $object->{INSTANCE};
        my $classname = $instance->{CLASSNAME};
        my $properties = $instance->{PROPERTY};
        next unless $classname eq 'SharedGtTranslator';
        for my $property (@$properties) {
            my $name = $property->{NAME};
            my $value = $property->{VALUE};
            print $value, "\n" if $name eq 'Name';
        }
    }
}
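If you do stay with XML::Simple, the shifting mix of hashes and arrays can usually be tamed with its ForceArray and KeyAttr options, so that the same loop works whether an element occurs once or many times. The following is only a minimal sketch: names_for_class is a hypothetical helper, the inline XML is a trimmed copy of the sample from the question, and the list of element names passed to ForceArray is an assumption about which elements can repeat in the real file.

```perl
use strict;
use warnings;
use XML::Simple;

# Hypothetical helper: returns the "Name" values of every INSTANCE of $class.
# $source may be a file name or a string of XML (XMLin accepts both).
sub names_for_class {
    my ($source, $class) = @_;

    my $data = XMLin(
        $source,
        # Assumption: these are the elements that may repeat. Forcing them
        # to arrays gives the same shape for one occurrence or many.
        ForceArray => [ 'DECLGROUP', 'VALUE.OBJECT', 'PROPERTY' ],
        # Disable array folding so PROPERTY always stays a plain array.
        KeyAttr    => [],
    );

    my @names;
    for my $declgroup (@{ $data->{DECLARATION}{DECLGROUP} }) {
        for my $object (@{ $declgroup->{'VALUE.OBJECT'} || [] }) {
            my $instance = $object->{INSTANCE};
            next unless ref $instance eq 'HASH';
            next unless defined $instance->{CLASSNAME}
                     && $instance->{CLASSNAME} eq $class;
            for my $property (@{ $instance->{PROPERTY} || [] }) {
                push @names, $property->{VALUE}
                    if $property->{NAME} eq 'Name';
            }
        }
    }
    return @names;
}

# Trimmed sample from the question; the single-quoted heredoc keeps "$4" literal.
my $sample = <<'XML';
<CIM CIMVERSION="2.0" DTDVERSION="2.0">
<DECLARATION>
<DECLGROUP>
<VALUE.OBJECT>
<INSTANCE CLASSNAME="SharedGtTranslator">
<PROPERTY NAME="Name" TYPE="string">
<VALUE>AUC$4,1,6,4,26202*-->AUC RemoteSPC: 300 SSN: 10</VALUE>
</PROPERTY>
<PROPERTY NAME="NatureOfAddress" TYPE="sint32">
<VALUE>4</VALUE>
</PROPERTY>
</INSTANCE>
</VALUE.OBJECT>
</DECLGROUP>
</DECLARATION>
</CIM>
XML

print "$_\n" for names_for_class($sample, 'SharedGtTranslator');
```

To run it against the real file, pass the file name instead of the inline string, for example names_for_class('active_7v19.om.cim', 'SharedGtTranslator').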
However, I am now more convinced that you would be better off with a "real" XML library. This code uses XML::LibXML to produce the same output.
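use strict;
use warnings;
use XML::LibXML;
# same input file as in the XML::Simple version
my $file1 = 'active_7v19.om.cim';
my $doc = XML::LibXML->load_xml(location => $file1, no_blanks => 1);
my @properties = $doc->findnodes('//INSTANCE[@CLASSNAME = "SharedGtTranslator"]/PROPERTY[@NAME = "Name"]');
for my $property (@properties) {
    # textContent takes no arguments; findvalue selects the VALUE child explicitly
    print $property->findvalue('VALUE'), "\n";
}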
All the work is done by the XPath expression, which selects all PROPERTY elements with a NAME attribute of Name that are children of an INSTANCE element anywhere in the document that has a CLASSNAME attribute of SharedGtTranslator. The subsequent for loop prints the value of the VALUE element within each PROPERTY. It is clearly a lot more concise, and it is also faster to run, and more flexible if you need to extract different information.
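As a rough illustration of that flexibility, the XPath expression can be built from parameters so the same code pulls out any property of any class. This is only a sketch: property_values is a hypothetical helper, the inline document is a trimmed copy of the sample from the question, and it assumes the class and property names contain no double quotes, since they are interpolated directly into the expression.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: returns the text of every VALUE element belonging to
# a PROPERTY named $property inside an INSTANCE of class $class.
# Assumption: neither name contains a double quote.
sub property_values {
    my ($doc, $class, $property) = @_;

    my $xpath = sprintf
        '//INSTANCE[@CLASSNAME = "%s"]/PROPERTY[@NAME = "%s"]/VALUE',
        $class, $property;

    # In list context findnodes returns the matching nodes.
    return map { $_->textContent } $doc->findnodes($xpath);
}

# Trimmed sample from the question, parsed from a string rather than a file.
my $doc = XML::LibXML->load_xml(string => <<'XML');
<CIM CIMVERSION="2.0" DTDVERSION="2.0">
<DECLARATION>
<DECLGROUP>
<VALUE.OBJECT>
<INSTANCE CLASSNAME="SharedGtTranslator">
<PROPERTY NAME="Name" TYPE="string">
<VALUE>AUC$4,1,6,4,26202*-->AUC RemoteSPC: 300 SSN: 10</VALUE>
</PROPERTY>
<PROPERTY NAME="NatureOfAddress" TYPE="sint32">
<VALUE>4</VALUE>
</PROPERTY>
</INSTANCE>
</VALUE.OBJECT>
</DECLGROUP>
</DECLARATION>
</CIM>
XML

# Same query shape, different property.
print "$_\n" for property_values($doc, 'SharedGtTranslator', 'NatureOfAddress');
```

Selecting the VALUE element in the XPath itself means the loop only has to read text, and changing what is extracted is a matter of changing the two arguments.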