Bruno9779

Reputation: 1669

Parsing XML in Perl

I am trying to parse some values from an XML string that I retrieved through SOAP::Lite. I am not proficient in Perl, and I couldn't solve this by googling or by reading related questions here.

The XML I am trying to parse is this:

<COMPUTER>
  <ACCOUNTINFO>
    <ENTRY Name="fields_12"></ENTRY>
    <ENTRY Name="fields_24"></ENTRY>
    <ENTRY Name="TAG">Arias 4431</ENTRY>
    <ENTRY Name="fields_23"></ENTRY>
    <ENTRY Name="fields_16"></ENTRY>
    <ENTRY Name="fields_11"></ENTRY>
    <ENTRY Name="fields_6"></ENTRY>
    <ENTRY Name="fields_22"></ENTRY>
    <ENTRY Name="fields_20"></ENTRY>
    <ENTRY Name="fields_21"></ENTRY>
    <ENTRY Name="fields_17"></ENTRY>
    <ENTRY Name="fields_3">Vigilancia</ENTRY>
    <ENTRY Name="fields_14"></ENTRY>
    <ENTRY Name="fields_9"></ENTRY>
    <ENTRY Name="fields_8"></ENTRY>
    <ENTRY Name="fields_19"></ENTRY>
    <ENTRY Name="fields_10"></ENTRY>
    <ENTRY Name="fields_28"></ENTRY>
    <ENTRY Name="fields_26"></ENTRY>
    <ENTRY Name="fields_15"></ENTRY>
    <ENTRY Name="fields_5"></ENTRY>
    <ENTRY Name="fields_27"></ENTRY>
    <ENTRY Name="fields_7"></ENTRY>
    <ENTRY Name="fields_25"></ENTRY>
    <ENTRY Name="fields_13"></ENTRY>
    <ENTRY Name="fields_18"></ENTRY>
    <ENTRY Name="fields_4">Vigilancia</ENTRY>
  </ACCOUNTINFO>
  <DICO_SOFT>
    <WORD>Freeware</WORD>
    <WORD>AXIS Video</WORD>
    <WORD>Java</WORD>
    <WORD>Drivers & OEM software</WORD>
    <WORD>Actualizaciones de Microsoft</WORD>
    <WORD>MS Operating Systems</WORD>
  </DICO_SOFT>
  <HARDWARE>
    <CHECKSUM>131071</CHECKSUM>
    <DEFAULTGATEWAY></DEFAULTGATEWAY>
    <DESCRIPTION></DESCRIPTION>
    <DNS></DNS>
    <FIDELITY>9214</FIDELITY>
    <ID>453</ID>
    <IPADDR>10.30.8.214</IPADDR>
    <IPSRC>10.30.8.214</IPSRC>
    <LASTCOME>2013-12-27 12:47:42</LASTCOME>
    <LASTDATE>2013-12-27 12:47:42</LASTDATE>
    <MEMORY>2048</MEMORY>
    <NAME>EC214001</NAME>
    <OSCOMMENTS>Service Pack 1</OSCOMMENTS>
    <OSNAME>Microsoft Windows 7 Professional</OSNAME>
    <OSVERSION>6.1.7601</OSVERSION>
    <PROCESSORN>1</PROCESSORN>
    <PROCESSORS>2399</PROCESSORS>
    <PROCESSORT>Intel(R) Core(TM) i3 CPU M 370 @ 2.40GHz [2 core(s) x64]</PROCESSORT>
    <QUALITY>0.0716</QUALITY>
    <SSTATE>0</SSTATE>
    <SWAP>3868</SWAP>
    <TYPE>0</TYPE>
    <USERAGENT>OCS-NG_WINDOWS_AGENT_v2.0.5.0</USERAGENT>
    <USERDOMAIN></USERDOMAIN>
    <USERID>vigilancia</USERID>
    <WINCOMPANY></WINCOMPANY>
    <WINOWNER>vigilancia</WINOWNER>
    <WINPRODID>55041-033-4366171-86806</WINPRODID>
    <WINPRODKEY>BBBBB-BBBBB-BBBBB-BBBBB-BBBBB</WINPRODKEY>
    <WORKGROUP>MASSONE</WORKGROUP>
  </HARDWARE>
</COMPUTER>

I need to extract several values from this for further manipulation: for example, the ENTRY with Name="fields_3", plus ID, IPADDR and NAME.

I have tried several modules, such as XML::Simple and XML::Parser. The data is contained in a string, $refx.

I have tried (among other things):

use XML::Simple;
my $newid = XMLin($refx);
$tes = $newid{'ID'};
print $tes;

But it does not work.

Thanks in advance

Upvotes: 0

Views: 1172

Answers (2)

quicoju

Reputation: 1711

You can use Data::Dumper to inspect the data structure that XML::Simple builds from your XML. Use the following code:

use strict;
use warnings;
use XML::Simple;
use Data::Dumper;

my $ref = XMLin("t.xml");
print Dumper($ref);

$ref will hold a reference to the data structure that you want to access. For more information about XML::Simple, see https://metacpan.org/pod/XML::Simple, which has examples and explains the module in detail.
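Once the dump shows the layout, individual values can be read by dereferencing the returned reference. The following is only a sketch, assuming the XML is already well-formed and held in a string as in the question; the sample is trimmed, and the KeyAttr and ForceArray settings are choices made here to keep the structure predictable, not something the question specifies. Note that `$newid{'ID'}` looks in a hash named %newid, whereas XMLin returns a hash reference, and ID sits under HARDWARE.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Trimmed copy of the XML from the question (assumed well-formed).
my $refx = <<'XML';
<COMPUTER>
  <ACCOUNTINFO>
    <ENTRY Name="TAG">Arias 4431</ENTRY>
    <ENTRY Name="fields_3">Vigilancia</ENTRY>
    <ENTRY Name="fields_5"></ENTRY>
  </ACCOUNTINFO>
  <HARDWARE>
    <ID>453</ID>
    <IPADDR>10.30.8.214</IPADDR>
    <NAME>EC214001</NAME>
  </HARDWARE>
</COMPUTER>
XML

# KeyAttr => [] disables array-to-hash folding; ForceArray keeps ENTRY
# as an array reference even when only one ENTRY is present.
my $data = XMLin($refx, KeyAttr => [], ForceArray => ['ENTRY']);

# The root element (COMPUTER) is dropped by default, so HARDWARE is a top-level key.
my ($id, $ip, $name) = @{ $data->{HARDWARE} }{qw(ID IPADDR NAME)};

# Each ENTRY is a hash with its attributes plus a 'content' key for the text.
my ($entry) = grep { ($_->{Name} // '') eq 'fields_3' }
              @{ $data->{ACCOUNTINFO}{ENTRY} };
my $field3 = $entry ? $entry->{content} : undef;

print join("\n", $field3 // '', $id, $ip, $name), "\n";
```

With the default options, empty elements without attributes (such as DNS) may come back as empty hash references rather than empty strings, so it is worth checking the Dumper output before relying on a field.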

Upvotes: 0

Birei

Reputation: 36282

Your XML is not well-formed because of this element:

<WORD>Drivers & OEM software</WORD>

The ampersand must be escaped as &amp;.

Once that is fixed, you can extract the information with the XML::Twig module and its twig_roots option, which uses XPath-like expressions to select the nodes to process:
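If the XML cannot be corrected at its source, one possible workaround is to escape bare ampersands in the string before handing it to a parser. This is a rough sketch using only core Perl; `escape_bare_ampersands` is a hypothetical helper name, and the regex is a heuristic that leaves anything already shaped like an entity or character reference untouched. It does not account for CDATA sections.

```perl
use strict;
use warnings;

# Hypothetical helper: escape every '&' that does not already start an
# entity or character reference such as &amp;, &#38; or &#x26;.
sub escape_bare_ampersands {
    my ($xml) = @_;
    $xml =~ s/&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)/&amp;/g;
    return $xml;
}

my $broken = '<WORD>Drivers & OEM software</WORD><WORD>R&amp;D &#38; QA</WORD>';
my $fixed  = escape_bare_ampersands($broken);

print "$fixed\n";
```

Fixing the producer of the XML is the more reliable option; a pattern like this can misjudge text that happens to look like an entity.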

#!/usr/bin/env perl

use warnings;
use strict;
use XML::Twig;

XML::Twig->new(
    'twig_roots' => {
        '/COMPUTER/ACCOUNTINFO/ENTRY[@Name="fields_3"]' => sub { 
            printf qq|%s\n|, $_->text_only 
        },
        '/COMPUTER/HARDWARE' => sub {
            for my $t ( grep { $_->tag =~ m/^(?:ID|IPADDR|NAME)$/ } $_->children ) { 
                printf qq|%s\n|, $t->text_only;
            }   
        },  
    },  
)->parsefile( shift );

You can run it like:

perl script.pl xmlfile

That yields:

Vigilancia
453
10.30.8.214
EC214001
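Since the question holds the XML in a string rather than a file, the same handlers can be fed through `parse` instead of `parsefile`. The sketch below is one way to do that while collecting the values into a hash for later use instead of printing them; the `%wanted` hash and the trimmed sample string are assumptions made for illustration.

```perl
use strict;
use warnings;
use XML::Twig;

# Trimmed, well-formed copy of the XML, held in a string as in the question.
my $refx = <<'XML';
<COMPUTER>
  <ACCOUNTINFO>
    <ENTRY Name="TAG">Arias 4431</ENTRY>
    <ENTRY Name="fields_3">Vigilancia</ENTRY>
  </ACCOUNTINFO>
  <HARDWARE>
    <ID>453</ID>
    <IPADDR>10.30.8.214</IPADDR>
    <NAME>EC214001</NAME>
  </HARDWARE>
</COMPUTER>
XML

my %wanted;    # illustrative container for the extracted values

XML::Twig->new(
    twig_roots => {
        '/COMPUTER/ACCOUNTINFO/ENTRY[@Name="fields_3"]' => sub {
            my ($twig, $elt) = @_;
            $wanted{fields_3} = $elt->text;
        },
        '/COMPUTER/HARDWARE' => sub {
            my ($twig, $elt) = @_;
            # Copy the text of each child element we care about.
            for my $tag (qw(ID IPADDR NAME)) {
                $wanted{$tag} = $elt->first_child_text($tag);
            }
        },
    },
)->parse($refx);    # parse() takes the XML string; parsefile() takes a path

print "$_: $wanted{$_}\n" for qw(fields_3 ID IPADDR NAME);
```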

Upvotes: 4
