Divox

Reputation: 101

How can I parse an XML tag that has an attribute, using Perl?

Below is an XML file:

   <?xml version='1.0'?>
    <employee>
      <name>Pradeep</name>
      <age>23</age>
      <sex>M</sex>
      <department>Coder</department>
    </employee>

And the Perl code is:

 use XML::Simple;
 use Data::Dumper;
 my $xml  = XML::Simple->new;
 my $data = $xml->XMLin("data.xml");
 print Dumper($data);

Now, how do I parse the file if the XML looks like this?

   <?xml version='1.0'?>
      <employee="risc_31">
       <name>John Doe</name>
       <age>43</age>
       <sex>M</sex>
       <department>Analyst</department>
      </employee>
      <employee="risc_32">
       <name>Pradeep</name>
       <age>23</age>
       <sex>M</sex>
       <department>HR</department>
      </employee>

How can this be done using a foreach loop in Perl?

NOTE: XML::Simple is easier for me.

Any help is appreciated!

Upvotes: 1

Views: 164

Answers (2)

user1817991

Reputation: 93

Your XML is invalid. You can't have

<employee="risc_31">

You can have something like

<employee employeeId="risc_31">

A document also needs a single root element, so assuming your XML is

<?xml version='1.0'?>
<employees>
  <employee employeeId="risc_31">
    <name>John Doe</name>
    <age>43</age>
    <sex>M</sex>
    <department>Analyst</department>
  </employee>
  <employee employeeId="risc_32">
    <name>Pradeep</name>
    <age>23</age>
    <sex>M</sex>
    <department>HR</department>
  </employee>
</employees>

you can do the following with XML::LibXML (sorry, I don't know XML::Simple):

use strict;
use warnings;
use XML::LibXML;

my $filename = 'data.xml';
my $parser = XML::LibXML->new();
my $xmldoc = $parser->parse_file($filename);

foreach my $employee ( $xmldoc->findnodes('/employees/employee') ) {
  my $employeeId = $employee->getAttribute('employeeId');

  # findvalue returns the text content; findnodes would return a node list
  my $name       = $employee->findvalue('./name');
  my $age        = $employee->findvalue('./age');
  my $sex        = $employee->findvalue('./sex');
  my $department = $employee->findvalue('./department');

  print "$employeeId: $name, $age, $sex, $department\n";
}

exit 0;

Upvotes: 2

Harry

Reputation: 11668

Problems like this are easy to diagnose before Perl ever sees the file. Have a look at xmllint, which is available on Mac and, I'd assume, Linux.

Think of XML as having two types of data: "data" and "metadata". Everything between a matched pair of tags (>ME ME ME<) is real data; everything inside a tag's angle brackets (<"me" "me" "me">) is metadata. Mixing them up gets you errors. For example, xmllint showed me this:

data.xml:2: parser error : error parsing attribute name
<employee="risc_31">
     ^
data.xml:2: parser error : attributes construct error
<employee="risc_31">
     ^
data.xml:2: parser error : Couldn't find end of Start Tag employee line 2
<employee="risc_31">
     ^
data.xml:2: parser error : Extra content at the end of the document
<employee="risc_31">
     ^
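
The same well-formedness check can be run from Perl itself. This is only a minimal sketch, assuming XML::LibXML is installed; the helper name is_well_formed and the two inline sample strings are made up for illustration.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: returns 1 if the string parses as XML, 0 otherwise.
sub is_well_formed {
    my ($xml) = @_;
    # load_xml dies on a parse error, so trap the exception with eval.
    my $doc = eval { XML::LibXML->load_xml( string => $xml ) };
    return defined $doc ? 1 : 0;
}

# The tag from the question, and a corrected version with a named attribute
# (the attribute name "id" is an assumption).
my $bad  = '<employee="risc_31"><name>John Doe</name></employee>';
my $good = '<employee id="risc_31"><name>John Doe</name></employee>';

print "bad:  ", ( is_well_formed($bad)  ? "ok" : "not well-formed" ), "\n";
print "good: ", ( is_well_formed($good) ? "ok" : "not well-formed" ), "\n";
```

Checking the input first means a malformed file fails with a clear message instead of a confusing data structure later on.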

Change your XML to look something more like the following. Note that I've added a "wtf" entry to each record, i.e. I've made the "risc_nn" value a real part of the employee record rather than metadata.

<?xml version="1.0"?>
<foo>
<employee>
  <wtf>risc_31</wtf>
  <name>John Doe</name>
  <age>43</age>
  <sex>M</sex>
  <department>Analyst</department>
</employee>
<employee>
  <wtf>risc_32</wtf>
  <name>Pradeep</name>
  <age>23</age>
  <sex>M</sex>
  <department>HR</department>
</employee>
</foo>

The following Perl program will now work:

use strict;
use warnings;
use XML::Simple;
use Data::Dumper;
my $data = XMLin("data.xml");
print Dumper($data);

and provide the following result:

$VAR1 = { 
    'employee' => {
        'John Doe' => {
            'wtf' => 'risc_31',
            'sex' => 'M',
            'department' => 'Analyst',
            'age' => '43'
     },  
     'Pradeep' => {
            'age' => '23',
            'wtf' => 'risc_32',
            'sex' => 'M',
            'department' => 'HR'
     }   
  }   
};

If you want to loop over that, it's a hash reference, so use the following:

for my $name (sort keys %{ $data->{employee} }) {
  my $employee = $data->{employee}{$name};
  print "$name: $employee->{wtf}, $employee->{department}\n";
}

Or use any of the myriad other ways to do it in Perl.
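
If you would rather keep the identifier as an attribute, XML::Simple reads attributes into the same hash as child elements, and its KeyAttr option can key the employee records by that attribute. Here is a rough sketch, assuming the attribute is called id and the records sit inside an <employees> root element; both names are mine, not from the question, and the XML is inlined only to keep the example self-contained.

```perl
use strict;
use warnings;
use XML::Simple;

# Inline sample data; the <employees> wrapper and the "id" attribute
# name are assumptions made for this example.
my $xml = <<'XML';
<employees>
  <employee id="risc_31">
    <name>John Doe</name>
    <age>43</age>
    <sex>M</sex>
    <department>Analyst</department>
  </employee>
  <employee id="risc_32">
    <name>Pradeep</name>
    <age>23</age>
    <sex>M</sex>
    <department>HR</department>
  </employee>
</employees>
XML

# KeyAttr folds the list of <employee> elements into a hash keyed by "id";
# ForceArray keeps the structure the same even if there is only one employee.
my $data = XMLin(
    $xml,
    KeyAttr    => { employee => 'id' },
    ForceArray => ['employee'],
);

foreach my $id ( sort keys %{ $data->{employee} } ) {
    my $emp = $data->{employee}{$id};
    print "$id: $emp->{name}, $emp->{age}, $emp->{sex}, $emp->{department}\n";
}
```

Setting KeyAttr explicitly also stops XML::Simple from folding on name, which is what its defaults did in the dump above.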

Upvotes: 0
