Perl: Handling duplicate element names using XML::SAX

Question

How do I handle duplicate element names with the Perl XML::SAX module? The following is my XML file:

My question is: how do I access the element employees->employee->company->name? (I should be able to print "abc" and "xyz".) I ask because there is another 'name' element at employees->employee->name, which I want to skip. I would like to use only XML::SAX, as my environment supports only this module. Please help. Thanks a lot.

flesk · Accepted Answer

Use a stack to keep track of which nodes you are currently inside: push every time you enter a node, and pop every time you leave one:

#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use XML::SAX::ParserFactory;
use XML::SAX::PurePerl;

my (@nodes, $characters, @names);

my $factory = XML::SAX::ParserFactory->new;
my $handler = XML::SAX::PurePerl->new;
my $parser = $factory->parser(
                  Handler => $handler,
                  Methods => {
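                  start_element => sub {
                      # Entering an element: record it on the stack
                      # and start a fresh text buffer.
                      push @nodes, shift->{LocalName};
                      $characters = '';
                  },
                  characters => sub {
                      # Text may arrive in several chunks, so append.
                      $characters .= shift->{Data};
                  },
                  end_element => sub {
                      # Keep the text only for a 'name' whose parent is 'company'.
                      if (shift->{LocalName} eq 'name' && $nodes[-2] eq 'company') {
                          push @names, $characters;
                      }
                      # Leaving the element: remove it from the stack.
                      pop @nodes;
                  }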
              }
              );
$parser->parse_uri("sample2.xml");

print Dumper \@names;

Output:

$VAR1 = [
          ' abc ',
          ' xyz '
        ];

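$nodes[-2] is the second-to-last element of @nodes, i.e. the parent of the element that is being closed. When shift->{LocalName} equals 'name', it resolves to either 'employee' or 'company', so only the names inside 'company' are collected.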
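
If checking only the parent is not specific enough, the same stack can be joined into a full path and compared against employees/employee/company/name. The following is a minimal sketch of that idea, not part of the original answer: the PathHandler class name and the inline XML are hypothetical (the question's actual file is not shown), and the collected text is trimmed, so it yields 'abc' and 'xyz' without the surrounding spaces.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical handler class: tracks the path of open elements.
package PathHandler;
use base 'XML::SAX::Base';

sub new {
    my ($class, @args) = @_;
    my $self = $class->SUPER::new(@args);
    $self->{path}  = [];   # stack of open element names
    $self->{text}  = '';   # text collected for the current element
    $self->{names} = [];   # company names found so far
    return $self;
}

sub start_element {
    my ($self, $el) = @_;
    push @{ $self->{path} }, $el->{LocalName};
    $self->{text} = '';
}

sub characters {
    my ($self, $chars) = @_;
    # Text can be delivered in several chunks, so append.
    $self->{text} .= $chars->{Data};
}

sub end_element {
    my ($self, $el) = @_;
    my $path = join '/', @{ $self->{path} };
    if ($path eq 'employees/employee/company/name') {
        (my $text = $self->{text}) =~ s/^\s+|\s+$//g;   # trim whitespace
        push @{ $self->{names} }, $text;
    }
    pop @{ $self->{path} };
}

package main;
use XML::SAX::ParserFactory;

# Assumed sample input, shaped after the paths named in the question.
my $xml = <<'XML';
<employees>
  <employee>
    <name> John </name>
    <company><name> abc </name></company>
  </employee>
  <employee>
    <name> Jane </name>
    <company><name> xyz </name></company>
  </employee>
</employees>
XML

my $handler = PathHandler->new;
my $parser  = XML::SAX::ParserFactory->parser(Handler => $handler);
$parser->parse_string($xml);

print "$_\n" for @{ $handler->{names} };
```

Matching on the whole path costs one join per closing tag, but it skips employees->employee->name and also any other 'name' element that might appear elsewhere in the document under a different 'company' ancestor chain.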
