saumya

Reputation: 13

How to print an XML tag's attribute data and tag value sequentially in Perl?

Say I have an example XML document:

<root>
    <subnode1 att1="sn1att1" att2="sn1att2">Subnode 1</subnode1>    
    <subnode2 att1="sn2att1" att2="sn2att2">Subnode 2</subnode2>
    <subnode3 att1="sn3att1" att2="sn3att2">
        <subnode31 att1="sn31att1" att2="sn31att2">
            <subnode311 att1="sn311att1" att2="sn311att2">
                <subnode3111 att1="sn3111att1" att2="sn3111att2">Subnode 3-111</subnode3111>
            </subnode311>
        </subnode31>
        <subnode32 att1="sn32att1" att2="sn32att2">Subnode 3-2</subnode32>
    </subnode3>
</root>

I want to print something like this:

sn1att1  sn1att2  Subnode 1
sn2att1  sn2att2  Subnode 2
sn3att1  sn3att2 
sn31att1  sn31att2 
sn311att1  sn311att2  
sn3111att1  sn3111att2  Subnode 3-111
sn32att1  sn32att2  Subnode 3-2

I have written the code below. It prints the attributes as described, but it does not print the tag values (for example "Subnode 1", "Subnode 2", etc.).

use XML::XPath;
use XML::XPath::XMLParser;

my $xp = XML::XPath->new( filename => 'raw1.xml' );

for my $node ( $xp->findnodes('*/*') ) {

    print "\n" . $node->getName . "\t";

    for my $attribute ( $node->getAttributes ) {
        print " " . $attribute->getData;
    }

    for my $property ( $node->findnodes('.//*') ) {

        print "\n" . $property->getName . "\t";

        for my $attributes ( $property->getAttributes ) {
            print " " . $attributes->getData;
        }
    }

}

Upvotes: 1

Views: 1419

Answers (1)

Borodin

Reputation: 126752

I think this does what you want.

I'm not very familiar with XML::XPath, but I do know XPath.

It looks like, for each element in the XML, you want to print a line that contains the values of each of its attributes, followed by the values of any child text nodes.

That's not as simple as it may seem, because any element may contain multiple text children interspersed with multiple child elements.

This code accumulates the values of all attributes and all non-blank text children into the array @line, and prints the line if the result isn't empty.

I don't understand why your required output doesn't include my line

sn32att1 sn32att2 Subnode 3-2

Perhaps you will explain?

use strict;
use warnings 'all';

use XML::XPath;

my $xp = XML::XPath->new( filename => 'raw1.xml' );

# for all elements in the data
#
for my $node ( $xp->findnodes('//*') ) {

    my @line;

    # all the attributes of this element
    #
    for my $attr ( $node->getAttributes ) {
        push @line, $attr->getData;
    }

    # and all the non-blank child text nodes of this element
    #
    for ( $node->findnodes('text()') ) {
        my $text = $_->getData;
        push @line, $text if $text =~ /\S/;
    }

    # print it if there's anything to print
    #
    print "@line\n" if @line;
}

Output:

sn1att1 sn1att2 Subnode 1
sn2att1 sn2att2 Subnode 2
sn3att1 sn3att2
sn31att1 sn31att2
sn311att1 sn311att2
sn3111att1 sn3111att2 Subnode 3-111
sn32att1 sn32att2 Subnode 3-2
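
To illustrate the earlier point about text children mixed in with child elements, here is a minimal sketch using the same approach on a made-up document. The elements `a` and `b`, their attribute values, and the helper name `fields_for` are assumptions for illustration only and are not from the question. The sketch also trims leading and trailing whitespace from each text child, which the code above does not do, because text that sits next to child elements usually carries the surrounding indentation and newlines.

```perl
use strict;
use warnings;

use XML::XPath;

# Hypothetical helper: returns the attribute values of an element,
# followed by its non-blank direct text children with whitespace trimmed
sub fields_for {
    my ($node) = @_;

    my @fields = map { $_->getData } $node->getAttributes;

    for my $text_node ( $node->findnodes('text()') ) {
        my $text = $text_node->getData;
        $text =~ s/^\s+|\s+$//g;            # trim leading and trailing whitespace
        push @fields, $text if length $text;
    }

    return @fields;
}

# Made-up sample: <a> holds a text child followed by a child element
my $xp = XML::XPath->new( xml => <<'XML' );
<root>
    <a att1="x" att2="y">
        leading text
        <b att1="z">inner</b>
    </a>
</root>
XML

for my $node ( $xp->findnodes('//*') ) {
    my @fields = fields_for($node);
    print join( '  ', @fields ), "\n" if @fields;
}
```

Only the direct text children of each element are collected, so the text of `b` is reported on the line for `b` and not repeated on the line for `a`. Joining with two spaces is just a guess at the spacing shown in the question.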

Upvotes: 1
