Reputation: 77
I'm trying to parse a huge XML file that contains several similar tags. For the moment I can only parse the first tag and its first_child.
Here is an example of the XML:
<?xml version="1.0" encoding="UTF-8"?>
<test version="1.0">
<parameters/>
<category name="z1" description="jobs currently running" count="30" timestamp="2010-01-16T14:24:31">
<jobs name="ZEI018CL" owner="A" type="auto" activityLevel="147" threadId="202" pid="20521" vmName="[email protected]:6102:xxx" cpuUsage="0"/>
<job name="ZUA002B" owner="A" type="auto" activityLevel="3375" threadId="194" pid="20521" vmName="[email protected]:6102:xxx" cpuUsage="0"/>
<job name="ZZZ855" owner="A" type="auto" activityLevel="0" threadId="107" pid="20457" vmName="[email protected]:6101:xxx" cpuUsage="0"/>
<job name="ZKA019CL" owner="A" type="auto" activityLevel="0" threadId="105" pid="20457" vmName="[email protected]:6101:xxx" cpuUsage="0"/>
<job name="ZIN41B" owner="A" type="auto" activityLevel="3" threadId="104" pid="20457" vmName="[email protected]:6101:xxx" cpuUsage="0"/>
<job name="ZIN198CL" owner="A" type="auto" activityLevel="0" threadId="103" pid="20457" vmName="[email protected]:6101:xxx" cpuUsage="0"/>
<job name="ZHO060" owner="A" type="auto" activityLevel="61" threadId="102" pid="20457" vmName="[email protected]:6101:xxx" cpuUsage="0"/>
<job name="ZEI019CL" owner="A" type="auto" activityLevel="0" threadId="101" pid="20457" vmName="[email protected]:6101:xxx" cpuUsage="0"/>
<job name="ZEI013CL" owner="A" type="auto" activityLevel="0" threadId="99" pid="20457" vmName="[email protected]:6101:xxx" cpuUsage="0"/>
<job name="ZEI011CL" owner="A" type="auto" activityLevel="0" threadId="98" pid="20457" vmName="[email protected]:6101:xxx" cpuUsage="0"/>
<job name="ZEC007CL" owner="A" type="auto" activityLevel="0" threadId="97" pid="20457" vmName="[email protected]:6101:xxx" cpuUsage="0"/>
<job name="ZEC001B" owner="A" type="auto" activityLevel="2" threadId="96" pid="20457" vmName="[email protected]:6101:xxx" cpuUsage="0"/></category>
<category name="z3" description="Batchjobs" count="0" timestamp="2015-01-16T14:24:31"/>
<category name="z4" description="Interactivejobs jobs currently running in the system" count="498" timestamp="2015-01-16T14:24:31">
<job name="CAS" owner="PA" type="interactive" activityLevel="0" threadId="14624" pid="23771" vmName="[email protected]:6104:xxx" cpuUsage="0"/>
<job name="CR" owner="K" type="interactive" activityLevel="0" threadId="14586" pid="23771" vmName="[email protected]:6104:xxx" cpuUsage="0"/>
<job name="MM" owner="DU" type="interactive" activityLevel="0" threadId="14570" pid="23771" vmName="[email protected]:6104:xxx" cpuUsage="0"/>
<job name="ZZ" owner="D" type="interactive" activityLevel="0" threadId="14568" pid="23771" vmName="[email protected]:6104:xxx" cpuUsage="0"/></category>
<category name="services" description="The status" timestamp="2015-01-16T14:24:31">
<service name="1" description="test1" port-status="up" thread-status="up"/>
<service name="2" description="test2" port-status="up" thread-status="up"/>
<service name="3" description="test3" port-status="N/A" thread-status="up"/>
<service name="4" description="test4" port-status="up" thread-status="up"/></category></test>
To parse the file I use:
my $parser = XML::Twig->new();
$parser->parsefile($xml);
For the first line I use
my $count = $parser->root->first_child('category')->att('count');
print $count;
For the next line I use this one:
my $service = $parser->root->first_child('category')->first_child('job')->att('name');
print $service;
But I can't figure out how to get the port-status for a service with a specific name, or the type for a specific job name in the second category.
Can you help me?
Upvotes: 0
Views: 434
Reputation: 2935
My guess is you want something like this:
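foreach ($parser->root->first_child('category[@name="services"]')->children('service[@name="1"]')) {
    print join(", ", @{$_->atts}{'port-status', 'thread-status'}), "\n";
}
With children('service[@name="1"]') you get all service elements whose name attribute is 1. The service elements sit inside the services category rather than directly under the root, hence the first_child('category[@name="services"]') step.
Then you ask with the atts method for a hash reference of that element's attributes and extract port-status and thread-status with a hash slice.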
Edit: sorry fixed, forgot that you'll get more than one with children.
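The same condition syntax should also cover the other half of the question (the type for a given job name). Here is a minimal sketch, assuming the job name is unique within its category. The helper job_type and the trimmed inline sample are made up for illustration and are not part of the original code:

```perl
use strict;
use warnings;
use XML::Twig;

# Trimmed copy of the sample from the question (most attributes dropped).
my $sample = <<'XML';
<test version="1.0">
  <category name="z1" description="jobs currently running">
    <job name="ZUA002B" owner="A" type="auto"/>
  </category>
  <category name="z4" description="Interactivejobs">
    <job name="CAS" owner="PA" type="interactive"/>
    <job name="ZZ" owner="D" type="interactive"/>
  </category>
</test>
XML

my $twig = XML::Twig->new();
$twig->parse($sample);    # for a real file: $twig->parsefile($path)

# Hypothetical helper: return the type of the job $job_name inside the
# category $cat_name, or undef if either one is missing.
sub job_type {
    my ($t, $cat_name, $job_name) = @_;
    my $category = $t->root->first_child(qq{category[\@name="$cat_name"]})
        or return;
    my $job = $category->first_child(qq{job[\@name="$job_name"]})
        or return;
    return $job->att('type');
}

my $type = job_type($twig, 'z4', 'ZZ');
print defined $type ? "type: $type\n" : "job not found\n";
```

With the sample above this prints "type: interactive". Checking each step for undef avoids the "Can't call method on an undefined value" error when a category or job is absent.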
Upvotes: 0
Reputation: 16171
In your case the easiest is probably to use XPath to get what you want:
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig::XPath;
my( $service, $infile)= @ARGV;
# XML::Twig::XPath (not plain XML::Twig) provides the full XPath engine used below
my $t= XML::Twig::XPath->new()
->parsefile( $infile);
# get the service first, then the attribute
# note the \@'s, where Perl and XPath syntaxes collide
my @services= $t->findnodes( qq{//service[\@name="$service"]});
my $status= $services[0]->att( 'port-status');
print "status: $status\n";
# get it in one swell XPath query
my $status2= $t->findvalue( qq{//service[\@name="$service"]/\@port-status});
print "status: $status2\n";
If your XML file is really huge, and depending on what you need to do, there may be better alternatives though, using handlers. It's hard to tell from your example.
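To give an idea of what the handler approach could look like, here is a rough sketch that collects the port-status of every service while the file is being parsed and frees each category once it is complete. The %port_status hash and the trimmed inline sample are assumptions for illustration; whether this fits depends on what you actually need from the file:

```perl
use strict;
use warnings;
use XML::Twig;

# Trimmed copy of the services category from the question.
my $sample = <<'XML';
<test version="1.0">
  <category name="services" description="The status">
    <service name="1" description="test1" port-status="up" thread-status="up"/>
    <service name="3" description="test3" port-status="N/A" thread-status="up"/>
  </category>
</test>
XML

# Filled in by the handler: service name => port-status.
my %port_status;

my $twig = XML::Twig->new(
    twig_handlers => {
        # Called each time a <service> element has been completely parsed.
        service => sub {
            my ($t, $elt) = @_;
            $port_status{ $elt->att('name') } = $elt->att('port-status');
        },
        # Once a whole <category> is done, release the memory used so far.
        category => sub { $_[0]->purge },
    },
);

$twig->parse($sample);    # for a real file: $twig->parsefile($infile)

print "status: $port_status{3}\n";
```

With the sample above this prints "status: N/A". Because of the purge call the whole document is never held in memory at once, which is what matters for a really large file.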
Upvotes: 1