Reputation: 1933
Edited: Sorry, I mistyped 'name' when I meant 'ref', and I've now included the complete attributes as well.
I have a number of XML files that each contain a complete XML document on a single line. An example would be:
<Reqeusts>
<WRRequest><Request domain="foo.com"><Rows><Row includeascolumn="n" interval="hour" ref="time" type="group"/><Row includeascolumn="n" ref="domain_id" type="group"/><Row />...</Rows><Columns><Column ref="user_id"/><Column ref="country_id"/><Column ref="country_name"/>...</Columns></Request></WRRequest>
.
.
.
</Requests>
There are a number of attributes as well that I'm not including for the sake of clarity.
I'm parsing this using XML::Parser and XML::SimpleObject, which work fine up to a point. For instance, I'm just printing out the attributes of each element, which works except when I try to print the 'ref' attribute of the column element; then I get an "uninitialized variable" error. The code is:
#!/usr/bin/perl
use warnings;
use diagnostics;
use XML::Parser;
use XML::SimpleObject;
use Cwd;
if ($ARGV[0] eq "") {
    die "usage: sumXML.pl <input file> \n";
}
my $fileName = $ARGV[0];
my $parser = new XML::Parser(Style => 'Tree');
my $xso = XML::SimpleObject->new( $parser->parsefile("$fileName") );
foreach my $wrRequest ($xso->child('WRRequests')->children('RWRequest')) {
    print "Client Name: " . $wrRequest->attribute('clientName') . "\n";
    foreach my $xmlRequest ($wrRequest->child('REQUEST')) {
        print "Domain name: " . $xmlRequest->attribute('domain') . "\n";
        print "Service: " . $xmlRequest->attribute('service') . "\n";
        foreach my $xmlRow ($xmlRequest->child('ROWS')->children('ROW')) {
            print "Row Reference: " . $xmlRow->attribute('ref') . "\n";
        }
        foreach my $xmlColumn ($xmlRequest->child('COLUMNS')->children('COLUMN')) {
            print "Column Reference: " . $xmlColumn->attribute('ref') . "\n";
        }
    }
    print "\n";
}
Upvotes: 2
Views: 846
Reputation: 6524
I can't know for sure how the data should ideally be organized, but I find XML::Rules handy in these situations. If you're open to a completely different way of doing it, here is an example (I'm assuming that 'ref' is the key in each row, that column names should be kept in order, and that for columns all you care about is the 'ref' attribute):
use strict;
use warnings;
use Data::Dumper;
use XML::Rules;
my $xml = <<XML;
<Requests>
<WRRequest>
<Request domain="foo.com" service="SomeService">
<Rows>
<Row includeascolumn="n" interval="hour" ref="time" type="group"/>
<Row includeascolumn="n" ref="domain_id" type="group"/>
</Rows>
<Columns>
<Column ref="user_id"/>
<Column ref="country_id"/>
<Column ref="country_name"/>
</Columns>
</Request>
</WRRequest>
</Requests>
XML
my @rules = (
    Request => sub { delete $_[1]->{_content}; print Dumper $_[1]; return },
    Rows    => 'pass no content',
    Columns => 'pass no content',
    Row     => 'no content by ref',
    Column  => sub { '@'.$_[0] => $_[1]{ref} },
);
my $p = XML::Rules->new(
    rules => \@rules,
);
$p->parse($xml);
__END__
$VAR1 = {
'Column' => [
'user_id',
'country_id',
'country_name'
],
'domain' => 'foo.com',
'time' => {
'type' => 'group',
'includeascolumn' => 'n',
'interval' => 'hour'
},
'domain_id' => {
'type' => 'group',
'includeascolumn' => 'n'
},
'service' => 'SomeService'
};
Upvotes: 1
Reputation: 12537
Your sample data does not parse (even if you remove the dots), so it is not valid XML. I'm not sure what your actual data looks like, but knowing that is quite important for finding the problem.
I'm certain that there is nothing wrong with XML::Parser or XML::SimpleObject. So please check the following: does every REQUEST element have a service attribute? Does every ROW have a ref attribute? If they do not exist, you have to either reject the input data or deal with the data you have. This of course depends on your requirements.
I have actually taken the time to make it work (by just changing the case of the element names and slightly modifying your "sample data"):
use strict;
use warnings;
use XML::Parser;
use XML::SimpleObject;
use Cwd;
my $inXML = join "", <DATA>;
print $inXML;
my $parser = new XML::Parser(Style => 'Tree');
my $xso = XML::SimpleObject->new( $parser->parse($inXML) );
foreach my $wrRequest ($xso->child('Requests')->children('WRRequest')) {
    print "Client Name: " . $wrRequest->attribute('clientName') . "\n";
    foreach my $xmlRequest ($wrRequest->child('Request')) {
        print "Domain name: " . $xmlRequest->attribute('domain') . "\n";
        print "Service: " . $xmlRequest->attribute('service') . "\n";
        foreach my $xmlRow ($xmlRequest->child('Rows')->children('Row')) {
            print "Row Reference: " . $xmlRow->attribute('ref') . "\n";
        }
        foreach my $xmlColumn ($xmlRequest->child('Columns')->children('Column')) {
            print "Column Reference: " . $xmlColumn->attribute('ref') . "\n";
        }
    }
    print "\n";
}
__DATA__
<Requests>
<WRRequest clientName="foo">
<Request service="fooService" domain="foo.com">
<Rows>
<Row includeascolumn="n" interval="hour" ref="time" type="group"/>
<Row includeascolumn="n" ref="domain_id" type="group"/>
</Rows>
<Columns>
<Column ref="user_id"/>
<Column ref="country_id"/>
<Column ref="country_name"/>
</Columns>
</Request>
</WRRequest>
</Requests>
Output:
Client Name: foo
Domain name: foo.com
Service: fooService
Row Reference: time
Row Reference: domain_id
Column Reference: user_id
Column Reference: country_id
Column Reference: country_name
I've also tested it with multiple WRRequest elements (copy and paste), and it worked like a charm.
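If some elements in your real data genuinely lack an attribute, one way to "deal with the data you have" is to fall back to a default value instead of printing undef. The following is only a minimal sketch under a few assumptions: Perl 5.10 or later (for the // operator), and attribute() yielding undef when the attribute is absent. The attr_or helper, the placeholder strings and the trimmed sample data are made up for illustration and are not part of XML::Parser or XML::SimpleObject.

```perl
use 5.010;    # needed for the defined-or operator (//)
use strict;
use warnings;
use XML::Parser;
use XML::SimpleObject;

# Hypothetical helper: return the attribute's value, or $default when the
# attribute is missing, so that print never receives undef.
sub attr_or {
    my ($node, $name, $default) = @_;
    return $node->attribute($name) // $default;
}

# Trimmed sample input: the Request has no service attribute and the
# second Column has no ref attribute.
my $xml = <<'XML';
<Requests>
  <WRRequest clientName="foo">
    <Request domain="foo.com">
      <Columns>
        <Column ref="user_id"/>
        <Column/>
      </Columns>
    </Request>
  </WRRequest>
</Requests>
XML

my $parser = XML::Parser->new(Style => 'Tree');
my $xso    = XML::SimpleObject->new( $parser->parse($xml) );

foreach my $wrRequest ($xso->child('Requests')->children('WRRequest')) {
    foreach my $xmlRequest ($wrRequest->children('Request')) {
        print "Domain name: " . attr_or($xmlRequest, 'domain',  '(none)') . "\n";
        print "Service: "     . attr_or($xmlRequest, 'service', '(none)') . "\n";
        foreach my $xmlColumn ($xmlRequest->child('Columns')->children('Column')) {
            print "Column Reference: " . attr_or($xmlColumn, 'ref', '(missing)') . "\n";
        }
    }
}
```

Whether a placeholder is acceptable or the input should be rejected instead (for example with die) again depends on your requirements.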
Upvotes: 1