Reputation: 377
I have an XML file:
<?xml version="1.0" encoding="utf-16"?>
<!DOCTYPE tmx SYSTEM "56.dtd">
<body>
<tu changedate="20130625T175037Z">
<tuv xml:lang="pt-pt">
<prop type="x-context-pre"><seg>Some text.</seg></prop>
<prop type="x-context-post"><seg>Other text.</seg></prop>
<seg>The text I'm interested.</seg>
</tuv>
<tuv xml:lang="it">
<seg>And it's translation in italian.</seg>
</tuv>
</tu>
.... followed by other <tu>'s
</body>
Since it's a huge file, I'm using XML::Twig to parse it and get the parts I'm interested in. I'm particularly interested in the content of the seg nodes as well as the attributes of the tu nodes.
Here's the code I've got so far:
use 5.010;
use strict;
use warnings;
use XML::Twig;
my $filename = 'filename.tmx';
my $out_filename = 'out.xml';
open my $out, '>', $out_filename or die "Cannot open $out_filename: $!";
binmode $out;
my $original_twig = XML::Twig->new(pretty_print => 'nsgmls', twig_handlers => {tu => \&original_tu});
$original_twig->parsefile($filename);
sub original_tu {
my($twig, $original_tu) = @_;
my $original_seg = $original_tu-> first_child('./tuv/seg')->text;
}
Perl (or should I say XML::Twig) tells me that I've got:
wrong navigation condition './tuv/seg' ()
Does anyone know how to access the seg node's text, and how to access the changedate attribute of the tu node?
Upvotes: 3
Views: 1234
Reputation: 16161
You can't use a complete XPath expression with first_child, just a single XPath step (i.e. you can only go down one level).
To use an XPath expression you need to use findnodes: my $original_seg = $original_tu->findnodes('./tuv/seg', 0)->text (the ,0 gets the first element of the (potential) list of hits).
To access an attribute, use $original_tu->att('changedate').
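Putting both pieces together, a minimal sketch might look like the following. The inline XML is a shortened, hypothetical stand-in for the real TMX file (the DOCTYPE is left out so that no external DTD is needed), and the @rows array is only an illustrative way of collecting the results.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical inline sample standing in for the real (huge) TMX file.
my $xml = <<'XML';
<body>
  <tu changedate="20130625T175037Z">
    <tuv xml:lang="pt-pt">
      <prop type="x-context-pre"><seg>Some text.</seg></prop>
      <seg>The text I'm interested.</seg>
    </tuv>
    <tuv xml:lang="it">
      <seg>And it's translation in italian.</seg>
    </tuv>
  </tu>
</body>
XML

my @rows;    # one [changedate, seg text] pair per <tu>

my $twig = XML::Twig->new(
    twig_handlers => {
        tu => sub {
            my ($t, $tu) = @_;

            # XPath lookup; the offset 0 asks for the first hit only.
            my $seg = $tu->findnodes('./tuv/seg', 0);

            push @rows, [ $tu->att('changedate'), $seg ? $seg->text : '' ];

            # Release what has been parsed so far; useful for huge files.
            $t->purge;
        },
    },
);

$twig->parse($xml);

printf "%s\t%s\n", @$_ for @rows;
```

For the real file, $twig->parsefile($filename) would take the place of $twig->parse($xml).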
Upvotes: 1
Reputation: 241858
The condition used in first_child cannot use XPath. See https://metacpan.org/module/XML::Twig#cond for details. The method would have been misnamed if it did: first_child returns a child, but seg is a grandchild of tu.
You can use first_descendant('seg') instead.
To access the attribute, use the $original_tu->att('changedate') method.
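One thing that may be worth checking against the sample in the question: first_descendant walks the subtree in document order, so it will also find a seg nested inside a prop if that one comes first. The sketch below compares it with a child-by-child walk on a shortened, hypothetical copy of the sample; the variable names are illustrative only.

```perl
use strict;
use warnings;
use XML::Twig;

# Shortened, hypothetical copy of the <tu> from the question.
my $twig = XML::Twig->new;
$twig->parse(<<'XML');
<tu changedate="20130625T175037Z">
  <tuv xml:lang="pt-pt">
    <prop type="x-context-pre"><seg>Some text.</seg></prop>
    <seg>The text I'm interested.</seg>
  </tuv>
</tu>
XML

my $tu = $twig->root;

# First seg anywhere below tu, in document order:
# here that is the one nested inside the prop element.
my $any_seg = $tu->first_descendant('seg');

# First seg that is a direct child of the first tuv.
my $direct_seg = $tu->first_child('tuv')->first_child('seg');

print 'first_descendant: ', $any_seg->text,    "\n";
print 'child of tuv:     ', $direct_seg->text, "\n";
print 'changedate:       ', $tu->att('changedate'), "\n";
```

If the context prop elements always precede the segment of interest, stepping through tuv explicitly is probably the safer choice.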
Upvotes: 0
Reputation: 62037
Here is one way to access that node and attribute:
my $original_seg = $original_tu->first_child('tuv')->first_child('seg')->text;
my $date = $original_tu->att('changedate');
Upvotes: 2