Reputation: 141
I have some XML files like the following:
<machines>
<server>
127.0.0.1
</server>
<proxy>
<ip>127.0.0.2</ip>
<etc>abc</etc>
</proxy>
</machines>
I want to keep the server element and delete everything else. The output should be:
<machines>
<server>
127.0.0.1
</server>
</machines>
I wrote the following script:
use warnings;
use strict;
use feature ':5.10';
use XML::Twig;
my $path='C:\strawberry\perl\site\lib\file.xml';
my $filehandle;
my $tweak_server =sub{
my ($twig, $root) =@_;
my $elt=$root;
while( $elt=$elt->next_elt($root)){
my $tag=$elt->tag;
say $tag;
if ($tag!~/server/){
$elt->delete($tag);
}
}
$twig->flush;
};
open( $filehandle, "+<$path") or die "cannot open out file out_file:$!";
my $roots = { machines => 1 };
my $handlers = { 'machines' => $tweak_server,
};
my $twig = new XML::Twig(TwigRoots => $roots,
TwigHandlers => $handlers,
pretty_print => 'indented'#,
# twig_print_outside_roots => \*$filehandle
);
$twig->parsefile($path);
close $filehandle;
and got the output:
server
#PCDATA
<machines>
<server></server>
<proxy>
<ip>127.0.0.2</ip>
<etc>abc</etc>
</proxy>
</machines>
I really don't understand why "#PCDATA" is printed, or why the script doesn't work as I expect.
@mirod I tried as follows:
use warnings;
use strict;
use feature ':5.10';
use XML::Twig;
my $tweak_server =sub{
my ($twig, $root) =@_;
my $elt=$root;
my $text=$elt->first_child_text('id');
if ($text=~m/12/){
while( $elt=$elt->next_elt('#ELT')){
my $tag=$elt->tag;
say $tag;
if ($tag!~/id/){
$elt->delete;
}
}
}
};
my $roots = { machines => 1 };
my $handlers = { 'machines/aaa' => $tweak_server,
};
my $twig =XML::Twig->new(TwigRoots => $roots,
TwigHandlers => $handlers,
pretty_print => 'indented'#,
# twig_print_outside_roots => \*$filehandle
)
->parse( \*DATA)
->print;
__DATA__
<machines>
<server> 127.0.0.1 </server>
<aaa>
<id>12</id>
<ip>127.0.0.2</ip>
<option>127.0.0.6</option>
<etc>abc</etc>
</aaa>
<aaa>
<id>14</id>
<ip>127.0.0.2</ip>
<etc>abc</etc>
</aaa>
<aaa>
<id>15</id>
<ip>127.0.0.2</ip>
<etc>abc</etc>
</aaa>
</machines>
and the output is:
<machines>
<server> 127.0.0.1 </server>
<aaa>
<id>12</id>
<option>127.0.0.6</option>
<etc>abc</etc>
</aaa>
<aaa>
<id>14</id>
<ip>127.0.0.2</ip>
<etc>abc</etc>
</aaa>
<aaa>
<id>15</id>
<ip>127.0.0.2</ip>
<etc>abc</etc>
</aaa>
</machines>
What I want is to delete all three of these elements, not just the first one:
<ip>127.0.0.2</ip>
<option>127.0.0.6</option>
<etc>abc</etc>
from the aaa element that contains
<id>12</id>
Any suggestions?
Reputation: 16161
If your requirement is to keep only the server elements, you can tell the module so by declaring them as twig_roots. This keeps the root of the XML and the server elements (with their content) and discards everything else:
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;
XML::Twig->new( twig_roots => { server => 1 },
pretty_print => 'indented',
)
->parse( \*DATA)
->print;
__DATA__
<machines>
<server>
127.0.0.1
</server>
<proxy>
<ip>127.0.0.2</ip>
<etc>abc</etc>
</proxy>
</machines>
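For the follow-up case in the question (emptying the aaa element whose id is 12 while keeping the id itself), one likely reason only ip is removed is that the loop calls next_elt on an element that has just been deleted, so the walk stops after the first deletion. Below is a minimal sketch that avoids this by taking the list of children before deleting anything. The function name strip_aaa and the exact-match comparison on the id (the question uses a regex match) are assumptions made for illustration:

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: remove every child except <id> from the <aaa>
# element whose <id> text equals $wanted_id (exact match is an assumption).
sub strip_aaa {
    my ($xml, $wanted_id) = @_;
    my $twig = XML::Twig->new(
        twig_handlers => {
            'machines/aaa' => sub {
                my ($t, $aaa) = @_;
                return unless $aaa->first_child_text('id') eq $wanted_id;
                # children() returns a plain list, so deleting members
                # does not disturb the iteration.
                for my $child ($aaa->children('#ELT')) {
                    $child->delete unless $child->tag eq 'id';
                }
            },
        },
        pretty_print => 'indented',
    );
    $twig->parse($xml);
    return $twig->sprint;
}

my $xml = '<machines><server>127.0.0.1</server>'
        . '<aaa><id>12</id><ip>127.0.0.2</ip><option>127.0.0.6</option><etc>abc</etc></aaa>'
        . '<aaa><id>14</id><ip>127.0.0.2</ip><etc>abc</etc></aaa>'
        . '</machines>';
print strip_aaa($xml, '12');
```

The other aaa elements are left untouched because the handler returns early when the id does not match.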
Reputation: 62037
The following will delete the proxy elements:
use warnings;
use strict;
use XML::Twig;
my $str = '
<machines>
<server>
127.0.0.1
</server>
<proxy>
<ip>127.0.0.2</ip>
<etc>abc</etc>
</proxy>
</machines>
';
my $t = XML::Twig->new(
twig_handlers => {
proxy => sub { $_->delete() },
},
pretty_print => 'indented',
);
$t->parse($str);
$t->print;
print "\n";
__END__
<machines>
<server>
127.0.0.1
</server>
</machines>
If you don't want server and #PCDATA printed, remove the say $tag; line.
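As for why #PCDATA shows up at all: XML::Twig stores text as elements whose tag is #PCDATA, and next_elt visits them along with regular elements. In the question's loop that text node does not match /server/, so it gets deleted (hence the empty <server></server>), and calling next_elt on the deleted node most likely returns nothing, which ends the loop before proxy is reached. A minimal sketch of a handler on the root that sidesteps both problems by listing the child elements first and comparing tags exactly; the function name keep_only_server is hypothetical:

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: keep only the <server> children of <machines>.
sub keep_only_server {
    my ($xml) = @_;
    my $twig = XML::Twig->new(
        twig_handlers => {
            machines => sub {
                my ($t, $machines) = @_;
                # '#ELT' restricts the list to real elements, so text
                # (#PCDATA) nodes are never considered for deletion.
                for my $child ($machines->children('#ELT')) {
                    $child->delete unless $child->tag eq 'server';
                }
            },
        },
        pretty_print => 'indented',
    );
    $twig->parse($xml);
    return $twig->sprint;
}

print keep_only_server(
    '<machines><server>127.0.0.1</server>'
  . '<proxy><ip>127.0.0.2</ip><etc>abc</etc></proxy></machines>'
);
```

Using eq instead of a regex also avoids accidentally keeping elements whose names merely contain "server".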