mivk

Reputation: 14889

XML::Twig replace previous comment

I have comments which precede the element I'm processing, and would like to replace them with new comments.

I can add a new comment using insert_new_elt(before ...), but cannot find a way to get the old comment and replace that.

#!/usr/bin/perl
use common::sense;
use XML::Twig;

my $twig = XML::Twig->new(
    twig_roots    => { 'el' => sub { process_el(@_) } },
    comments      => "process",
    pretty_print => "indented_c",
    twig_print_outside_roots => 1,
);

$twig->parse(join('', <DATA>)) or die "Could not parse\n";
$twig->flush();

sub process_el {
    my( $t, $e)= @_;
    my $text   = $e->text;
    # replace old comment before this element ?
    $e->insert_new_elt( before => '#COMMENT', "new comment on $text");
    $e->flush();
}

__DATA__
<?xml version="1.0" encoding="utf-8"?>
<root>
  <!-- old comment 1 -->
  <el>element 1</el>
  <el>element 2 without comment before</el>
  <!-- old comment 3 -->
  <el>element 3</el>
</root>

(I also need to detect if there actually is a comment right before the element. If not, I will obviously not be able to replace it)

I tried prev_sibling, but that gave me the previous element, not the comment in-between.

The above code works to insert the new comment, but leaves the old one in place, which I don't want.

Upvotes: 1

Views: 406

Answers (2)

choroba

Reputation: 241918

Alternative approach, using XML::XSH2, a wrapper around XML::LibXML:

open file.xml ;
for //el {
    my $c = (preceding-sibling::* | preceding-sibling::comment() )[last()] ;
    if $c/self::comment() delete $c ;
    insert comment text() before . ;
}
save :b ;

Upvotes: 1

mirod

Reputation: 16161

The problem comes from using twig_roots: the comments are not processed, since they are not roots, so they are never really seen by XML::Twig, just printed as is.

So you need to use twig_handlers instead of twig_roots, and remove the twig_print_outside_roots option. If you then still use flush, you run into formatting problems: the comments are printed on the same line as the previous element. I don't know how important it is for you to get the specific format you showed.

In order to get exactly what you wanted, I removed the flush and used a simple print after the parse. Depending on your constraints (big XML file for example), you may want to use flush and if need be use xml_pp on the result to get the format you want (it works fine).

#!/usr/bin/perl
use common::sense;
use XML::Twig;

my $twig = XML::Twig->new(
    twig_handlers    => { 'el' => sub { process_el(@_) } },
    comments      => "process",
    pretty_print => "indented",
);

$twig->parse(join('', <DATA>)) or die "Could not parse\n";
$twig->print();

sub process_el {
    my( $t, $e)= @_;
    my $text   = $e->text;
    if( $e->prev_sibling && $e->prev_sibling->is( '#COMMENT'))
      { $e->prev_sibling->cut; }
    # replace old comment before this element ?
    $e->insert_new_elt( before => '#COMMENT', "new comment on $text");
}

__DATA__
<?xml version="1.0" encoding="utf-8"?>
<root>
  <!-- old comment 1 -->
  <el>element 1</el>
  <el>element 2 without comment before</el>
  <!-- old comment 3 -->
  <el>element 3</el>
</root>
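
If you only want to touch elements that actually have a comment right before them, the same prev_sibling test can guard the whole replacement. The following is a minimal sketch of that variant, not a drop-in replacement for the script above: the @replaced array and the shortened inline XML are my own additions for illustration, and unlike the original handler it inserts nothing when there is no old comment.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Shortened sample input (assumption: same shape as the data in the question).
my $xml = <<'XML';
<root>
  <!-- old comment 1 -->
  <el>element 1</el>
  <el>element 2 without comment before</el>
</root>
XML

# Hypothetical bookkeeping: text of each <el> whose comment was replaced.
my @replaced;

my $twig = XML::Twig->new(
    twig_handlers => { el => \&process_el },
    comments      => 'process',    # keep comments in the tree as #COMMENT elements
    pretty_print  => 'indented',
);
$twig->parse($xml);
$twig->print;

sub process_el {
    my ( $t, $e ) = @_;
    my $prev = $e->prev_sibling;

    # Leave the element alone unless a comment sits right before it.
    return unless $prev && $prev->is('#COMMENT');

    $prev->cut;    # remove the old comment
    $e->insert_new_elt( before => '#COMMENT', 'new comment on ' . $e->text );
    push @replaced, $e->text;
}
```

After the parse, @replaced tells you which elements had a preceding comment, which covers the "detect if there actually is a comment" part of the question.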

Upvotes: 2
