Trouble Replacing Text in HTML fragment using Mojo::DOM

Question

I need to scan through HTML fragments, looking for certain strings in text (not within element attributes), and wrap each matching string in a span element. Here's a sample attempt with output:

use v5.10;
use Mojo::DOM;

my $body = qq|

Boring Text:

Highlight Cool whenever we see it.
but not here.

    sub Cool {
        print "Foo\n";
    }

And here is more Cool.


|;
my $dom = Mojo::DOM->new($body);

foreach my $e ($dom->find('*')->each) {
    my $text = $e->text;
    say "e text is:  $text ";
    if ($text =~ /Cool/) {
        (my $newtext = $text) =~ s/Cool/<span class="fun">Cool<\/span>/g;
        $e->replace_content($newtext);
    }
}

say $dom->root;

The output:

e text is:   
e text is:  Boring Text: 
e text is:  Highlight Cool whenever we see it. but not. And here is more Cool. 
e text is:  here 
e text is:  sub Cool { print "Foo "; } 


Boring Text:
Highlight Cool whenever we see it. but not. And here is more Cool.

Close, but what I really want to see is something like the following:


Boring Text:
Highlight Cool whenever we see it. but not here. 

sub Cool { 
    print "Foo
"; 
}
  
And here is more Cool.

Any help / pointers would be greatly appreciated. Thanks, Todd

Borodin · Accepted Answer

Having looked into XML::Twig, I'm not so sure it's the correct tool. It's surprising how awkward such a simple task can be.

This is a working program that uses HTML::TreeBuilder. Unfortunately, it doesn't produce formatted output, so I've added some whitespace to the output myself.

use strict;
use warnings;

use HTML::TreeBuilder;

my $html = HTML::TreeBuilder->new_from_content(<<__HTML__);

Boring Text:

Highlight Cool whenever we see it.
but not here.

    sub Cool {
        print "Foo\n";
    }

And here is more Cool.


__HTML__

$html->objectify_text;

for my $text_node ($html->look_down(_tag => '~text')) {

  my $text = $text_node->attr('text');

  if (my @replacement = process_text($text)) {
    my $old_node = $text_node->replace_with(@replacement);
    $old_node->delete;
  }
}

$html->deobjectify_text;

print $html->guts->as_XML;

sub process_text {

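  # A limit of -1 keeps a trailing empty field, so a match at the very end of the text is not lost
  my @nodes = split /\bCool\b/, shift, -1;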
  return unless @nodes > 1;

  my $span = HTML::Element->new('span', class => 'fun');
  $span->push_content('Cool');

  for (my $i = 1; $i < @nodes; $i += 2) {
    splice @nodes, $i, 0, $span->clone;
  }

  $span->delete;

  @nodes;
}
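
A detail of split that matters for this kind of word-splitting: by default it discards trailing empty fields, so whether a match at the very end of a text node is seen depends on the limit argument. A minimal core-Perl sketch (the sample string is made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical sample text that ends with the keyword.
my $sample = 'This is Cool';

# Default split discards trailing empty fields, so only one field remains.
my @default = split /\bCool\b/, $sample;

# A negative limit keeps the trailing empty field.
my @keep = split /\bCool\b/, $sample, -1;

printf "default: %d field(s), with -1: %d field(s)\n",
    scalar @default, scalar @keep;
```

A leading match is not affected, because split keeps a leading empty field when the separator has positive width.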

output


Boring Text:

Highlight Cool whenever we see it.
but not here.
 sub Cool { print "Foo "; } 
And here is more Cool.
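
For comparison, the same text-node idea can probably be expressed in Mojo::DOM as well. This is only a sketch, assuming a reasonably recent Mojolicious in which descendant_nodes, type, content and replace are available; the sample fragment and the span class "fun" are illustrative, not the markup from the question.

```perl
use strict;
use warnings;
use Mojo::DOM;
use Mojo::Util qw(xml_escape);

# Hypothetical sample fragment, chosen so the keyword also appears in an attribute.
my $dom = Mojo::DOM->new(
    '<p>Highlight Cool here, but not <a href="/Cool.html">in this attribute</a>.</p>'
);

# Visit text nodes only, so attribute values are never touched.
$dom->descendant_nodes->grep(sub { $_->type eq 'text' })->each(sub {
    my $node = shift;

    # Text node content is unescaped, so escape it before building markup.
    my $html = xml_escape($node->content);

    # Replace the node only if at least one whole-word match was wrapped.
    $node->replace($html)
        if $html =~ s{\bCool\b}{<span class="fun">Cool</span>}g;
});

print "$dom\n";
```

The collection of text nodes is built before any replacement happens, so the newly created span contents are not visited again. The attempt in the question differs in that it reads each element's text and replaces the element's whole content, which discards child elements.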
