user1463382

Reputation: 63

Disabling HTML entity expansion in the HTML::TreeBuilder Perl module

Consider the following script:

#!/usr/bin/perl

use strict;
use warnings;

use HTML::TreeBuilder;

sub test
{
  my ($content) = @_;

  my $tree = HTML::TreeBuilder->new;
  $tree->implicit_tags(0);
  $tree->no_expand_entities(1);
  $tree->parse_content($content);

  return $tree->as_HTML(q{<>&});
}

print test('test&laquo;');
print "\n";
print test('<a href="#" title="&laquo;"></a>');
print "\n";

It'll print:

<html>test&laquo;</html>
<html><a href="#" title="?"></a></html>

Because no_expand_entities(1) is set, the HTML entity &laquo; is not expanded in element text. For some reason, however, this mode does not change the default behavior for attributes: there the same entity is still expanded and displayed as garbage.

How can I disable entity expansion within HTML attributes as well?

Upvotes: 0

Views: 286

Answers (1)

Slaven Rezic

Reputation: 4581

As a workaround, you can call

$tree->attr_encoded(1);

before calling the parser. This disables HTML::Parser's automatic decoding of entities in attribute values.
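Applied to the script from the question, the workaround might look like the sketch below. The helper name raw_title is hypothetical and not part of the original code; it reads the attribute straight from the parsed tree so the stored value can be inspected without going through as_HTML. The sketch assumes that attr_encoded, which HTML::TreeBuilder inherits from HTML::Parser, takes effect when set before parse_content is called.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use HTML::TreeBuilder;

# Hypothetical helper (not from the question): parse a fragment and
# return the title attribute of the first <a> element as stored in the tree.
sub raw_title {
    my ($content) = @_;

    my $tree = HTML::TreeBuilder->new;
    $tree->implicit_tags(0);
    $tree->no_expand_entities(1);   # leave entities alone in element text
    $tree->attr_encoded(1);         # inherited from HTML::Parser: leave
                                    # entities alone in attribute values
    $tree->parse_content($content);

    my $link  = $tree->look_down(_tag => 'a');
    my $title = defined $link ? $link->attr('title') : undef;

    $tree->delete;                  # release the tree explicitly
    return $title;
}

print raw_title('<a href="#" title="&laquo;"></a>'), "\n";
```

Note that as_HTML(q{<>&}) asks the serializer to escape ampersands, so depending on the HTML::Element version it may re-escape the & of an attribute value that was kept in encoded form. It is worth checking the serialized output as well as the stored attribute.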

The better long-term fix is to ask the author of HTML::TreeBuilder (e.g. via rt.cpan.org) to do this automatically when no_expand_entities is set.

Upvotes: 1
