AgA

Reputation: 2126

Special characters generated when using HTML::TreeBuilder and HTML::Element

I've two questions:

Upvotes: 0

Views: 348

Answers (1)

Anomie

Reputation: 94834

I have two answers:

  • Assuming that you want the content of $a to be the same as the content of $node, you do not need to call encode_entities, because push_content inserts the passed string as a text node rather than parsing it as markup. On the other hand, if the content of $node is <span> (represented in HTML source as &lt;span&gt;) and you actually want $a to display &lt;span&gt; (represented in HTML source as &amp;lt;span&amp;gt;), then you would call encode_entities on it.
  • Chances are that your input text contains raw UTF-8 characters that the code is interpreting as Latin-1 or a similar single-byte encoding. The "single space" characters are actually U+00A0, the non-breaking space, which UTF-8 represents as the two bytes 0xc2 0xa0. Interpreted as Latin-1, those two bytes become "Â" followed by a non-breaking space, which is why a stray "Â" appears in front of each of those spaces.
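Both points above can be checked at the byte level. The following is a minimal sketch in Python rather than Perl, used only to make the strings and bytes visible. It assumes that html.escape behaves like encode_entities for the characters <, > and &, and the variable names are illustrative, not taken from the question.

```python
import html

# Point 1: text inserted as a text node is escaped once when the tree is
# serialized. Escaping it yourself first means it ends up escaped twice.
# (html.escape stands in for Perl's encode_entities here; an assumption.)
text = "<span>"
escaped_once = html.escape(text)           # what the serializer emits
escaped_twice = html.escape(escaped_once)  # result of pre-encoding by hand

# Point 2: U+00A0 (non-breaking space) is two bytes in UTF-8.
nbsp = "\u00a0"
utf8_bytes = nbsp.encode("utf-8")

# Reading those bytes as Latin-1 yields two characters instead of one.
misread = utf8_bytes.decode("latin-1")

# Reading them as UTF-8 recovers the single original character.
correct = utf8_bytes.decode("utf-8")

print(escaped_once)
print(escaped_twice)
print(utf8_bytes)
print([hex(ord(ch)) for ch in misread])
```

If the stray "Â" characters come from this kind of mismatch, decoding the input as UTF-8 before handing it to the parser should make them disappear.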

Upvotes: 2
