Reputation: 57
I am using the Perl module XML::LibXML.
I pass XML::LibXML a string that contains a whole XML file in UTF-8 encoding, and I extract information from it using findnodes and textContent. But when I print the results into an HTML page that declares charset=UTF-8, I get bad characters like "�". When I leave charset=UTF-8 out of the head of the HTML page, that text is correct, but the rest of the page, which I print manually, is wrong. Can you please help me figure this out?
Thanks for any advice.
Upvotes: 2
Views: 1896
Reputation: 385754
As it should, textContent returns the text in its "decoded" form (Unicode code points). File handles expect bytes, so you need to encode the text into bytes before printing it. You can instruct Perl to do so for you using
use open ':std', ':encoding(UTF-8)';
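To make the decoded/encoded distinction concrete, here is a minimal sketch. The sample XML, the name element, and the helper node_texts are made up for illustration; load_xml, findnodes, textContent, and Encode::encode are the real calls it relies on.

```perl
use strict;
use warnings;
use Encode qw(encode);
use XML::LibXML;

# Hypothetical helper: return the text of every node matching $xpath.
# $xml_bytes must hold the raw (still encoded) bytes of the XML document.
sub node_texts {
    my ($xml_bytes, $xpath) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml_bytes);
    return map { $_->textContent } $doc->findnodes($xpath);
}

# Sample document (an assumption): "café" stored as UTF-8 bytes,
# where the e-acute is the two bytes C3 A9.
my $xml_bytes = qq{<?xml version="1.0" encoding="UTF-8"?>\n}
              . qq{<root><name>caf\xC3\xA9</name></root>};

my ($text) = node_texts($xml_bytes, '//name');

# textContent returned characters (code points), not bytes.
printf "characters: %d\n", length $text;    # prints "characters: 4"

# Encoding turns the characters back into UTF-8 bytes for output.
my $out = encode('UTF-8', $text);
printf "bytes: %d\n", length $out;          # prints "bytes: 5"
```

In a real script you would either print the result of encode yourself or let the use open line above add the encoding layer to STDOUT, but not both, since that would encode the text twice.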
Upvotes: 1
Reputation: 185073
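Make sure you have the following at the top of your script. The use utf8 line tells Perl that the source file itself is UTF-8, so the text you print by hand is decoded as well, and the binmode line makes the standard handles encode on output:
use utf8;
binmode $_, ":utf8" for \*STDOUT, \*STDIN, \*STDERR;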
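What a binmode layer does to a handle can be checked without a browser. The sketch below writes one character to an in-memory handle and inspects the bytes that arrive; the helper bytes_written is hypothetical, and it uses the stricter :encoding(UTF-8) layer rather than :utf8, which is an assumption on my part and not what the answer above uses.

```perl
use strict;
use warnings;

# Hypothetical helper: print $text through an encoding layer into an
# in-memory handle and return the raw bytes that were written.
sub bytes_written {
    my ($text) = @_;
    my $buffer = '';
    open my $fh, '>', \$buffer or die "cannot open in-memory handle: $!";
    binmode $fh, ':encoding(UTF-8)';    # same idea as binmode on STDOUT
    print {$fh} $text;
    close $fh or die "cannot close in-memory handle: $!";
    return $buffer;
}

# One character (U+00E9, e-acute) becomes two bytes on the way out.
my $bytes = bytes_written("\x{E9}");
my $hex   = join ' ', map { sprintf '%02X', ord } split //, $bytes;
printf "%d bytes: %s\n", length($bytes), $hex;    # prints "2 bytes: C3 A9"
```

If the handle had no encoding layer, the same print would emit the single byte E9, which a browser reading the page as UTF-8 shows as "�".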
Upvotes: 0