user3366045

Reputation: 57

perl XML::LibXML utf8 encoding

I am using the Perl module XML::LibXML.

I pass XML::LibXML a string that contains a whole XML file in UTF-8 encoding, and I extract information from it using findnodes and textContent. But when I print the results into an HTML page that declares charset=UTF-8, they come out with bad characters like "�". When I don't declare charset=UTF-8 in the head of the HTML page, the extracted text is correct, but the rest of the page, which I print manually, is wrong. Can you please help me figure it out?

Thanks for advice.

Upvotes: 2

Views: 1896

Answers (2)

ikegami

Reputation: 385754

As it should, textContent returns the text in its "decoded" form (Unicode Code Points). File handles expect bytes, so you need to encode the text into bytes. You can instruct Perl to do so for you using

use open ':std', ':encoding(UTF-8)';
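
To illustrate the decoded-versus-bytes distinction, here is a minimal sketch that does the encoding step by hand instead of through the open pragma. The sample XML, the XPath, and the variable names are made up for illustration and are not taken from the question.

```perl
use strict;
use warnings;
use XML::LibXML;
use Encode qw(encode);

# Hypothetical sample document, held as raw UTF-8 bytes
# (as if it had been read from a file without any decoding layer).
my $xml_bytes = qq{<?xml version="1.0" encoding="UTF-8"?>\n}
              . qq{<items><item>caf\xC3\xA9</item></items>};

# Hand the raw bytes to the parser; it decodes them per the XML declaration.
my $doc = XML::LibXML->load_xml(string => $xml_bytes);

# textContent returns a decoded character string (Unicode code points).
my ($node) = $doc->findnodes('/items/item');
my $text = $node->textContent;    # 4 characters: c, a, f, U+00E9

# Encode the characters back into UTF-8 bytes before they reach the file handle.
my $html_bytes = encode('UTF-8', "<p>$text</p>\n");

binmode STDOUT;                   # plain byte output, no encoding layer
print "Content-Type: text/html; charset=UTF-8\n\n";
print $html_bytes;
```

With the use open line above in effect, the explicit encode call would not be needed and $text could be printed directly; doing both would encode the text twice.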

Upvotes: 1

Gilles Quénot

Reputation: 185073

Ensure you have the following at the top of your script:

use utf8;
binmode $_, ":utf8" for \*STDOUT, \*STDIN, \*STDERR;

Upvotes: 0
