Reputation: 11871
It is possible to write Perl documentation in UTF-8. To do so, you declare the encoding in your POD:
=encoding NNN
But what should you write in place of NNN? Different sources give different answers:
=encoding utf8
=encoding UTF-8
=encoding utf-8
What is the correct answer? What is the correct string to be written in POD?
Upvotes: 8
Views: 1152
Reputation: 118605
As daxim points out, I have been misled. =encoding UTF-8 and =encoding utf-8 apply the strict encoding, and =encoding utf8 is the lenient encoding:
$ cat enc-test.pod
=encoding ENCNAME
=head1 TEST '\344\273\245\376\202\200\200\200\200\200'
=cut
(Here \xxx means the literal byte with octal value xxx. \344\273\245 is a valid UTF-8 sequence; \376\202\200\200\200\200\200 is not.)
With =encoding utf-8:
$ perl -pe 's/ENCNAME/utf-8/' enc-test.pod | pod2cpanhtml | grep /h1
>TEST '以此�'</a></h1>
With =encoding utf8:
$ perl -pe 's/ENCNAME/utf8/' enc-test.pod | pod2cpanhtml | grep /h1
Code point 0x80000000 is not Unicode, no properties match it; ...
Code point 0x80000000 is not Unicode, no properties match it; ...
Code point 0x80000000 is not Unicode, no properties match it; ...
>TEST '以�'</a></h1>
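The strict/lenient split can also be checked by asking Encode how it resolves each label. This is a minimal sketch, assuming the Encode module bundled with Perl 5.8.7 or later; canonical_name is a hypothetical helper name used only for illustration, not part of Encode:

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# Hypothetical helper: return the canonical name Encode resolves a label to.
sub canonical_name {
    my ($label) = @_;
    my $enc = find_encoding($label)
        or die "Encode does not recognize '$label'\n";
    return $enc->name;
}

# Print the canonical encoding behind each spelling from the question.
printf "%-6s => %s\n", $_, canonical_name($_) for qw(UTF-8 utf-8 utf8);
```

UTF-8 and utf-8 should both resolve to utf-8-strict, while utf8 resolves to utf8, the lenient encoding.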
Original answer (superseded by the correction above): They are all equivalent. The argument to =encoding is expected to be a name recognized by the Encode::Supported module. When you drill down into that document, you see that utf8 is listed, UTF-8 is an alias for utf8, and utf-8 is equivalent to UTF-8.
What's the best practice? I'm not sure. I don't think you can go wrong using the official IANA name (as per daxim's answer), but you can't go wrong following the official Perl documentation, either.
Upvotes: 4
Reputation: 39158
=encoding UTF-8
According to IANA, charset names are case-insensitive, so utf-8 is the same.
utf8 is Perl's lax variant of UTF-8. However, for safety, you want to be strict with your POD processors.
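As a sketch of how that looks in a file, the =encoding line goes near the top of the POD, before any non-ASCII text. The module name below is hypothetical and only there to make the fragment complete:

```pod
=encoding UTF-8

=head1 NAME

My::Module - example documentation with non-ASCII text

=head1 DESCRIPTION

Characters such as 以此 are read as UTF-8 by the POD processor.

=cut
```

The file itself must actually be saved as UTF-8 for the declaration to be accurate.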
Upvotes: 15