Reputation: 31
eg: é
into é
Sometimes users get the ASCII representation of a character rather than the French character itself. Can anyone help? Is there a function in Perl that can convert ASCII to UTF-8?
Upvotes: 2
Views: 8178
Reputation: 6566
This is best handled by Perl's built-in Encode module. Here is a simple example of how to convert a string:
use Encode;  # exports decode() by default
my $standard_string = decode("ascii", $ascii_string);
($standard_string will then be a Perl character string in Perl's internal representation. In other words, you shouldn't have to worry about its encoding from that point on, until you write it out.)
The linked documentation gives many other examples of things you can do, such as setting the encoding of an input file. A related useful module is Encode::Guess, which helps you determine the character encoding if it is unknown.
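To make the decode/encode distinction concrete, here is a minimal sketch of a round trip. The input bytes and the variable names are my own assumptions for illustration and are not from the question; only decode() and encode() from the core Encode module are used.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical input: the two UTF-8 bytes for "é" (0xC3 0xA9),
# as they might arrive from a file or a socket.
my $bytes = "\xC3\xA9";

# decode() turns raw bytes into a Perl character string.
my $chars = decode('UTF-8', $bytes);

# encode() turns the character string back into bytes in a chosen encoding.
my $latin1 = encode('ISO-8859-1', $chars);

printf "%d byte(s) in, %d character(s), %d byte(s) as Latin-1\n",
    length($bytes), length($chars), length($latin1);
```

The rule of thumb this illustrates: decode once on input, work with character strings in between, and encode once on output.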
Upvotes: 4
Reputation: 50324
It sounds like you want to convert HTML entities into UTF-8. To do this, use HTML::Entities and the decode_entities function.
This will give you a Perl string with no specific encoding attached. To output the string in UTF-8 encoding:
print Encode::encode_utf8(decode_entities($html_string));
Alternatively, set the UTF-8 PerlIO layer on STDOUT and Perl will encode everything in UTF-8 for you, which is useful if you are outputting multiple strings.
binmode STDOUT, ':utf8';
print decode_entities($html_string);
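For completeness, here is a self-contained sketch of that approach. The sample input string is made up for illustration, and it assumes HTML::Entities is installed from CPAN (it is not a core module). I use ':encoding(UTF-8)' here, which is the stricter, validating form of the ':utf8' layer.

```perl
use strict;
use warnings;
use HTML::Entities qw(decode_entities);

# Hypothetical input containing one named and one numeric entity.
my $html_string = 'caf&eacute; cr&#232;me';

# In non-void context decode_entities() returns a decoded copy
# as a character string (é is U+00E9, è is U+00E8).
my $text = decode_entities($html_string);

# Let the PerlIO layer encode everything printed to STDOUT as UTF-8.
binmode STDOUT, ':encoding(UTF-8)';
print "$text\n";
```

Note that calling decode_entities() in void context modifies its argument in place instead, so assign the result if you want to keep the original string.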
Upvotes: 4