user1686082

Reputation: 79

Convert an iso-8859-1 symbol in a string to utf-8 in Perl

Here is a sample code snippet:

my $str = '21156_MLA Ã Copy4.ens';

I'm using this $str in my code, and it should display as 21156_MLA ß Copy4.ens.

I'm uploading a file named 21156_MLA ß Copy4.ens, but the browser displays it as 21156_MLA Ã Copy4.ens. The name is stored correctly in the database as ß, but when we retrieve it from the DB (using fetchall_hashref) it gets converted to Ã, so the browser then shows 21156_MLA Ã Copy4.ens. How can I avoid this conversion?

Upvotes: 2

Views: 3094

Answers (3)

slayedbylucifer

Reputation: 23492

Check the modules below; they have the information that you need:

https://metacpan.org/pod/Encode

https://metacpan.org/pod/MIME::QuotedPrint

As you haven't posted what you have tried so far, I am not going to write code from scratch. However, check the documentation of the modules above; it has the information you need.

Upvotes: 1

Ilmari Karonen

Reputation: 50328

First, as PP notes, if your source file is encoded in UTF-8, you should use utf8; so that Perl will know that and interpret any string literals in it correctly.

Second, make sure the text in the database is also correctly encoded. The details of this will depend on your database, but e.g. for MySQL, the best way is probably to make sure your text columns have the utf8 character set and the utf8_unicode_ci collation (or the appropriate national collation scheme, if needed), and to include the mysql_enable_utf8 option when connecting to the database using DBI.
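As a rough sketch of the DBI side of this step, the connection attributes might look like the following. The DSN, user name, password, and the helper name connect_utf8 are all hypothetical placeholders; only mysql_enable_utf8 and RaiseError are real attributes. DBI is loaded lazily inside the helper so the snippet compiles even on a machine without a database driver installed.

```perl
use strict;
use warnings;

# Hypothetical connection settings; replace with your own.
my %config = (
    dsn  => 'DBI:mysql:database=mydb;host=localhost',
    user => 'appuser',
    pass => 'secret',
);

# Attributes passed to DBI->connect. mysql_enable_utf8 asks DBD::mysql to
# treat text as UTF-8 and hand back decoded character strings.
my %attr = (
    RaiseError        => 1,
    mysql_enable_utf8 => 1,
);

# Hypothetical helper: opens a handle with the attributes above.
sub connect_utf8 {
    require DBI;    # loaded only when a connection is actually requested
    return DBI->connect(@config{qw(dsn user pass)}, \%attr);
}
```

Calling connect_utf8() and then fetchall_hashref on a statement from that handle should, under these assumptions, return ß as a single character rather than as two bytes.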

Third, you need to tell Perl that you want your I/O streams to be UTF-8 encoded, too. You can do this using binmode(), as in:

binmode STDOUT, ':utf8';
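To illustrate how the output layer interacts with decoding, here is a minimal sketch. It assumes the file name arrives from the database as raw UTF-8 bytes (for instance, when the handle was opened without UTF-8 decoding enabled); the variable names are illustrative only. It uses ':encoding(UTF-8)', a stricter, validating variant of the ':utf8' layer shown above.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Raw UTF-8 bytes, as they might come back from an undecoded DB handle.
# \xC3\x9F is the two-byte UTF-8 encoding of the character ß.
my $bytes = "21156_MLA \xC3\x9F Copy4.ens";

# Turn the bytes into a Perl character string.
my $chars = decode('UTF-8', $bytes);

# Encode characters back to UTF-8 on the way out.
binmode STDOUT, ':encoding(UTF-8)';
print "$chars\n";
```

Printing $bytes directly through the same layer would encode each byte a second time, which is one common way to end up with Ã in place of ß.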

Finally, you also need to tell the browser that you're sending it UTF-8 text. (I suspect this part is your actual problem, but if you do all the other steps too, you'll have achieved a fully Unicode-aware workflow.) You can do this by sending the HTTP header:

Content-Type: text/html; charset=UTF-8

and/or the equivalent HTML meta tag:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

or, in HTML5, simply:

<meta charset="utf-8">
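Putting the last two steps together, a bare-bones CGI-style response might look like this sketch. It assumes a plain CGI script that writes its own headers; the page markup and the variable names are made up for illustration, and real code should HTML-escape the file name before embedding it.

```perl
use strict;
use warnings;

# Encode all output as UTF-8.
binmode STDOUT, ':encoding(UTF-8)';

# \x{DF} is the code point of ß, written as an escape so this sketch
# does not depend on how the source file itself is encoded.
my $filename = "21156_MLA \x{DF} Copy4.ens";

# HTTP header first, then a blank line, then the HTML body.
my $page = join '',
    "Content-Type: text/html; charset=UTF-8\r\n\r\n",
    "<!DOCTYPE html>\n",
    "<html><head><meta charset=\"utf-8\"><title>Upload</title></head>\n",
    "<body><p>$filename</p></body></html>\n";

print $page;
```

If a framework or the CGI module generates the headers instead, the charset would be set through that API rather than printed by hand.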

Upvotes: 0

PP.

Reputation: 10864

If your source code file is encoded as UTF-8 then you need to specify:

use utf8;

in your source code file to tell the interpreter that you may have strings embedded in your source that are UTF-8.

See http://perldoc.perl.org/utf8.html
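A small sketch of the difference the pragma makes, assuming the script file itself is saved as UTF-8 (the variable names are illustrative). Because use utf8 is lexically scoped, the two blocks below treat the same literal differently.

```perl
use strict;
use warnings;

my ($with, $without);
{
    use utf8;    # literals in this block are decoded as UTF-8 characters
    $with = length 'ß';
}
{
    no utf8;     # literals in this block are left as raw bytes
    $without = length 'ß';
}

# Keep the output ASCII-only so no output layer is needed here.
print "with use utf8: $with, without: $without\n";
```

Under that assumption, the first length counts one character and the second counts the two bytes of its UTF-8 encoding.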

Upvotes: 1
