Reputation: 2452
My character set is 'iso-8859-1'. I don't care whether that is good or bad; I have to use it as is.
How do I declare that my Perl source code is encoded in iso-8859-1 and not in utf-8?
#!/usr/bin/perl
# encoding of this document is iso-8859-1
# well how to make it understood by perl?
Thanks for your help and time.
Upvotes: 1
Views: 1313
Reputation: 57590
Perl assumes Latin-1 (ISO-8859-1) by default. You can likely use the script without specifying any encoding.
However, you should really start using UTF-8. Any editor worth its salt will be able to handle UTF-8 (even notepad.exe does!), and any sensible tool will treat the contents of your file as opaque bytes, not as 7-bit ASCII (an assumption already broken by your use of Latin-1).
If you only need to interface with resources encoded in Latin-1, you don't have to write your script in this encoding. Use the Encode module to encode/decode manually, or add an encoding layer to filehandles, either when opening them (open my $fh, "<:encoding(NAME)", $resource) or afterwards (binmode $fh, ":encoding(NAME)").
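As a rough sketch of both approaches mentioned above, assuming the encoding name is ISO-8859-1 and using an in-memory filehandle as a stand-in for a real Latin-1 resource (the variable names and sample data are made up for illustration):

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Bytes as they would arrive from a Latin-1 resource:
# "cafe" with an e-acute, which Latin-1 stores as the single byte 0xE9.
my $latin1_bytes = "caf\xE9";

# Manual route: decode the bytes into a Perl character string.
my $text = decode('ISO-8859-1', $latin1_bytes);

# Encode the same text as UTF-8, e.g. for output.
# The e-acute becomes the two bytes 0xC3 0xA9.
my $utf8_bytes = encode('UTF-8', $text);

# Layer route: let the filehandle decode while reading.
# The reference to a scalar opens an in-memory file (illustration only).
open my $fh, '<:encoding(ISO-8859-1)', \$latin1_bytes
    or die "Cannot open in-memory handle: $!";
my $line = <$fh>;
close $fh;

# Both routes yield the same four-character string.
printf "%d characters, %d UTF-8 bytes\n", length $text, length $utf8_bytes;
```

Either way the script itself contains only ASCII, so it does not matter which encoding the source file is saved in.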
Upvotes: 5
Reputation: 370
Perl strings can hold either text or binary data. Given a string, you generally have no way of finding out whether it holds text or binary data; you have to keep track of that yourself.
Interaction with the environment (like reading data from STDIN or a file, or printing it) treats strings as binary data. The same holds true for the return value of many built-in functions (like gethostbyname) and special variables that carry information to your program (%ENV and @ARGV).
Other built-in functions that deal with text (like uc and lc, and regular expressions) treat strings as text, or more accurately as a list of code points.
Use the decode function from the Encode module to turn binary strings into text strings, so that the text-handling functions work correctly.
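A small illustration of the difference, assuming the input arrives as UTF-8 encoded bytes (the sample data and variable names are made up for this sketch):

```perl
use strict;
use warnings;
use Encode qw(decode);

# The UTF-8 encoding of a lowercase e-acute (U+00E9), as it might
# arrive from STDIN or a file: two bytes of binary data.
my $bytes = "\xC3\xA9";

# Decoding turns the binary string into a text string of code points.
my $text = decode('UTF-8', $bytes);

my $byte_len = length $bytes;   # 2: counted as bytes
my $char_len = length $text;    # 1: counted as code points

# uc works on the decoded text: U+00E9 becomes U+00C9 (uppercase e-acute).
my $upper = uc $text;

printf "bytes: %d, characters: %d, uppercased: U+%04X\n",
    $byte_len, $char_len, ord $upper;
```

Without the decode step, length and uc would operate on the two raw bytes rather than on the single character they represent.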
More on Perl, charsets and encodings
Best regards
Upvotes: 0