Reputation: 12402
How can I get a list of compatible encodings for a Ruby String? (MRI 1.9.3)
Use case: I have some user-provided strings encoded in UTF-8. Ideally I need to convert them to ISO/IEC 8859-1
(8-bit), but I also need to fall back to Unicode when some special characters are present.
Also, is there a better way to accomplish this? Maybe I am testing the wrong thing.
EDIT: adding more details
Thanks for the answers; I should probably add some context.
I know how to perform encoding conversion.
I'm looking for a way to quickly find out if a string can be safely encoded to another encoding or, to put it in another (and quite wrong) way, what is the minimum encoding to support all the characters in that string.
Just converting the strings to a 16-bit encoding is not an option, because they will be sent as SMS messages, and a 16-bit encoding cuts the number of available characters from 160 down to 70.
I need to convert them to a 16-bit encoding only when they contain a special character that is not supported in ISO/IEC 8859-1.
Upvotes: 3
Views: 2257
Reputation: 3413
Unfortunately, Ruby's idea of encoding compatibility does not quite match your use case. However, trying to encode your UTF-8 string as ISO-8859-1 and rescuing the error that is raised when the conversion is not possible will achieve what you are after:
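str = 'your UTF-8 string'
begin
  # Transcode in place; raises if a character has no ISO-8859-1 equivalent.
  str.encode!('ISO-8859-1')
rescue Encoding::UndefinedConversionError
  # Not representable in ISO-8859-1, so str is left as UTF-8.
end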
This will convert your string to ISO-8859-1 if possible and leave it as UTF-8 if not.
Note that this uses encode, which actually transcodes the string using Encoding::Converter (i.e. rewrites the bytes so that the same characters are represented in the target encoding), unlike force_encoding, which just changes the encoding flag (i.e. tells Ruby to interpret the string's existing byte stream according to the set encoding).
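To make the difference concrete, here is a small sketch (the variable names are only illustrative) that compares the bytes produced by encode with those left in place by force_encoding for a single "é":

```ruby
utf8 = "\u00E9" # "é" as a UTF-8 string (two bytes)

# encode rewrites the bytes so the same character exists in the target encoding.
transcoded = utf8.encode('ISO-8859-1')

# force_encoding keeps the bytes and only changes the encoding label.
# dup is used so the original string is not relabeled as well.
relabeled = utf8.dup.force_encoding('ISO-8859-1')

p utf8.bytes.to_a       # [195, 169]
p transcoded.bytes.to_a # [233]
p relabeled.bytes.to_a  # [195, 169]
p relabeled.length      # 2
```

The relabeled string now reads as two ISO-8859-1 characters instead of one, which is why force_encoding is not a conversion.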
Upvotes: 4
Reputation: 80065
Is valid_encoding? (an instance method of String) useful? That is:
try_str = str.dup.force_encoding("ISO-8859-1")
str = try_str if try_str.valid_encoding?
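One caveat worth checking before relying on this: as far as I can tell, every byte sequence counts as valid ISO-8859-1 in Ruby, so valid_encoding? may not tell you whether the characters are actually representable. A minimal sketch, using the euro sign as an example of a character outside ISO-8859-1:

```ruby
euro = "\u20AC" # euro sign, three bytes in UTF-8, absent from ISO-8859-1

# Relabeling the UTF-8 bytes still yields a "valid" ISO-8859-1 string,
# but it now reads as three unrelated characters.
relabeled = euro.dup.force_encoding('ISO-8859-1')
p relabeled.valid_encoding? # true
p relabeled.length          # 3

# Attempting a real conversion does detect the problem.
convertible = begin
  euro.encode('ISO-8859-1')
  true
rescue Encoding::UndefinedConversionError
  false
end
p convertible # false
```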
Upvotes: 2
Reputation: 4193
Ruby's core library includes the Encoding class and the Encoding::Converter class nested inside it. They are probably your best friends in this case.
#!/usr/bin/env ruby
# encoding: utf-8
converter = Encoding::Converter.new("UTF-8", "ISO-8859-1")
converted = converter.convert("é")
puts converted.encoding
# => ISO-8859-1
puts converted.dump
# => "\xE9"
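For the "can this string be converted at all?" part of the question, the same converter can be wrapped in a predicate. This is only a sketch: latin1_convertible? is a hypothetical helper name, and it assumes the input is complete, valid UTF-8.

```ruby
# Hypothetical helper: true when every character of a UTF-8 string
# has an ISO-8859-1 equivalent, false otherwise.
def latin1_convertible?(text)
  # A fresh converter per call avoids carrying state over from a failed conversion.
  Encoding::Converter.new('UTF-8', 'ISO-8859-1').convert(text)
  true
rescue Encoding::UndefinedConversionError
  false
end

p latin1_convertible?("caf\u00E9") # true  ("é" exists in ISO-8859-1)
p latin1_convertible?("5 \u20AC")  # false (the euro sign does not)
```

A string for which this returns false would then be sent in the 16-bit encoding instead.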
Upvotes: 2
Reputation: 11876
"Some String".force_encoding("ISO-8859-1")
You can also refer to the Rails encoding link.
Upvotes: -1
Reputation: 3005
To convert to ISO-8859-1, you can use the code below.
1.9.3p194 :002 > puts "é".force_encoding("ISO-8859-1").encode("UTF-8")
é
=> nil
Upvotes: -1