tompave

Reputation: 12402

Ruby Strings, get list of compatible encodings

How can I get a list of compatible encodings for a Ruby String? (MRI 1.9.3)

Use case: I have some user-provided strings, encoded in UTF-8. Ideally I need to convert them to ISO/IEC 8859-1 (8-bit), but I also need to fall back to Unicode when some special characters are present.

Also, is there a better way to accomplish this? Maybe I am testing the wrong thing.


EDIT- adding more details
Thanks for the answers, I should probably add some context. I know how to perform encoding conversion.
I'm looking for a way to quickly find out if a string can be safely encoded to another encoding or, to put it in another (and quite wrong) way, what is the minimum encoding to support all the characters in that string.

Just converting the strings to a 16-bit encoding is not an option, because they will be sent as SMS messages, and a 16-bit encoding cuts the number of available characters from 160 down to 70.

I need to convert them to 16-bit only when they contain a special character that is not supported in ISO/IEC 8859-1.

Upvotes: 3

Views: 2257

Answers (5)

kopischke

Reputation: 3413

Unfortunately, Ruby’s idea of encoding compatibility does not quite match your use case. However, trying to encode your UTF-8 string in ISO-8859-1 and catching the error that is raised when the conversion is not possible will achieve what you are after:

begin
  'your UTF-8 string'.encode!('ISO-8859-1')
rescue Encoding::UndefinedConversionError
end

will convert your string to ISO-8859-1 if possible and leave it as UTF-8 if not.
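
If the original string should stay untouched, the same idea can be wrapped in a predicate. What follows is only a minimal sketch, not something taken from Ruby itself: the method names `latin1_encodable?` and `sms_payload` are hypothetical, and UTF-16BE is merely an assumed stand-in for the 16-bit fallback mentioned in the question.

```ruby
# Hypothetical helper: true when every character of +str+ has an
# ISO-8859-1 equivalent. String#encode (without the bang) returns a new
# string, so +str+ itself is left unchanged.
def latin1_encodable?(str)
  str.encode('ISO-8859-1')
  true
rescue Encoding::UndefinedConversionError, Encoding::InvalidByteSequenceError
  false
end

# Hypothetical helper: transcode to ISO-8859-1 when possible, otherwise
# fall back to a 16-bit encoding (UTF-16BE is an assumption made here).
def sms_payload(str)
  latin1_encodable?(str) ? str.encode('ISO-8859-1') : str.encode('UTF-16BE')
end

short = sms_payload("caf\u00E9")     # every character exists in ISO-8859-1
wide  = sms_payload("\u4F60\u597D")  # CJK characters force the fallback

puts short.encoding
puts wide.encoding
```

The predicate transcodes the string once just to test it, which should be acceptable for SMS-sized input; for long strings it may be preferable to keep the result of the first `encode` call instead of encoding twice.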

Note that this uses encode, which actually transcodes the string using Encoding::Converter (i.e. it rewrites the bytes so that the same characters are represented in the target encoding), unlike force_encoding, which only changes the encoding flag (i.e. tells Ruby to interpret the string’s existing bytes according to the given encoding).
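
The difference between the two methods is easiest to see in the bytes. A small sketch (the variable names are just for illustration):

```ruby
utf8 = "\u00E9" # "é" stored as UTF-8: bytes C3 A9

# encode transcodes: the bytes are rewritten for the target encoding.
transcoded = utf8.encode('ISO-8859-1')
p transcoded.bytes.to_a  # => [233]

# force_encoding only relabels the existing bytes. It modifies its
# receiver in place, hence the dup.
relabelled = utf8.dup.force_encoding('ISO-8859-1')
p relabelled.bytes.to_a  # => [195, 169]
p relabelled.length      # => 2
```

After force_encoding the two UTF-8 bytes are read as two separate ISO-8859-1 characters, which is why it cannot be used to test whether a string fits the target encoding.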

Upvotes: 4

steenslag

Reputation: 80065

Is valid_encoding? (instance method of String) useful? That is:

# Ruby's name for this encoding is "ISO-8859-1"; dup because force_encoding modifies its receiver
try_str = str.dup.force_encoding("ISO-8859-1")
str = try_str if try_str.valid_encoding?

Upvotes: 2

Oto Brglez

Reputation: 4193

Ruby ships with the Encoding class and the Encoding::Converter class nested inside it; they are probably your best friends in this case.

#!/usr/bin/env ruby
# encoding: utf-8

converter = Encoding::Converter.new("UTF-8", "ISO-8859-1")
converted = converter.convert("é")

puts converted.encoding
# => ISO-8859-1

puts converted.dump
# => "\xE9"
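
The example above covers a character that exists in ISO-8859-1. As a hedged sketch of the other case, this is roughly what happens when the converter meets a character with no ISO-8859-1 equivalent (the variable names `representable` and `offending` are only illustrative):

```ruby
converter = Encoding::Converter.new("UTF-8", "ISO-8859-1")

begin
  # The euro sign (U+20AC) is not part of ISO-8859-1.
  converter.convert("\u20AC")
  representable = true
rescue Encoding::UndefinedConversionError => e
  representable = false
  offending = e.error_char # the character that could not be converted
end

puts representable
puts offending.dump
```

Rescuing this error is what makes it possible to decide whether to fall back to a wider encoding.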

Upvotes: 2

Sachin R

Reputation: 11876

"Some String".force_encoding("ISO-8859-1")

You can also refer to the Rails encoding link.

Upvotes: -1

Nishant

Reputation: 3005

To convert to ISO-8859-1, you can use code like the following.

1.9.3p194 :002 > puts "é".force_encoding("ISO-8859-1").encode("UTF-8")
é
 => nil 

Linked Answer

Upvotes: -1
