Reputation: 14736
I have to load some data from external sources. When I look at the encoding, Ruby tells me ASCII-8BIT, i.e. binary. However, some of the sources are encoded in ISO-8859-1 and some of them are in UTF-8. When I try to convert the ISO-8859-1 encoded stuff to UTF-8, I get an error. But when I do something like content.force_encoding('ISO-8859-1').encode('UTF-8'), everything works fine.
However, this doesn't work the other way round. When I try to encode the UTF-8 data to ISO, it ends up with broken characters.
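For reference, the behaviour described above can be reproduced with a couple of bytes. This is only a minimal sketch, not the actual data from the question: the sample bytes (0xE4 for "ä" in ISO-8859-1, 0xC3 0xA4 for "ä" in UTF-8) and all variable names are assumptions made for illustration. The last two cases show one way broken characters can arise, which may or may not be what happens with the real sources.

```ruby
# Illustrative bytes only: 0xE4 is "ä" in ISO-8859-1, 0xC3 0xA4 is "ä" in UTF-8.
latin1_bytes = "\xE4".b       # String#b returns a copy tagged ASCII-8BIT
utf8_bytes   = "\xC3\xA4".b

# Transcoding straight from ASCII-8BIT fails for bytes above 0x7F,
# because a binary string carries no character mapping.
direct_error =
  begin
    latin1_bytes.encode('UTF-8')
    nil
  rescue Encoding::UndefinedConversionError => e
    e.class
  end

# force_encoding only relabels the bytes (on a copy here); encode then transcodes.
from_latin1 = latin1_bytes.dup.force_encoding('ISO-8859-1').encode('UTF-8')

# The same call on bytes that are already UTF-8 raises nothing but yields
# mojibake, since every byte is also a valid ISO-8859-1 character.
mojibake = utf8_bytes.dup.force_encoding('ISO-8859-1').encode('UTF-8')

# For input that is already UTF-8, relabelling alone is enough.
from_utf8 = utf8_bytes.dup.force_encoding('UTF-8')
```

The point of the sketch is that force_encoding never changes bytes, so it is only correct when the label matches what the bytes really are.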
So, is there a way to detect the "underlying" encoding of the ASCII-8BIT data, and then convert it to UTF-8?
Upvotes: 1
Views: 585
Reputation: 2450
I had a quick google and found the Charlock Holmes gem by Brian Lopez. It looks like it does the detection process you're after.
https://github.com/brianmario/charlock_holmes
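If only the two encodings named in the question are expected, a cruder check may be possible with the standard library alone. The sketch below is not how Charlock Holmes works (the gem relies on ICU's character set detection); it simply tests whether the bytes are valid UTF-8 and otherwise assumes ISO-8859-1. The helper name to_utf8 and its fallback parameter are hypothetical, and the commented gem calls are taken from the gem's README rather than tested here.

```ruby
# Hypothetical helper: returns a UTF-8 copy of a binary (ASCII-8BIT) string.
# Assumption: the input is either UTF-8 or the fallback encoding.
def to_utf8(content, fallback: 'ISO-8859-1')
  candidate = content.dup.force_encoding('UTF-8')
  return candidate if candidate.valid_encoding?

  # Not valid UTF-8, so relabel as the fallback encoding and transcode.
  content.dup.force_encoding(fallback).encode('UTF-8')
end

# With the charlock_holmes gem installed, the detection step could instead
# follow the calls shown in the gem's README:
#   detection = CharlockHolmes::EncodingDetector.detect(content)
#   utf8 = CharlockHolmes::Converter.convert(content, detection[:encoding], 'UTF-8')

to_utf8("caf\xC3\xA9".b)  # bytes are already valid UTF-8, only relabelled
to_utf8("caf\xE9".b)      # invalid as UTF-8, so treated as ISO-8859-1
```

The heuristic can misclassify ISO-8859-1 text whose bytes happen to form valid UTF-8 sequences, and it cannot tell ISO-8859-1 apart from other single-byte encodings, so a real detector is the safer choice when the set of sources is not known in advance.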
Upvotes: 1