I have a string such as:
"MÃ\u0083¼LLER".encoding
#=> #<Encoding:UTF-8>
"MÃ\u0083¼LLER".inspect
#=> "\"MÃ\\u0083¼LLER\""
Is there any way to salvage a string like this? Keep in mind that I no longer have the original data.
Upvotes: 0
Views: 770
Reputation: 54684
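It looks like the string was double-encoded: its UTF-8 bytes were misread as Latin-1 and re-encoded as UTF-8, and that happened twice. Converting from UTF-8 back to Latin-1 twice should undo it. Try this on some of your data and let me know whether it works: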
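require 'iconv'  # bundled with Ruby 1.9 and earlier; a separate gem since Ruby 2.0

# Undo two rounds of "UTF-8 misread as Latin-1" by converting back twice.
def decode(str)
  i = Iconv.new('LATIN1', 'UTF-8')  # arguments are (to, from)
  # After the second pass the bytes are the original UTF-8, so label them as such.
  i.iconv(i.iconv(str)).force_encoding('UTF-8')
end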
decode("MÃ\u0083¼LLER")
#=> "MüLLER"
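If Iconv is not available, roughly the same idea can be sketched with String#encode and String#force_encoding from core Ruby. This is only a sketch, not a drop-in replacement: `undo_mojibake` and `mangle` are hypothetical helper names, and the broken sample is produced by mangling a known-good string twice, on the assumption that your data was damaged the same way.

```ruby
# Hypothetical helper: undo `rounds` layers of
# "UTF-8 bytes misread as Latin-1 and re-encoded as UTF-8".
def undo_mojibake(str, rounds = 2)
  rounds.times do
    # Map each character back to its single Latin-1 byte,
    # then relabel the resulting bytes as UTF-8.
    str = str.encode('ISO-8859-1').force_encoding('UTF-8')
  end
  str
end

# Hypothetical helper: reproduce the damage, to get a sample to test with.
def mangle(str, rounds = 2)
  rounds.times do
    # Pretend the UTF-8 bytes are Latin-1, then re-encode them as UTF-8.
    str = str.dup.force_encoding('ISO-8859-1').encode('UTF-8')
  end
  str
end

broken = mangle("MüLLER")
fixed  = undo_mojibake(broken)

# Applying too many rounds either raises Encoding::UndefinedConversionError
# or leaves invalid UTF-8, so check the result before trusting it.
puts fixed.valid_encoding? ? fixed : "could not repair"
```

Not every string will have been mangled exactly twice, so it is probably worth running this on a copy of the data and checking valid_encoding? before overwriting anything.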
Upvotes: 4