user419017

Convert unicode mess to correct characters in Ruby?

I have a string such as:

"MÃ\u0083¼LLER".encoding
#<Encoding:UTF-8>   

"MÃ\u0083¼LLER".inspect    
"\"MÃ\\u0083¼LLER\""  

Is there any way to salvage a string like this? Keep in mind that I do not have the original data.

Upvotes: 0

Views: 770

Answers (1)

Patrick Oscity

Reputation: 54684

It looks like the string was mis-converted twice: its UTF-8 bytes were read as Latin-1 and re-encoded as UTF-8, two times over. The code below reverses both rounds. Try it on some of your data and let me know if it works:

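require 'iconv' # removed from the standard library in Ruby 2.0; install the iconv gem there

# Reverse two rounds of the UTF-8 -> Latin-1 mix-up.
def decode(str)
  i = Iconv.new('LATIN1', 'UTF-8') # Iconv.new(to, from)
  i.iconv(i.iconv(str)).force_encoding('UTF-8')
end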

decode("MÃ\u0083¼LLER")
#=> "MüLLER"
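
On Ruby 2.0 and later, where iconv is no longer in the standard library, the same idea can probably be expressed with String#encode and String#force_encoding. This is only a minimal sketch and not part of the original answer: undo_mojibake and its rounds parameter are hypothetical names, and it assumes every garbling step was a plain "UTF-8 bytes read as ISO-8859-1" mistake (not Windows-1252). The sample input is built by garbling a known-good string twice rather than typed by hand.

```ruby
# Hypothetical helper (not from the answer): undo `rounds` layers of
# "UTF-8 bytes misread as Latin-1 and re-encoded as UTF-8".
def undo_mojibake(str, rounds = 2)
  rounds.times do
    # Map each character back to its single Latin-1 byte, then
    # reinterpret the resulting bytes as UTF-8.
    str = str.encode('ISO-8859-1').force_encoding('UTF-8')
  end
  str
end

# Build a doubly garbled sample from a known-good string ("MüLLER").
garbled = "M\u00FCLLER"
2.times { garbled = garbled.dup.force_encoding('ISO-8859-1').encode('UTF-8') }

fixed = undo_mojibake(garbled)
puts fixed                  # prints MüLLER
puts fixed.valid_encoding?  # prints true
```

If the input contains a character outside Latin-1, encode raises Encoding::UndefinedConversionError, and if an intermediate result is not valid UTF-8, the next round raises Encoding::InvalidByteSequenceError. Either one suggests the string was damaged in some other way, so it is worth checking valid_encoding? on the result before trusting it.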

Upvotes: 4
