Yad Smood

Reputation: 2932

How to correct an incorrectly encoded string in Ruby?

For instance, we use a third-party library to parse a file and read its metadata. The method decodes all metadata as UTF-8, even when the metadata is stored in another encoding, so it always returns a UTF-8 encoded string. The library offers no way to get the raw string data so that we could decode it correctly ourselves. Suppose we know the metadata's original encoding is, for example, GBK. Is there a way to correct the UTF-8 encoded string back to GBK?

Upvotes: 0

Views: 275

Answers (2)

Esailija

Reputation: 140210

No, there isn't. Decoding something as UTF-8 that isn't UTF-8 is lossy. That means that by the time you get the string from the lib, you have lost information and can't represent the original data as GBK. Change how the lib works, or change the file metadata to UTF-8.
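
To illustrate the lossy case, here is a minimal sketch. It assumes the lib replaces every invalid byte with U+FFFD while decoding; that behavior is an assumption, since the question does not say what the lib does internally, and String#scrub (Ruby 2.1+) is only a stand-in for it.

```ruby
# GBK bytes for a two-character sample string (written with \u escapes
# so the source file encoding does not matter).
gbk_bytes = "\u4E2D\u6587".encode("GBK")

# Hypothetical stand-in for the lib: treat the bytes as UTF-8 and replace
# every invalid sequence with U+FFFD. This is an assumption, not the lib's
# documented behavior.
decoded = gbk_bytes.dup.force_encoding("UTF-8").scrub("\uFFFD")

# The result is valid UTF-8, but the original GBK bytes are gone:
# only replacement characters remain, so nothing can be converted back.
lossy = decoded.valid_encoding? && decoded.bytes != gbk_bytes.bytes
```

If the string you get back contains U+FFFD characters like this, the original bytes cannot be recovered from it.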

Upvotes: 1

David Grayson

Reputation: 87386

Yes. You should learn about Ruby 1.9's force_encoding and encode methods on the String class. I recommend converting everything to actual UTF-8 as early as possible, before manipulating it in Ruby.
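
A minimal sketch of how those two methods fit together. It assumes the lib only labeled the GBK bytes as UTF-8 and left the bytes themselves untouched; that is an assumption about the lib, and the mislabeled string is simulated here.

```ruby
# Simulate the assumed situation: GBK bytes that carry a UTF-8 label.
mislabeled = "\u4E2D\u6587".encode("GBK").force_encoding("UTF-8")

# The label is wrong, so Ruby reports the string as invalid UTF-8.
broken = !mislabeled.valid_encoding?

# force_encoding changes only the label; the bytes stay the same.
relabeled = mislabeled.dup.force_encoding("GBK")

# encode transcodes the bytes, producing a real UTF-8 string.
utf8 = relabeled.encode("UTF-8")
```

This only works while the original bytes are still intact. A quick check is valid_encoding? on the string returned by the lib: if it is false, the bytes were probably just mislabeled and relabeling may help.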

Upvotes: 1
