Reputation: 2932
For instance, we use a third-party lib to parse a file and read its metadata. The lib's method always decodes the metadata as UTF-8, even when the metadata is stored in another encoding, and returns a UTF-8 string. The lib offers no method that returns the raw bytes so that we could decode them correctly ourselves. Suppose we know that the metadata was originally encoded in, for example, GBK. Is there a way to turn the UTF-8 string back into the correct GBK text?
Upvotes: 0
Views: 275
Reputation: 140210
No, there isn't. Decoding something as UTF-8 that isn't UTF-8 is lossy. That means that by the time you get the string from the lib, information has already been lost and you can't reconstruct the original GBK data. Change how the lib works, or change the file metadata to UTF-8.
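To illustrate the lossy case, here is a minimal sketch. It assumes the lib replaces every invalid byte with U+FFFD while decoding; `lossy_utf8_decode` and `gbk_sample` are hypothetical stand-ins, not the lib's real API.

```ruby
# "中文" encoded as GBK is the four bytes D6 D0 CE C4.
def gbk_sample
  "\xD6\xD0\xCE\xC4".b
end

# Hypothetical stand-in for the lib: treat the raw bytes as UTF-8 and
# replace every invalid byte sequence with U+FFFD (assumed behaviour).
def lossy_utf8_decode(raw_bytes)
  raw_bytes.dup.force_encoding(Encoding::UTF_8).scrub("\uFFFD")
end

decoded = lossy_utf8_decode(gbk_sample)
puts decoded.valid_encoding?            # is the result well-formed UTF-8?
puts decoded.include?("\uFFFD")         # does it contain replacement characters?
puts decoded.bytes == gbk_sample.bytes  # did the original bytes survive?
```

Once the original bytes have been swapped for replacement characters, nothing in the string tells you what they were, so no later re-encoding can bring the GBK text back.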
Upvotes: 1
Reputation: 87386
Yes. You should learn about the force_encoding and encode methods on Ruby 1.9's String class. I recommend converting everything to real UTF-8 as early as possible, before manipulating it in Ruby.
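A minimal sketch of that approach follows. It only works under the assumption that the lib handed back the original GBK bytes untouched and merely tagged them as UTF-8; `repair_mislabeled_gbk` is a hypothetical helper name.

```ruby
# Assumption: the string still holds the original GBK bytes and is only
# labelled as UTF-8 (no bytes were replaced or dropped by the lib).
def repair_mislabeled_gbk(mislabeled)
  # force_encoding changes only the encoding tag, not the bytes;
  # encode then transcodes those GBK bytes into real UTF-8.
  mislabeled.dup.force_encoding(Encoding::GBK).encode(Encoding::UTF_8)
end

# The GBK bytes for "中文", wrongly tagged as UTF-8.
mislabeled = "\xD6\xD0\xCE\xC4".dup.force_encoding(Encoding::UTF_8)
puts mislabeled.valid_encoding?   # an invalid result here hints at mislabelled bytes

fixed = repair_mislabeled_gbk(mislabeled)
puts fixed.encoding
puts fixed
```

If the string you get from the lib already reports a valid encoding and contains replacement characters, the bytes were probably altered during decoding and this relabelling will not help.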
Upvotes: 1