Reputation: 3345
We encounter the error from the title on some of our newer virtual machines, while other machines remain unaffected. We wonder why this happens and, furthermore, how to get rid of it.
The two main differences are as follows:
vm_old:
debian squeeze
ruby1.9.2p0
vm_new:
debian wheezy
ruby1.9.2p320 (over rvm)
There are naturally more differences between the VMs, but I don't know which of them would affect this behavior.
We have a response containing umlauts in some of our controllers (e.g. '{"message": "ü"}'), and we have set # encoding: utf-8
Within the spec we test the response against a fixed string containing this umlaut:
it 'should test something' do
  get :some_controller, format: :json
  response.status.should == 200
  json = ActiveSupport::JSON.decode(response.body)
  json["message"].should == 'ü' # breaks on this line
  # ... some more tests
end
The substitute for ü seems to be a random four-character string. On occasion this string happens to be valid UTF-8 and can be transferred. We then get a failed spec instead of the error message in the title, since the random string is not the same as ü.
The spec file itself also has # encoding: utf-8 on the first line.
We tried playing with the locale and with force_encoding('utf-8'), but neither helped.
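As background on why force_encoding alone may change nothing: it only relabels the bytes a string already holds, whereas encode actually converts them. The sketch below is not taken from the application; the Latin-1 input byte is a made-up example chosen purely for illustration.

```ruby
# Hypothetical input: the single byte 0xFC, which is "ü" in ISO-8859-1.
latin1 = [0xFC].pack('C').force_encoding('ISO-8859-1')

# force_encoding changes only the encoding label; the byte stays 0xFC,
# which is not a valid UTF-8 sequence on its own.
relabeled = latin1.dup.force_encoding('UTF-8')
puts relabeled.valid_encoding?     # false
puts relabeled.bytes.inspect       # [252]

# encode transcodes the content, producing the two-byte UTF-8 form of "ü".
transcoded = latin1.encode('UTF-8')
puts transcoded.valid_encoding?    # true
puts transcoded.bytes.inspect      # [195, 188]
```

If the bytes are already wrong before they reach force_encoding, relabeling them cannot repair them, which would be consistent with what was observed here.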
The question now becomes: has anyone else encountered a problem like this, and how can it be solved?
Edit: it turns out the substituted string does not always start with P\.
Edit 2:
Further testing showed that the problem lies in the JSON decoding.
The controller response is something like "{\"foo\": \"\u00fc\"}"; decoding that produces random output where the sequence \u00fc used to be.
For simple reproduction:
bundle exec rails c
> ActiveSupport::JSON.decode(ActiveSupport::JSON.encode({:foo => "ü"}))
The Rails version is 3.0.4.
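For comparison, here is a minimal sketch of what a correct decode of the \u00fc escape should yield. It deliberately uses Ruby's standard json library rather than ActiveSupport::JSON (a substitution made only so the snippet runs without Rails), and the payload is modeled on the controller response quoted above.

```ruby
require 'json'

# Single quotes keep the backslash-u escape literal, so the JSON parser
# sees the six characters \u00fc, just as in the controller response.
payload = '{"foo": "\u00fc"}'

decoded = JSON.parse(payload)
value = decoded['foo']

# A correct decoder turns the escape into "ü" encoded as valid UTF-8.
puts value.encoding            # UTF-8
puts value.valid_encoding?     # true
puts value.bytes.inspect       # [195, 188]
puts value == "\u00fc"         # true
```

Inspecting the bytes of the decoded value in the same way on the affected machine may help show whether the decoder's output differs from these two bytes.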
Edit 3: Changing the JSON backend of ActiveSupport to Yaml seems to be a valid workaround.
Upvotes: 1
Views: 1443
Reputation: 3253
I'm not certain if this will be of help to you, but I figured I'd toss it out there. For me, adding this code:
.encode('UTF-16le', :invalid => :replace, :replace => '').encode('UTF-8')
totally saved me. Essentially, it transcodes your UTF-8 string to UTF-16, dropping any invalid byte sequences along the way, and then transcodes it back to UTF-8. More information is available here.
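To illustrate what that round trip does, here is a small sketch on a made-up string; the input bytes are an assumption for demonstration and are not taken from the question.

```ruby
# Hypothetical input: "f", a stray 0xFC byte (invalid as UTF-8), then "r".
raw = [0x66, 0xFC, 0x72].pack('C*').force_encoding('UTF-8')
puts raw.valid_encoding?       # false

# Transcode to UTF-16LE, replacing invalid sequences with an empty string,
# then transcode back to UTF-8.
cleaned = raw.encode('UTF-16le', :invalid => :replace, :replace => '')
             .encode('UTF-8')
puts cleaned                   # fr
puts cleaned.valid_encoding?   # true

# A string that is already valid UTF-8 passes through unchanged.
untouched = "f\u00fcr".encode('UTF-16le', :invalid => :replace, :replace => '')
                      .encode('UTF-8')
puts untouched == "f\u00fcr"   # true
```

Note that the invalid bytes are discarded rather than repaired, so this avoids the exception but would not bring back a lost ü.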
Upvotes: 1