Reputation: 5527
I am currently writing a Rails app on a bleeding-edge stack: Rails 3, RSpec 2, Ruby 1.9.2 and Geokit 1.5.0. When I try to geocode addresses that contain non-ASCII characters, I get this error:
incompatible character encodings: UTF-8 and ASCII-8BIT
The trace looks like this:
1) Spot Basic Validations should calculate lat and lng
Failure/Error: spot = Spot.create!({
incompatible character encodings: UTF-8 and ASCII-8BIT
# /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/geokit-1.5.0/lib/geokit/geocoders.rb:435:in `do_geocode'
# /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/geokit-1.5.0/lib/geokit/geocoders.rb:126:in `geocode'
# ./app/models/spot.rb:26:in `geocode_address'
# /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activesupport-3.0.0.rc/lib/active_support/callbacks.rb:409:in `_run_validation_callbacks'
# /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activemodel-3.0.0.rc/lib/active_model/validations/callbacks.rb:53:in `run_validations!'
# /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activemodel-3.0.0.rc/lib/active_model/validations.rb:168:in `valid?'
# /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activerecord-3.0.0.rc/lib/active_record/validations.rb:55:in `valid?'
# /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activerecord-3.0.0.rc/lib/active_record/validations.rb:75:in `perform_validations'
# /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activerecord-3.0.0.rc/lib/active_record/validations.rb:49:in `save!'
# /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activerecord-3.0.0.rc/lib/active_record/attribute_methods/dirty.rb:30:in `save!'
# /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activerecord-3.0.0.rc/lib/active_record/transactions.rb:242:in `block in save!'
# /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activerecord-3.0.0.rc/lib/active_record/transactions.rb:289:in `block in with_transaction_returning_status'
# /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activerecord-3.0.0.rc/lib/active_record/connection_adapters/abstract/database_statements.rb:139:in `transaction'
# /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activerecord-3.0.0.rc/lib/active_record/transactions.rb:204:in `transaction'
# /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activerecord-3.0.0.rc/lib/active_record/transactions.rb:287:in `with_transaction_returning_status'
# /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activerecord-3.0.0.rc/lib/active_record/transactions.rb:242:in `save!'
# /Users/nilsriedemann/.rvm/gems/ruby-1.9.2-rc2/gems/activerecord-3.0.0.rc/lib/active_record/validations.rb:34:in `create!'
# ./spec/models/spot_spec.rb:13:in `block (2 levels) in <top (required)>'
I put # coding: utf-8 at the top of all related files (specs, factories and model), yet I still get this error when I use an address like "Elsassers Straße 27".
Any hints? I thought Geokit was already compatible with Ruby 1.9.1 and therefore with the new string encoding handling.
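For anyone who wants to reproduce the error outside Geokit, here is a minimal sketch. It assumes (not verified against the Geokit source) that the failure comes from combining a UTF-8 string with a string tagged ASCII-8BIT, such as a raw HTTP response body; the variable names are made up for illustration.

```ruby
# coding: utf-8

utf8   = "Elsassers Straße 27"  # literal tagged as UTF-8
binary = "Straße".b             # same bytes, but tagged ASCII-8BIT (binary)

# Joining two strings that both contain non-ASCII bytes and carry
# different encodings raises Encoding::CompatibilityError.
error = begin
  utf8 + binary
  nil
rescue Encoding::CompatibilityError => e
  e
end

# Re-tagging the binary string as UTF-8 (the bytes are already valid UTF-8)
# makes the two strings compatible again.
retagged = binary.dup.force_encoding(Encoding::UTF_8)
joined   = utf8 + " / " + retagged

puts error.class
puts joined
```

Note that the magic comment only affects string literals in that file; it does not change the encoding of strings that come from elsewhere, which would explain why adding it everywhere did not help.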
Upvotes: 0
Views: 1275
Reputation: 61
Using CGI.escape is not a good idea, as it gives unexpected results. Try "Oslo, Norway" with and without CGI.escape and you'll see what I mean.
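One plausible reading of the "unexpected results" (an assumption, since it depends on the geocoder escaping the address again internally) is double escaping. The sketch below only shows what CGI.escape itself does to the string, once and twice:

```ruby
require 'cgi'

location = "Oslo, Norway"

# First pass: the comma becomes %2C and the space becomes "+".
escaped_once = CGI.escape(location)

# Second pass (e.g. if a geocoder URL-escapes its input again):
# "%" becomes %25 and "+" becomes %2B, so the service no longer
# receives anything resembling the original address.
escaped_twice = CGI.escape(escaped_once)

puts escaped_once   # Oslo%2C+Norway
puts escaped_twice  # Oslo%252C%2BNorway
```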
A better solution is to use Iconv on the location:
require 'iconv'
# Iconv.new takes the target encoding first, then the source encoding
ic = Iconv.new('US-ASCII//IGNORE', 'UTF-8')
ascii_location = ic.iconv(location)
Cheers!
EDIT: Wes Gamble suggested an edit here, which I think is relevant:
Using //IGNORE will remove any non-ASCII characters. But in many (most) cases, you may want to transliterate certain characters such as umlauts (e.g. "Zürich" will become "Zurich") or carons (e.g. "Niš" will become "Nis") in order to successfully geocode them. If you ignore non-ASCII characters, then "Zürich" will become "Zrich" and "Niš" will become "Ni", neither of which will successfully geocode.
For this you want to use
ic = Iconv.new('US-ASCII//TRANSLIT', 'UTF-8')
Note that the conversion will raise an exception if the transliteration cannot be completed, so make sure you handle that.
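Iconv was removed from the standard library in Ruby 2.0, so on newer Rubies the snippets above need the iconv gem. As a rough substitute (a different technique, not Iconv's //TRANSLIT), the sketch below decomposes accented characters with String#unicode_normalize, drops the combining marks, and then encodes to US-ASCII, returning nil when a character cannot be converted. The helper name to_ascii and the "ß" => "ss" mapping are assumptions made for illustration.

```ruby
# coding: utf-8

# Hypothetical helper: best-effort ASCII version of a location string.
# Returns nil if some character has no ASCII equivalent here.
def to_ascii(location)
  location
    .unicode_normalize(:nfd)                        # "ü" -> "u" + combining diaeresis
    .gsub(/\p{Mn}/, "")                             # drop the combining marks
    .encode("US-ASCII", fallback: { "ß" => "ss" })  # explicit mapping for leftovers
rescue Encoding::UndefinedConversionError
  nil                                               # caller decides what to do next
end

puts to_ascii("Zürich")
puts to_ascii("Niš")
puts to_ascii("Elsassers Straße 27")
p    to_ascii("東京")
```

Returning nil instead of letting the exception escape keeps the failure explicit, so the caller can fall back to sending the original UTF-8 string.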
Upvotes: 3
Reputation: 1381
CGI.escape seems to be more accurate than Geokit::Inflector::url_escape.
Here are the results of encoding "Elsassers Straße 27"
>> CGI.escape(address)
=> "Elsassers+Stra%C3%9Fe+27"
While
>> Geokit::Inflector::url_escape(address)
=> "Elsassers+Stra%C3e+27"
The letter ß should be encoded as %C3%9F, i.e. the UTF-8 bytes C3 9F (as per http://www.utf8-chartable.de/unicode-utf8-table.pl); url_escape drops the second byte.
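A quick way to check the expected encoding without the chart is to look at the bytes directly. This is only a sketch using the standard library; the variable names are illustrative.

```ruby
# coding: utf-8
require 'cgi'

address = "Elsassers Straße 27"

# UTF-8 bytes of "ß" in hex: each byte becomes one %XX in a URL.
eszett_bytes = "ß".bytes.map { |b| format("%02X", b) }

# CGI.escape percent-encodes every byte of the character.
escaped = CGI.escape(address)

p eszett_bytes  # ["C3", "9F"]
puts escaped    # Elsassers+Stra%C3%9Fe+27
```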
In addition, the debug statement was blowing up (I knew there was a reason to check whether debug logging is enabled :)
So here is my solution for GoogleGeocoder3; I guess the other geocoders have a similar problem:
require 'cgi'

module Geokit
  module Geocoders
    class GoogleGeocoder3 < Geocoder
      def self.do_geocode(address, options = {})
        bias_str = options[:bias] ? construct_bias_string_from_options(options[:bias]) : ''
        address_str = address.is_a?(GeoLoc) ? address.to_geocodeable_s : address
        # use CGI.escape instead of Geokit::Inflector::url_escape
        url = "http://maps.google.com/maps/api/geocode/json?sensor=false&address=#{CGI.escape(address_str)}#{bias_str}"
        res = self.call_geocoder_service(url)
        return GeoLoc.new if !res.is_a?(Net::HTTPSuccess)
        json = res.body
        # escape the JSON so the log line does not mix incompatible encodings
        logger.debug "Google geocoding. Address: #{address}. Result: #{CGI.escape(json)}"
        return self.json2GeoLoc(json, address)
      end
    end
  end
end
Upvotes: 1
Reputation: 11543
I had the same problem and solved it by wrapping the address in CGI.escape() like this:
geo = Geokit::Geocoders::MultiGeocoder.geocode(CGI.escape(address))
Upvotes: -1
Reputation: 5336
I know this is a very late answer, but I have written a Google geocoder for the Geokit gem that handles all of these incompatibility errors. It uses the newest V3 API of Google's geocoding service. The advantage is that it now parses JSON rather than XML, and paired with the required gem Yajl (a very fast JSON parser for Ruby) it is noticeably quicker. My benchmarks show it is about 1.5 times faster than the old way.
https://github.com/rubymaniac/geokit-gem
Upvotes: 0