Reputation: 8006
So I'm trying to write my own geocode database for the US and Canada, because I need very high speed and no rate limiting. I've got the following algorithm for Rails batch geocoding, but I'm wondering if there is a better way to eager load the initial batch of cities. I've been benchmarking, and I've gotten it down to this algorithm, which gives me 1000 geocodes in about 19 seconds with ~50% coverage.
My question is, would there be a better way to operate instead of re-querying the database when trying to "drill down"?
ids = City.where('lower(name) IN (?)', locations).pluck(:id) # Eager load the only possible results
results.find_each do |r|
  # next if r.location == 'EXACT'
  names = r.location.split(',')
  state = get_state(names)
  # Drill down to the appropriate state
  city = City.where(:id => ids, :state => state[0]).where('lower(name) IN (?)', names).first
  if city.nil?
    # Hail Mary: try again without the state constraint
    city = City.where(:id => ids).where('lower(name) IN (?)', names).first
  end
  if city.blank?
    puts "Oh no! We couldn't find a city for #{r.location}"
  else
    # Finally, the city
    puts "Selected #{city.name} for #{r.location}"
    r.latitude = city.latitude
    r.longitude = city.longitude
    r.save
  end
end
Upvotes: 0
Views: 62
Reputation: 8006
Definitely the best improvement I was able to make, given the sheer volume of cities, was to hit the database only once. Run the .where query a single time, then use
array.select { |x| ... }[0]
to filter the results in memory. This cut my benchmark down by about 3/4 (20 seconds to 4.8 seconds).
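A minimal sketch of the idea, using plain Ruby objects in place of the ActiveRecord City model so the in-memory filtering is visible on its own (the City struct, sample data, and pick_city helper here are illustrative, not from the original code):

```ruby
# Stand-in for the ActiveRecord model; real code would do one City.where query.
City = Struct.new(:name, :state, :latitude, :longitude)

# Loaded once up front, instead of re-querying per record.
cities = [
  City.new('springfield', 'IL', 39.78, -89.65),
  City.new('springfield', 'MA', 42.10, -72.59),
  City.new('portland',    'OR', 45.52, -122.68)
]

def pick_city(cities, names, state)
  # Drill down in memory: prefer a state match, then fall back to name only.
  cities.select { |c| c.state == state && names.include?(c.name) }[0] ||
    cities.select { |c| names.include?(c.name) }[0]
end

city = pick_city(cities, ['springfield'], 'MA')
puts "#{city.name}, #{city.state}" # => springfield, MA
```

The same select-based drill-down replaces the two per-record City.where calls in the question's loop, which is where the round-trip savings come from.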
Upvotes: 1
Reputation: 13354
The only thing I could think of is checking out find_in_batches and increasing your batch size. find_each defaults to a batch size of 1000; I'm guessing you could tune that a bit for performance.
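For illustration, in the real app this would just be results.find_each(batch_size: 5000) { |r| ... }. Below, each_slice stands in for ActiveRecord's batching so the grouping behavior is runnable on its own (the record numbers and slice size are made up):

```ruby
# find_each / find_in_batches fetch rows in chunks (1000 by default);
# a larger batch_size means fewer database round trips per pass.
records = (1..12).to_a
batches = []
records.each_slice(5) { |batch| batches << batch }
puts batches.length # => 3
```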
Upvotes: 1