OXp1845

Reputation: 493

Find location from text

I am trying to work out how to find a location from a piece of text, such as a blog post, without the user having to input any additional information. For example, a post could look like this:


"Aberdeen, With a Foot on the Seafloor

Since the early 1970s, Aberdeen, Scotland, has evolved from a gritty fishing town into the world’s center of innovation in technology for the offshore energy industry."


By reading it I can tell that the post is about Aberdeen, Scotland, but how can I geotag it automatically? I have been using the geocoder gem (https://github.com/alexreisner/geocoder) by Alex Reisner, but it seems wasteful to check every word against Google or Nominatim (OSM). My initial idea was to simply brute-force it: run every word through the geocoder and see whether the results agree with each other. But it seems like there should be a better way to do this.

Has anyone done anything similar to this? Any algorithm that could be suggested (or gem :) ) would be immensely appreciated!

Upvotes: 0

Views: 278

Answers (1)

Christian Stewart

Reputation: 15519

I'm sure there have been projects dedicated to this; consider, for example, Google's uncanny ability to geotag and pick data out of your personal emails effortlessly.

The most obvious answer I can see here would be to create a few regular expressions for locations. The simplest one would be for City, Country:

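# Two words of at least two letters each, separated by a comma and whitespace.
# (A bare "." in place of the comma would match almost any two adjacent words.)
Regexp.new("([a-z]{2,}),\\s+([a-z]{2,})", Regexp::IGNORECASE)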

This would recognize "Aberdeen, Scotland", but also false positives such as "course, it" or even "thanks, bye". It would be a start, though: you would query the geocoder only for those recognized spots instead of for every word in the document.
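
As a rough sketch of that idea, the snippet below collects "Word, Word" candidates, ranks them by how often they occur, and passes them one by one to a lookup. Everything here is an assumption rather than part of the geocoder gem: the names PLACE_PAIR, location_candidates and first_location are made up for illustration, the capital-letter requirement is an extra filter to cut false positives, and the lookup is a plain callable so the code runs without any network access.

```ruby
# Matches "Word, Word" pairs such as "Aberdeen, Scotland".
# Requiring a capital first letter is an added assumption, not part of
# the case-insensitive pattern above.
PLACE_PAIR = /\b([A-Z][a-z]+),\s+([A-Z][a-z]+)\b/

# Returns the unique candidates, most frequent first.
def location_candidates(text)
  counts = Hash.new(0)
  text.scan(PLACE_PAIR) { |city, country| counts["#{city}, #{country}"] += 1 }
  counts.sort_by { |_, n| -n }.map(&:first)
end

# Tries each candidate with the supplied lookup and returns the first hit
# as [candidate, result], or nil when nothing resolves.
# `lookup` is a hypothetical callable; with the geocoder gem it could be
#   ->(q) { Geocoder.search(q).first&.coordinates }
def first_location(text, lookup)
  location_candidates(text).each do |candidate|
    result = lookup.call(candidate)
    return [candidate, result] if result
  end
  nil
end
```

Note that the sample post would also yield "Aberdeen, With" from its title. The expectation is that the geocoder returns nothing useful for such a pair, so the loop simply moves on to the next candidate. This keeps the number of geocoder queries proportional to the number of matched pairs rather than the number of words.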

There are also widely known regular expressions for addresses, cities, etc. You could use those as well if you find your algorithm missing matches.

Cheers!

Upvotes: 1
