Reputation: 15632
I have a problem with location queries returning erroneous results in ElasticSearch.
In our system, a business search engine, every search takes two inputs: a location, and a query-string, e.g.
q=sushi
location=Greenwich Village, New York, New York
I want the search to show me sushi in Greenwich Village first, then sushi outside of Greenwich Village, but to never show me non-sushi results.
The problem is, because of the location
query, anything in Greenwich Village gets matched -- lawyers, doctors, whatever. I'd like say the following to ElasticSearch:
If q matches, then location doesn't have to (it's OK to return sushi outside of Greenwich Village), but if location matches, don't return it unless q matches also (not OK to return non-sushi businesses in Greenwich Village).
Anyone have any thoughts on how to do this?
Upvotes: 0
Views: 207
Reputation: 15632
I incorporated the information on this post and here to to come up with the following solution.
Index every 'place' (neighborhood, city, etc) with a center-point, and also index the coordinates of every business.
Index the place ids attached to the businesses that contain them.
Use a sub-search to convert the text entered into the location bar to a place record.
Use a CustomScoreQuery to modify every result's score by the following formula, which was worked out by trial and error:
new_score = old_score / (1 + distance_between_place_centerpoint_and_result)^3
Also query the place id that results from 3 against the place_ids field as a 'should' boolean query. This gives a flat boost to everything that actually falls within the confines of the specified place.
A side effect of this strategy is that businesses near the center point of the place are considered more relevant -- it is arguable, in my opinion, whether this is correct or not. But other than that it has worked quite well.
Thanks to imitov for his insight that helped me come up with this solution.
Upvotes: 0
Reputation: 30163
It sounds like you want to search for "sushi" (you don't want non-sushi results) but sort your results by location (you want Greenwich Village results first).
If you are storing locations as geo points, you can simply use distance to sort your results.
If location is just a field, and you can only know if the business is inside or outside of a location, you can use Custom Filters Score query to boost relevancy of the results in the desired location. The query
part should contain the search for "sushi" and the filters
part should contain the search for location.
Upvotes: 2