Reputation: 56
I'm using Haystack with Elasticsearch for a project, but the scores I get make no sense (to me).
The model I'm trying to index and search looks similar to:
    from django.db import models

    class Car(models.Model):
        name = models.CharField(max_length=255)

    class Color(models.Model):
        car = models.ForeignKey(Car)
        name = models.CharField(max_length=255)
And here is the search index. Even though I'm interested in cars, I index them by color, because I want to display a picture of that specific color:
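    from haystack import indexes

    class CarIndex(indexes.SearchIndex, indexes.Indexable):
        # Catch-all document field that Haystack searches by default
        text = indexes.CharField(document=True)

        def get_model(self):
            # One indexed document per Color, not per Car
            return Color

        def prepare_text(self, obj):
            # Some cleaning
            return " ".join([obj.name, obj.car.name])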
Now I add a car with three colors: a LaFerrari in Red, Black and White. Although there is only one car model, for search purposes there are three cars.
So I check Kibana and I get a normal output.
Then I perform a normal search: LaFerrari
All three entries hold the same information; only the color name in the text field differs. I've even tried removing the color from the text, and guess what I got.
After this fiasco, I tried the Python elasticsearch library and got normal results (indexing and searching manually): all three colors had the same score when I searched for LaFerrari.
Any idea what is going on?
I'm thinking about moving from haystack to plain elasticsearch, any recommendations?
Upvotes: 1
Views: 504
Reputation: 16666
If you want to search more distinctively you should add two more fields to the index, one for the color (e.g. white) and one for the car name (however you name the models and attributes).
The catch-all document field will get you only so far. You would have to make Elasticsearch use a DisMax query and search all configured fields for the given search terms.
https://www.elastic.co/guide/en/elasticsearch/reference/1.7/query-dsl-dis-max-query.html
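To make the DisMax idea concrete, here is a minimal sketch of the request body such a query could use. The field names name and color are assumptions (they stand for whatever the two extra index fields end up being called), and build_dis_max_query is a hypothetical helper, not part of Haystack or of the elasticsearch client. It only builds the dictionary you would pass as the search body.

```python
def build_dis_max_query(terms, fields, tie_breaker=0.0):
    """Build a search body with one match query per field, combined by dis_max.

    dis_max scores a document by its best-matching sub-query; tie_breaker
    controls how much the other matching sub-queries contribute.
    """
    return {
        "query": {
            "dis_max": {
                "tie_breaker": tie_breaker,
                "queries": [{"match": {field: terms}} for field in fields],
            }
        }
    }


# "name" and "color" are assumed field names, used for illustration only.
body = build_dis_max_query("LaFerrari", ["name", "color"])
```

With the Python elasticsearch client, a body like this would typically be passed to the client's search call together with the index name that your Haystack connection is configured to use.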
I've only used SearchQuerySet+Elastic (based on the catch-all field) so far (and custom+Solr a lot). While the SearchQuerySet fits in very nicely with the Django ORM, it has its limits. So you are probably right that you might have to use custom code for querying. I would still recommend Haystack for indexing though (it might be slower, but it is very easy to set up and maintain).
Looking at your example, what you gain with different fields would be:
You search for Laferrari and this is the exact value found in all three documents in the field name (or brand_name). The results will then have the same scores.
Different fields also enable you to use facets: https://www.elastic.co/guide/en/elasticsearch/reference/1.7/search-facets.html#search-facets
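As a rough illustration of the facet request described on that page, the sketch below builds a terms facet body in the Elasticsearch 1.x request format. The field name color is an assumption, and build_terms_facet is a hypothetical helper. Note that facets were deprecated in the 1.x series in favor of aggregations and removed in 2.0, so this only applies to the version the link refers to.

```python
def build_terms_facet(field, size=10):
    """Build a 1.x-style search body that counts documents per term of a field."""
    return {
        "query": {"match_all": {}},
        # The facet is named after the field it counts; any name would do.
        "facets": {field: {"terms": {"field": field, "size": size}}},
    }


# "color" is an assumed field name, used for illustration only.
facet_body = build_terms_facet("color")
```

If the field is analyzed, the counts are per token rather than per original value, so a dedicated non-analyzed field is usually the better facet target.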
Upvotes: 1