AnonymousMe

Reputation: 569

More Like This query in Elasticsearch

I'm trying to do content-based recommendation on Amazon product data. The data is stored in an Elasticsearch index named 'amazon_products'.

I followed https://elasticsearch-dsl.readthedocs.io/en/latest/search_dsl.html#more-like-this-query to run a More Like This (MLT) query with the Elasticsearch Python client, but the query returns no hits.

Here is my code:

import os

os.environ['PYSPARK_SUBMIT_ARGS'] = '--packages org.elasticsearch:elasticsearch-hadoop:7.7.1 pyspark-shell'

from elasticsearch import Elasticsearch
es = Elasticsearch()

from elasticsearch_dsl import Search
from elasticsearch_dsl.query import MoreLikeThis

dsl_search = Search(index='amazon_products').using(es)

input_product = 'Ginger'

content_dsl = dsl_search.query(MoreLikeThis(like=input_product, fields=['brand']))

response = content_dsl.execute()
print(response)

for hit in response:
    print(hit)

The response is empty ({}) even though 'Ginger' appears in the 'brand' field. Why does this happen?

After creating the amazon_products index and indexing the data into it, I can run ordinary search queries like this:

es.search(index="amazon_products", q="main_category:Refrigerators", size=3)

These work fine and return the expected results.

However, I don't understand why MLT doesn't work on my data. How can I fix this?

Is there another way to run an MLT query in Python?

Upvotes: 0

Views: 643

Answers (1)

shade27

Reputation: 75

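Add these two parameters and it should work. By default, MLT ignores terms that occur fewer than 2 times in the input (min_term_freq) or in fewer than 5 documents in the index (min_doc_freq), so a short input such as 'Ginger' can leave the query with no terms to match.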

  • min_term_freq = 1
  • min_doc_freq = 1

For example:

s = s.query(MoreLikeThis(like={"_id": 3006}, fields=['title'], min_term_freq=1, min_doc_freq=1))
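The same two parameters can also be passed in a raw query body, which is one other way to run MLT from Python. The following is only a sketch applied to the question's data: the helper names build_mlt_body and run_mlt are hypothetical, the index and field names are taken from the question, and run_mlt assumes a 7.x-style client whose search() accepts a body= argument.

```python
def build_mlt_body(like_text, fields, min_term_freq=1, min_doc_freq=1, size=10):
    """Build a raw More Like This request body as a plain dict (hypothetical helper)."""
    return {
        "size": size,
        "query": {
            "more_like_this": {
                "fields": list(fields),
                "like": like_text,
                # Lowered from the defaults (2 and 5) so that short inputs
                # and rare terms are not dropped from the query.
                "min_term_freq": min_term_freq,
                "min_doc_freq": min_doc_freq,
            }
        },
    }


def run_mlt(es, index, body):
    """Send the body with an existing client; assumes a 7.x-style `body=` argument."""
    response = es.search(index=index, body=body)
    return response["hits"]["hits"]


# Field and index names below come from the question.
body = build_mlt_body("Ginger", ["brand"])
print(body)
# With a running cluster and a client `es`:
# hits = run_mlt(es, "amazon_products", body)
```

Building the body separately from sending it makes it easy to print the query and check that both thresholds are actually present before it reaches the cluster.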

Upvotes: 0
