gtalarico

Reputation: 4699

Elastic Search Suggestions Return Zero Results

I'm trying to set up Elasticsearch using the elasticsearch_dsl Python library. I have been able to set up the index and I can search using the .filter() method, but I cannot get the .suggest() method to work.

I am trying to use the completion mapping type and the suggest query method, since this will be used for an autocomplete field (as recommended in Elastic's docs).

I am new to elastic, so I am guessing I am missing something. Any guidance will be greatly appreciated!

What I have done so far

I did not find a tutorial that had exactly what I wanted, but I read through the documentation on ElasticSearch.com and elasticsearch_dsl, and looked at some examples here and here.

PS: I am using Searchbox Elasticsearch on Heroku

Index / Mappings Setup:

# imports [...]

edge_ngram_analyzer = analyzer(
    'edge_ngram_analyzer',
    type='custom',
    tokenizer='standard',
    filter=[
        'lowercase',
        token_filter(
            'edge_ngram_filter', type='edgeNGram',
            min_gram=1, max_gram=20
        )
    ]
)

class DocumentIndex(ElasticDocument):
    title = Text()
    title_suggest = Completion(
        analyzer=edge_ngram_analyzer,
        )
    class Index:
        name = 'documents-index'

# [...] Initialize index
# [...] Upload Documents (5,000 documents)
# DocumentIndex.init()
# [DocumentIndex(**doc).save() for doc in mydocs]

Mappings Output:

This is the mapping as shown in the web console:

 {
  "documents-index": {
    "mappings": {
      "doc": {
        "properties": {
          "title": {
            "type": "text"
          },
          "title_suggest": {
            "type": "completion",
            "analyzer": "edge_ngram_analyzer",
            "search_analyzer": "standard",
            "preserve_separators": true,
            "preserve_position_increments": true,
            "max_input_length": 50
          }
        }
      }
    }
  }
}

Attempting to Search

Verify Index exists:

>>> search = Search(index='documents-index')
>>> search.count()  # Returns correct amount of documents
5000
>>> [doc for doc in search.scan()][:3]
[<Hit(documents-index/doc/1): ...} ...

Test Search - Works:

>>> query = search.filter('match', title='class')
>>> result = query.execute()
>>> result.hits
<Response: [<Hit(documents-in [ ... ]
>>> len(result.hits)
10
>>> query.to_dict()  # see query payload
{ 
  "query":{
    "bool":{
      "filter":[
        {
          "match":{
            "title":"class"
          }
        }
      ]
    }
  }
}

The part that fails

I cannot get any of the .suggest() methods to work. Note: I am following the official library docs.

Test Suggest:

>>> query = search.suggest(
        'title-suggestions',
        'class',
        completion={
        'field': 'title_suggest',
        'fuzzy': True
        })
>>> query.execute()
<Response: {}>
>>> query.to_dict() # see query payload
{
  "suggest": {
    "title-suggestions": {
      "text": "class",
      "completion": { "field": "title_suggest" }
    }
  }
}

I also tried the code below, along with many other query types and values, but the results were similar. (Note: with .filter() I always get the expected result.)

>>> query = search.suggest(
        'title-suggestions',
        'class',
         term=dict(field='title'))
>>> query.to_dict() # see query payload
{
  "suggest": {
    "title-suggestions": { 
        "text": "class", 
        "term": { 
            "field": "title" 
        } 
    }
  }
}
>>> query.execute()
<Response: {}>

Update

Per Honza's suggestion, I changed the title_suggest mapping to a plain Completion field with no custom analyzers. I also deleted the index and reindexed from scratch.

class DocumentIndex(ElasticDocument):
    title = Text()
    title_suggest = Completion()
    class Index:
        name = 'documents-index'

Unfortunately, the problem remains. Here are some more tests:

Verify title_suggest is being indexed properly

>>> search = Search(index='documents-index')
>>> search.index('documents-index').count()
23369
>>> [d for d in search.scan()][0].title
'AnalyticalGrid Property'
>>> [d for d in search.scan()][0].title_suggest
'AnalyticalGrid Property'

Tried searching again:

>>> len(search.filter('term', title='class').execute().hits)
10
>>> search.filter('term', title_suggest='Class').execute().hits
[]
>>> search.suggest('suggestions', 'class', completion={'field': 'title_suggest'}).execute().hits
[]

Verify Mapping:

>>> pprint(index.get_mapping())
{
  "documents-index": {
    "mappings": {
      "doc": {
        "properties": {
          "title": { "type": "text" },
          "title_suggest": {
            "analyzer": "simple",
            "max_input_length": 50,
            "preserve_position_increments": True,
            "preserve_separators": True,
            "type": "completion"
          }
        }
      }
    }
  }
}

Upvotes: 0

Views: 1410

Answers (2)

gtalarico

Reputation: 4699

I wanted to formalize the solution, which Honza provided in a comment on another answer.

The problem was not the mapping, but simply the fact that results from the .suggest() method are not returned under hits.

The suggestions are visible in the dictionary returned by response.to_dict():

>>> response = query.execute()
>>> print(response)
<Response: {}>
>>> response.to_dict()
# output is
# {'query': {},
# 'suggest': {'title-suggestions': {'completion': {'field': 'title_suggest'},
# [...]

I have also found additional details on this github issue:

HonzaKral commented 27 days ago

The Response object provides access to any and all fields that have been returned by elasticsearch. For convenience there is a shortcut that allow to iterate over the hits as that is both most common and also easy to do. For other parts of the response, like aggregations or suggestions, you need to access them explicitly like response.suggest.foo.options.
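
To make the point above concrete, here is a minimal sketch that works on a plain dictionary, such as the one response.to_dict() returns. The sample payload is hand-written for illustration (the titles and scores are invented, and the layout assumes the usual suggester response shape of one entry per input text, each with an options list). The suggestion_texts helper is hypothetical and not part of elasticsearch_dsl.

```python
# Hand-written sample shaped like a suggest-only search response.
# The option texts and scores are made up for illustration.
sample_response = {
    "hits": {"total": 0, "hits": []},
    "suggest": {
        "title-suggestions": [
            {
                "text": "class",
                "offset": 0,
                "length": 5,
                "options": [
                    {"text": "Class Property", "_score": 1.0},
                    {"text": "ClassFilter Methods", "_score": 1.0},
                ],
            }
        ]
    },
}


def suggestion_texts(response_dict, name):
    """Collect the option texts for one named suggester (hypothetical helper)."""
    entries = response_dict.get("suggest", {}).get(name, [])
    return [
        option["text"]
        for entry in entries
        for option in entry.get("options", [])
    ]


# The hits list is empty even though suggestions came back.
print(sample_response["hits"]["hits"])
# The suggestions live under the "suggest" key, keyed by suggester name.
print(suggestion_texts(sample_response, "title-suggestions"))
```

With the library's Response object, the same data should be reachable through item access, for example response.suggest['title-suggestions'][0].options. Item access is needed here because the hyphen in the suggester name rules out attribute access.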

Upvotes: 0

Honza Král

Reputation: 3022

For completion fields you do not want to use ngram analyzers. The completion field automatically indexes all prefixes and is optimized for prefix queries, so you are doing the work twice and confusing the system. Start with an empty completion field and go from there.
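
As a rough illustration of the duplicated work, the sketch below models what an edge n-gram token filter emits for a single token. It is a simplified pure-Python model, not the actual Elasticsearch or Lucene implementation. The edge_ngrams function is a hypothetical name, with defaults that mirror the min_gram=1, max_gram=20 settings from the question.

```python
def edge_ngrams(token, min_gram=1, max_gram=20):
    """Return the leading substrings an edge n-gram filter would emit for one token.

    Simplified model for illustration only.
    """
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


# Every prefix of the token becomes its own indexed term.
print(edge_ngrams("class"))
```

Each token is expanded into all of its prefixes at index time. That is the same prefix handling the completion field already provides, which is why the extra analyzer is redundant here.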

Upvotes: 2
