Manzurul Hoque Rumi

Reputation: 3094

Django_elasticsearch_dsl_drf not returning expected result

I am adding Elasticsearch to one of my Django apps. Below are my code snippets.

documents.py

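from django_elasticsearch_dsl import Document, Index, fields
from elasticsearch_dsl import analyzer

from .models import Ad, Category  # assumed location of the models

ads_index = Index("ads_index")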
ads_index.settings(
    number_of_shards=1,
    number_of_replicas=0
)

html_strip = analyzer(
    'html_strip',
    tokenizer="standard",
    filter=["standard", "lowercase", "stop", "snowball"],
    char_filter=["html_strip"]
)


@ads_index.doc_type
class AdDocument(Document):
    id = fields.IntegerField(attr='id')

    title = fields.TextField(
        analyzer=html_strip,
        fields={
            'title': fields.TextField(analyzer='keyword'),
        }
    )

    description = fields.TextField(
        analyzer=html_strip,
        fields={
            'description': fields.TextField(analyzer='keyword'),
        }
    )

    category = fields.ObjectField(
        properties={
            'title': fields.TextField(),
        }
    )

    class Django:
        model = Ad  # The model associated with this Document

        # The fields of the model you want to be indexed in Elasticsearch
        fields = [
            'price',
            'created_at',
        ]

        related_models = [Category]

    def get_queryset(self):
        return super().get_queryset().select_related('category')

    def get_instances_from_related(self, related_instance):
        if isinstance(related_instance, Category):
            return related_instance.ad_set.all()

serializer

from django_elasticsearch_dsl_drf.serializers import DocumentSerializer


class AdDocumentSerializer(DocumentSerializer):
    class Meta:
        document = AdDocument
        fields = (
            "id",
            "title",
            "description",
            "price",
            "created_at",
        )

viewset

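from django_elasticsearch_dsl_drf.constants import (
    LOOKUP_FILTER_RANGE,
    LOOKUP_QUERY_GT,
    LOOKUP_QUERY_GTE,
    LOOKUP_QUERY_IN,
    LOOKUP_QUERY_LT,
    LOOKUP_QUERY_LTE,
)
from django_elasticsearch_dsl_drf.filter_backends import (
    CompoundSearchFilterBackend,
    DefaultOrderingFilterBackend,
    FilteringFilterBackend,
    SuggesterFilterBackend,
)
from django_elasticsearch_dsl_drf.viewsets import DocumentViewSet


class AdViewSet(DocumentViewSet):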
    document = AdDocument
    serializer_class = AdDocumentSerializer
    ordering = ('id',)
    lookup_field = 'id'

    filter_backends = [
        DefaultOrderingFilterBackend,
        FilteringFilterBackend,
        CompoundSearchFilterBackend,
        SuggesterFilterBackend,
    ]

    search_fields = (
        'title',
        'description',
    )

    filter_fields = {
        'id': {
            'field': 'id',
            'lookups': [
                LOOKUP_FILTER_RANGE,
                LOOKUP_QUERY_IN,
                LOOKUP_QUERY_GT,
                LOOKUP_QUERY_GTE,
                LOOKUP_QUERY_LT,
                LOOKUP_QUERY_LTE,
            ],
        },
        'title': 'title.raw',
        'description': 'description.raw',
    }

    ordering_fields = {
        'id': 'id',
    }

Below is the data I have:

Data

When I hit http://127.0.0.1:8000/ads/search/?search=Tit it returns nothing, but when I hit http://127.0.0.1:8000/ads/search/?search=a it returns one result.

What's wrong here with my code? Any help would be appreciated :-)

Upvotes: 0

Views: 976

Answers (1)

Lupanoide

Reputation: 3212

With the keyword analyzer, the entire input string is written to the inverted index as a single term, so you can only search by exact match. I use the elasticsearch library in Python and don't know elasticsearch-dsl very well, so I will answer with plain Elasticsearch configuration; you will then need to work out how to express that configuration with the elasticsearch-dsl library in Python.

If you also want to match partial strings inside words, index them with the edge n-gram token filter (see the Elasticsearch documentation for the edge n-gram token filter). With this method you can also offer autocompletion in a search bar, because every prefix of each word becomes searchable.

You should also set a specific search_analyzer, because you don't want the query string itself to be split into edge n-grams (see the Elasticsearch documentation on the search_analyzer mapping parameter for an explanation).
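To make the difference concrete, here is a small pure-Python sketch of the idea. It does not talk to Elasticsearch; `standard_tokens` is only a rough stand-in for the standard tokenizer plus the lowercase filter, and the sample title is a hypothetical value, not the asker's data.

```python
import re


def standard_tokens(text):
    """Rough stand-in for the standard tokenizer plus the lowercase filter."""
    return [word.lower() for word in re.findall(r"[^\W_]+", text)]


def edge_ngrams(token, min_gram=1, max_gram=20):
    """Prefixes of a token, the way an edge_ngram token filter emits them."""
    longest = min(len(token), max_gram)
    return [token[:size] for size in range(min_gram, longest + 1)]


def index_terms(text, use_edge_ngrams):
    """Terms that end up in the inverted index for one field value."""
    terms = set()
    for token in standard_tokens(text):
        terms.update(edge_ngrams(token) if use_edge_ngrams else [token])
    return terms


def matches(query, text, use_edge_ngrams):
    """True if any query token, analyzed WITHOUT n-grams, is an indexed term."""
    terms = index_terms(text, use_edge_ngrams)
    return any(token in terms for token in standard_tokens(query))


sample_title = "Title of the first ad"  # hypothetical sample value

print(edge_ngrams("title"))  # ['t', 'ti', 'tit', 'titl', 'title']
print(matches("Tit", sample_title, use_edge_ngrams=False))  # False
print(matches("Tit", sample_title, use_edge_ngrams=True))   # True
```

The query side is analyzed without n-grams on purpose: this mirrors the role of search_analyzer, so that "Tit" is looked up as the single term "tit" rather than as "t", "ti" and "tit". The index configuration that follows sets this up in Elasticsearch itself.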

{
    "settings": {
        "number_of_shards": 1,
        "analysis": {
            "filter": {
                "autocomplete_filter": {
                    "type": "edge_ngram",
                    "min_gram": 1,
                    "max_gram": 20
                }
            },
            "analyzer": {
                "autocomplete": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": [
                        "lowercase",
                        "autocomplete_filter"
                    ]
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "title": {
                "type": "text",
                "analyzer": "autocomplete",
                "search_analyzer": "standard"
            },
            "description": {
                "type": "text",
                "analyzer": "autocomplete",
                "search_analyzer": "standard"
            }
        }
    }
}
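The answer leaves the elasticsearch-dsl translation to the reader. A minimal sketch of how the edge n-gram configuration might be expressed with elasticsearch-dsl is shown here; it assumes the elasticsearch_dsl package is installed and has not been run against the asker's project.

```python
from elasticsearch_dsl import Text, analyzer, token_filter

# Custom token filter corresponding to "autocomplete_filter" in the JSON settings.
autocomplete_filter = token_filter(
    "autocomplete_filter",
    type="edge_ngram",
    min_gram=1,
    max_gram=20,
)

# Index-time analyzer: standard tokenizer, lowercase, then edge n-grams.
autocomplete = analyzer(
    "autocomplete",
    tokenizer="standard",
    filter=["lowercase", autocomplete_filter],
)

# Field definition with a separate search-time analyzer, so that the query
# string is not split into edge n-grams.
title = Text(analyzer=autocomplete, search_analyzer="standard")

# Assumption: inside AdDocument the same keyword arguments would be passed to
# django_elasticsearch_dsl's fields.TextField, for example
#   title = fields.TextField(analyzer=autocomplete, search_analyzer="standard")
```

Because an analyzer change affects how existing documents were indexed, the index would most likely need to be rebuilt afterwards, for example with django-elasticsearch-dsl's `python manage.py search_index --rebuild` command.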

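If this solution does not suit you because you also want to match partial strings in the middle of a word, such as querying for itl to retrieve title, define a custom analyzer around the ngram tokenizer instead (see the Elasticsearch documentation for the n-gram tokenizer). On Elasticsearch 7 and later, the index setting index.max_ngram_diff (default 1) caps the difference between max_gram and min_gram, so a range of 1 to 20 requires raising that setting to at least 19. The configuration looks like this: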

{
  "settings": {
    "index": {
      "max_ngram_diff": 19
    },
    "analysis": {
      "analyzer": {
        "autocomplete": {
          "tokenizer": "my_tokenizer",
          "filter": ["lowercase"]
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "ngram",
          "min_gram": 1,
          "max_gram": 20,
          "token_chars": [
            "letter",
            "digit"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "autocomplete",
        "search_analyzer": "standard"
      },
      "description": {
        "type": "text",
        "analyzer": "autocomplete",
        "search_analyzer": "standard"
      }
    }
  }
}
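As a rough illustration of what the ngram tokenizer adds over edge n-grams, the following pure-Python sketch enumerates the substrings it would index. It is a simplification that does not call Elasticsearch: words are split on anything that is not a letter or digit, and the sample title is a hypothetical value.

```python
import re


def ngrams(token, min_gram=1, max_gram=20):
    """Every substring of token whose length lies between min_gram and max_gram."""
    grams = []
    for start in range(len(token)):
        longest = min(max_gram, len(token) - start)
        for size in range(min_gram, longest + 1):
            grams.append(token[start:start + size])
    return grams


def ngram_index_terms(text):
    """Lowercased n-grams of every letter/digit run in the text."""
    words = re.findall(r"[^\W_]+", text)
    return {gram.lower() for word in words for gram in ngrams(word)}


sample_title = "Title of the first ad"  # hypothetical sample value

print(ngrams("title", min_gram=2, max_gram=3))
# ['ti', 'tit', 'it', 'itl', 'tl', 'tle', 'le']
print("itl" in ngram_index_terms(sample_title))  # True
print(len(ngrams("title")))  # 15 terms, versus 5 prefixes with edge n-grams
```

The last line hints at the trade-off: full n-grams make mid-word matches possible, but they index many more terms per word than edge n-grams, so the index grows accordingly.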

Upvotes: 1
