pisapapiros

Reputation: 375

Wagtail Elasticsearch Highlighting

I've implemented a search input on my Wagtail site. It perfectly finds the entries matching my query.

Model:

class BasePage(Page):
    ...

    body = StreamField(...)

    search_fields = Page.search_fields + [
        index.SearchField('body')
    ]

View:

    if search_query:
        search_results = Page.objects.live().search(search_query)
        Query.get(search_query).add_hit()

Template:

{% for result in search_results %}
    <li>
        <h2><a href="{% pageurl result %}">{{ result }}</a></h2>
        {% if result.search_description %}
            {{ result.search_description|safe }}
        {% endif %}
    </li>
{% endfor %}

I don't know how to show a short preview of the matched text. I think that's what Elasticsearch highlighting is for, but I can't find a way to implement it with Wagtail.

Upvotes: 2

Views: 891

Answers (1)

nikitko43

Reputation: 89

You can create your own search backend by subclassing Wagtail's default Elasticsearch backend:

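# Module paths assume Wagtail 2.x (wagtail.search); adjust them for your version.
from wagtail.search.backends.elasticsearch2 import (
    Elasticsearch2SearchBackend, Elasticsearch2SearchResults)
from wagtail.search.backends.elasticsearch5 import (
    Elasticsearch5AutocompleteQueryCompiler, Elasticsearch5Index,
    Elasticsearch5Mapping, Elasticsearch5SearchQueryCompiler)

class Elasticsearch5SearchBackend(Elasticsearch2SearchBackend):
    mapping_class = Elasticsearch5Mapping
    index_class = Elasticsearch5Index
    query_compiler_class = Elasticsearch5SearchQueryCompiler
    autocomplete_query_compiler_class = Elasticsearch5AutocompleteQueryCompiler
    # Custom results class (shown further down); define it above this class in the module
    results_class = ElasticsearchResults

SearchBackend = Elasticsearch5SearchBackend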

Set it as the default backend in settings.py:

WAGTAILSEARCH_BACKENDS = {
    'default': {
        'BACKEND': 'apps.search.backend'
    }
}
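
A custom backend that talks to Elasticsearch will usually still need to be told where the cluster is. As a rough sketch, assuming a single local node at http://localhost:9200 and an index called wagtail (both placeholders, not taken from the answer), the entry might be extended with the URLS, INDEX and TIMEOUT options that Wagtail's Elasticsearch backends accept:

```python
# settings.py (sketch): the URL, index name and timeout are placeholder values.
WAGTAILSEARCH_BACKENDS = {
    'default': {
        # Dotted path to the module that defines SearchBackend (from the answer)
        'BACKEND': 'apps.search.backend',
        # Assumed: one local Elasticsearch node
        'URLS': ['http://localhost:9200'],
        # Assumed: name of the index Wagtail should write to
        'INDEX': 'wagtail',
        # Assumed: request timeout in seconds
        'TIMEOUT': 5,
    }
}
```

After changing the backend you would normally rebuild the index with the update_index management command.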

Then subclass the results class, Elasticsearch2SearchResults, so that it adds a highlight section to the query sent to Elasticsearch and parses the highlights out of the response:

class ElasticsearchResults(Elasticsearch2SearchResults):
    fields_param_name = 'stored_fields'

    def _get_es_body(self, for_count=False):
        body = {
            'query': self.query_compiler.get_query(),
        }

        if not for_count:
            body['highlight'] = {
                "pre_tags": ["<span class='search-highlight'>"],
                "post_tags": ["</span>"],
                "fields": {
                    "*": {"require_field_match": False}
                }
            }
            sort = self.query_compiler.get_sort()

            if sort is not None:
                body['sort'] = sort

        return body

    def _get_results_from_hits(self, hits):
        """
        Yields Django model instances from a page of hits returned by Elasticsearch
        """
        # Get pks from results
        pks = [hit['fields']['pk'][0] for hit in hits]
        scores = {str(hit['fields']['pk'][0]): hit['_score'] for hit in hits}
        highlights = {}

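        for hit in hits:
            # 'highlight' is absent when Elasticsearch returned no fragment for a hit
            parts = dict(hit.get('highlight', {}))
            parts.pop('content_type', None)
            # Take the first fragment of the first highlighted field, if there is one
            show_part = list(parts.values())[0][0] if parts else None
            highlights[str(hit['fields']['pk'][0])] = show_part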

        # Initialise results dictionary
        results = {str(pk): None for pk in pks}

        # Find objects in database and add them to dict
        for obj in self.query_compiler.queryset.filter(pk__in=pks):
            results[str(obj.pk)] = obj

            if self._score_field:
                setattr(obj, self._score_field, scores.get(str(obj.pk)))
            setattr(obj, '_highlight', highlights.get(str(obj.pk)))

        # Yield results in order given by Elasticsearch
        for pk in pks:
            result = results[str(pk)]
            if result:
                yield result

Here I take the first highlight fragment Elasticsearch suggests for each hit and attach it to the result object as _highlight. It can then be read from each item of the search results.
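
One practical catch: Django templates refuse to look up attributes whose names start with an underscore, so {{ result._highlight }} cannot be used directly. The following is only a sketch of one way around that. The helper names first_fragment and expose_highlights are hypothetical, and the sample hit is hand-written to mimic the shape the code above reads (fields.pk plus an optional highlight mapping).

```python
from types import SimpleNamespace


def first_fragment(hit):
    """Return the first highlight fragment of an Elasticsearch hit, or None."""
    # Copy the mapping so the caller's hit is not mutated; 'highlight' may be absent.
    parts = dict(hit.get("highlight", {}))
    # Ignore matches on the content_type field, as the answer does.
    parts.pop("content_type", None)
    for fragments in parts.values():
        if fragments:
            return fragments[0]
    return None


def expose_highlights(results):
    """Copy obj._highlight to obj.highlight so templates can read it."""
    exposed = []
    for obj in results:
        obj.highlight = getattr(obj, "_highlight", None)
        exposed.append(obj)
    return exposed


# Hand-written sample data, not real Elasticsearch output.
sample_hit = {
    "fields": {"pk": ["7"]},
    "highlight": {
        "content_type": ["home_basepage"],
        "body": ["a <span class='search-highlight'>match</span> here"],
    },
}
fragment = first_fragment(sample_hit)

# Stand-in for a page object returned by the results class.
page = SimpleNamespace(pk=7, _highlight=fragment)
pages = expose_highlights([page])
```

In the view, something like search_results = expose_highlights(search_results) would then let the template render {{ result.highlight|safe }}. Note that |safe trusts whatever text was indexed, so it is only appropriate when the page content itself is trusted.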

Upvotes: 1
