Nahn

Reputation: 3256

Django Haystack - Filter by substring of a field using SearchQuerySet()

I have a Django project that uses SOLR for indexing.

I'm trying to do a substring search using Haystack's SearchQuerySet class.

For example, when a user searches for the term "ear", it should return the entry that has a field with the value: "Search". As you can see, "ear" is a SUBSTRING of "Search". (obviously :))

In other words, in a perfect Django world I would like something like:

SearchQuerySet().all().filter(some_field__contains_substring='ear')

In the haystack documentation for SearchQuerySet (https://django-haystack.readthedocs.org/en/latest/searchqueryset_api.html#field-lookups), it says that only the following FIELD LOOKUP types are supported:

I tried using __contains, but it behaves exactly like __exact, which looks up the exact word (the whole word) in a sentence, not a substring of a word.
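To make the difference concrete, here is a plain-Python sketch of whole-word matching versus the substring matching I'm after. No Haystack or Solr is involved, and the function names are made up for illustration:

```python
def matches_whole_word(text, term):
    """True if term equals one of the whitespace-separated words in text."""
    return term.lower() in text.lower().split()


def matches_substring(text, term):
    """True if term occurs anywhere inside text, even in the middle of a word."""
    return term.lower() in text.lower()


value = "Search"
print(matches_whole_word(value, "ear"))     # whole-word lookup: no match
print(matches_substring(value, "ear"))      # substring lookup: match
print(matches_whole_word(value, "search"))  # whole-word lookup: match
```

The first call mirrors what I see from __contains and __exact; the second is the behaviour I want.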

I am confused, because such functionality is pretty basic, and I'm not sure whether I'm missing something or whether there is another way to approach this problem (using a regex or something?).

Thanks

Upvotes: 7

Views: 4452

Answers (2)

shredding

Reputation: 5601

It's a bug in haystack.

As you said, __exact is implemented exactly like __contains, and therefore this functionality does not exist out of the box in haystack.

The fix is awaiting merge here: https://github.com/django-haystack/django-haystack/issues/1041

Until a fixed release is available, you can work around it like this:

from haystack.inputs import BaseInput, Clean
from haystack.query import SearchQuerySet


class CustomContain(BaseInput):
    """
    An input type for making wildcard matches.
    """
    input_type_name = 'custom_contain'

    def prepare(self, query_obj):
        query_string = super(CustomContain, self).prepare(query_obj)
        query_string = query_obj.clean(query_string)

        exact_bits = [Clean(bit).prepare(query_obj) for bit in query_string.split(' ') if bit]
        query_string = u' '.join(exact_bits)

        return u'*{}*'.format(query_string)

# Usage:
SearchQuerySet().filter(content=CustomContain('searchcontentgoeshere'))
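To see what string this input type hands to the backend, here is a simplified stand-in for the prepare() method that runs without Haystack. It is only a sketch: the helper name wrap_in_wildcards is hypothetical, and the backend's clean() escaping step is deliberately left out.

```python
def wrap_in_wildcards(query_string):
    """Drop empty bits from extra spaces and wrap the result in wildcards.

    Simplified stand-in for CustomContain.prepare(); the escaping done by
    the backend's clean() is NOT reproduced here.
    """
    bits = [bit for bit in query_string.split(' ') if bit]
    return u'*{}*'.format(u' '.join(bits))


print(wrap_in_wildcards('ear'))            # *ear*
print(wrap_in_wildcards('  foo   bar '))   # *foo bar*
```

Note that the wildcards land only at the outer ends of the joined string, so a multi-word query is not wrapped word by word.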

Upvotes: 2

Aamir Rind

Reputation: 39689

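That could be done using an EdgeNgramField, which matches from the start of each word (for matches in the middle of a word, such as "ear" in "Search", NgramField is the analogous field type):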

some_field = indexes.EdgeNgramField() # also prepare value for this field or use model_attr

Then for partial match:

SearchQuerySet().all().filter(some_field='ear')
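As a rough picture of what n-gram indexing does, here is a plain-Python sketch of edge n-grams versus full n-grams. The function names and gram sizes are illustrative assumptions, not Haystack's or Solr's actual implementation:

```python
def edge_ngrams(word, min_size=2, max_size=15):
    """Prefixes of word, from min_size up to max_size characters."""
    upper = min(max_size, len(word))
    return [word[:n] for n in range(min_size, upper + 1)]


def ngrams(word, min_size=3, max_size=15):
    """Every contiguous slice of word between min_size and max_size characters."""
    upper = min(max_size, len(word))
    return [
        word[start:start + n]
        for n in range(min_size, upper + 1)
        for start in range(len(word) - n + 1)
    ]


word = "search"
print(edge_ngrams(word))           # ['se', 'sea', 'sear', 'searc', 'search']
print("ear" in edge_ngrams(word))  # False
print("ear" in ngrams(word))       # True
```

A query term matches only if it is one of the indexed grams, so it is worth checking which grams your schema actually produces for the field.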

Upvotes: 6
