coredumperror

Reputation: 9100

How do you filter search results in Wagtail based on a ManyToManyField?

I have a Wagtail site which defines an Event model. These Events have multiple Event Sponsors, which are associated by a ManyToManyField on the EventSponsor model:

class Event(index.Indexed, ClusterableModel):

    title       = models.CharField(max_length=255)
    start_date  = models.DateTimeField()
    end_date    = models.DateTimeField(null=True, blank=True)
    description = RichTextField(blank=True)

    search_fields = [
        index.SearchField('title', partial_match=True, boost=2.0),
        index.SearchField('description'),
        index.RelatedFields('sponsors', [
            index.SearchField('name', partial_match=True)
        ]),

        index.FilterField('end_date'),
        index.FilterField('sponsors'),
    ]

class EventSponsor(index.Indexed, models.Model):

    sponsor_id = models.IntegerField()
    name = models.CharField(max_length=255)
    url = models.URLField(blank=True)

    events = models.ManyToManyField(Event, related_name='sponsors')

    search_fields = [
        index.SearchField('name', partial_match=True),
    ]

In addition to this, different Sites on my Wagtail server include Events in their calendar based on a set of selected Event Sponsors specific to that site.

So building the calendar listing queryset for each site looks like this:

def get_events_for_current_site(request, listing):
    try:
        event_sponsor_settings = EventSponsorSettings.objects.get(site=request.site)
    except EventSponsorSettings.DoesNotExist:
        # If there's no EventSponsorSettings for this Site, return an empty QuerySet. This shouldn't really ever happen.
        return Event.objects.none()

    # Return the selected Events, ordered by start date (ascending for upcoming, descending otherwise).
    query = Event.objects.filter(sponsors__in=event_sponsor_settings.selected_event_sponsors)
    if listing == 'upcoming_events':
        return query.order_by('start_date').filter(end_date__gte=timezone.now())
    else:
        return query.order_by('-start_date').filter(end_date__lt=timezone.now())

event_sponsor_settings.selected_event_sponsors is a list of EventSponsor objects. This queryset works just fine for the listing pages.

I need the search functionality (using the Elasticsearch backend) on each Site to include only the Events which would appear on the current Site's calendar. So I want my base queryset to be the same one used by the calendar pages (or to at least do the same filtering). So my Event search code basically calls:

backend.search(search_query, get_events_for_current_site())

However, I've run into two problems:

1) If I use index.FilterField('sponsors') in Event.search_fields, I get this error when I run manage.py update_index:

Traceback (most recent call last):
  File "./manage.py", line 33, in <module>
    execute_from_command_line(argv)
  File "/multitenant-ve/lib/python2.7/site-packages/django/core/management/__init__.py", line 353, in execute_from_command_line
    utility.execute()
  File "/multitenant-ve/lib/python2.7/site-packages/django/core/management/__init__.py", line 345, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/multitenant-ve/lib/python2.7/site-packages/django/core/management/base.py", line 348, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/multitenant-ve/lib/python2.7/site-packages/django/core/management/base.py", line 399, in execute
    output = self.handle(*args, **options)
  File "/multitenant-ve/src/wagtail/wagtail/wagtailsearch/management/commands/update_index.py", line 120, in handle
    self.update_backend(backend_name, schema_only=options.get('schema_only', False))
  File "/multitenant-ve/src/wagtail/wagtail/wagtailsearch/management/commands/update_index.py", line 77, in update_backend
    index.add_model(model)
  File "/multitenant-ve/src/wagtail/wagtail/wagtailsearch/backends/elasticsearch.py", line 536, in add_model
    index=self.name, doc_type=mapping.get_document_type(), body=mapping.get_mapping()
  File "/multitenant-ve/src/wagtail/wagtail/wagtailsearch/backends/elasticsearch.py", line 137, in get_mapping
    self.get_field_mapping(field) for field in self.model.get_search_fields()
  File "/multitenant-ve/src/wagtail/wagtail/wagtailsearch/backends/elasticsearch.py", line 137, in <genexpr>
    self.get_field_mapping(field) for field in self.model.get_search_fields()
  File "/multitenant-ve/src/wagtail/wagtail/wagtailsearch/backends/elasticsearch.py", line 119, in get_field_mapping
    return self.get_field_column_name(field), mapping
  File "/multitenant-ve/src/wagtail/wagtail/wagtailsearch/backends/elasticsearch.py", line 72, in get_field_column_name
    return field.get_attname(self.model) + '_filter'
  File "/multitenant-ve/src/wagtail/wagtail/wagtailsearch/index.py", line 178, in get_attname
    return field.attname
AttributeError: 'ManyToManyRel' object has no attribute 'attname'

2) If I take out index.FilterField('sponsors'), manage.py update_index works, but I get an error when I search:

Cannot filter search results with field "eventsponsor_id". Please add index.FilterField('eventsponsor_id') to Event.search_fields.

So I tried adding index.FilterField('eventsponsor_id'), but it gives this warning during update_index: Event.search_fields contains field 'eventsponsor_id' but it doesn't exist, and causes this traceback at search time:

Traceback:
File "/multitenant-ve/lib/python2.7/site-packages/django/core/handlers/base.py" in get_response
  174.                     response = self.process_exception_by_middleware(e, request)
File "/multitenant-ve/lib/python2.7/site-packages/django/core/handlers/base.py" in get_response
  172.                     response = response.render()
File "/multitenant-ve/lib/python2.7/site-packages/django/template/response.py" in render
  160.             self.content = self.rendered_content
File "/multitenant-ve/lib/python2.7/site-packages/django/template/response.py" in rendered_content
  137.         content = template.render(context, self._request)
File "/multitenant-ve/lib/python2.7/site-packages/django/template/backends/django.py" in render
  95.             return self.template.render(context)
File "/multitenant-ve/lib/python2.7/site-packages/django/template/base.py" in render
  206.                     return self._render(context)
File "/multitenant-ve/lib/python2.7/site-packages/django/template/base.py" in _render
  197.         return self.nodelist.render(context)
File "/multitenant-ve/lib/python2.7/site-packages/django/template/base.py" in render
  992.                 bit = node.render_annotated(context)
File "/multitenant-ve/lib/python2.7/site-packages/django/template/base.py" in render_annotated
  959.             return self.render(context)
File "/multitenant-ve/lib/python2.7/site-packages/django/template/loader_tags.py" in render
  173.         return compiled_parent._render(context)
File "/multitenant-ve/lib/python2.7/site-packages/django/template/base.py" in _render
  197.         return self.nodelist.render(context)
File "/multitenant-ve/lib/python2.7/site-packages/django/template/base.py" in render
  992.                 bit = node.render_annotated(context)
File "/multitenant-ve/lib/python2.7/site-packages/django/template/base.py" in render_annotated
  959.             return self.render(context)
File "/multitenant-ve/lib/python2.7/site-packages/django/template/loader_tags.py" in render
  173.         return compiled_parent._render(context)
File "/multitenant-ve/lib/python2.7/site-packages/django/template/base.py" in _render
  197.         return self.nodelist.render(context)
File "/multitenant-ve/lib/python2.7/site-packages/django/template/base.py" in render
  992.                 bit = node.render_annotated(context)
File "/multitenant-ve/lib/python2.7/site-packages/django/template/base.py" in render_annotated
  959.             return self.render(context)
File "/multitenant-ve/lib/python2.7/site-packages/django/template/loader_tags.py" in render
  69.                 result = block.nodelist.render(context)
File "/multitenant-ve/lib/python2.7/site-packages/django/template/base.py" in render
  992.                 bit = node.render_annotated(context)
File "/multitenant-ve/lib/python2.7/site-packages/django/template/base.py" in render_annotated
  959.             return self.render(context)
File "/multitenant-ve/lib/python2.7/site-packages/django/template/defaulttags.py" in render
  220.                     nodelist.append(node.render_annotated(context))
File "/multitenant-ve/lib/python2.7/site-packages/django/template/base.py" in render_annotated
  959.             return self.render(context)
File "/multitenant-ve/lib/python2.7/site-packages/django/template/defaulttags.py" in render
  325.             if match:
File "/multitenant-ve/src/wagtail/wagtail/wagtailsearch/backends/base.py" in __len__
  174.         return len(self.results())
File "/multitenant-ve/src/wagtail/wagtail/wagtailsearch/backends/base.py" in results
  137.             self._results_cache = self._do_search()
File "/multitenant-ve/src/wagtail/wagtail/wagtailsearch/backends/elasticsearch.py" in _do_search
  452.         hits = self.backend.es.search(**params)
File "/multitenant-ve/lib/python2.7/site-packages/elasticsearch/client/utils.py" in _wrapped
  69.             return func(*args, params=params, **kwargs)
File "/multitenant-ve/lib/python2.7/site-packages/elasticsearch/client/__init__.py" in search
  531.             doc_type, '_search'), params=params, body=body)
File "/multitenant-ve/lib/python2.7/site-packages/elasticsearch/transport.py" in perform_request
  273.             body = self.serializer.dumps(body)
File "/multitenant-ve/lib/python2.7/site-packages/elasticsearch/serializer.py" in dumps
  47.             raise SerializationError(data, e)

Exception Type: SerializationError at /search
Exception Value: ({u'query': {u'filtered': {u'filter': {u'and': [{u'prefix': {u'content_type': u'event'}}, {'and': [{u'terms': {u'eventsponsor_id_filter': [<EventSponsor: Division of Geological and Planetary Sciences (9003)>]}}, {u'range': {u'end_date_filter': {'gte': datetime.datetime(2017, 3, 29, 0, 42, 7, 462939, tzinfo=<UTC>)}}}]}]}, u'query': {u'multi_match': {u'query': u'geo', u'fields': [u'_all', u'_partials']}}}}}, TypeError("Unable to serialize <EventSponsor: Division of Geological and Planetary Sciences (9003)> (type: <class 'templated_cms.models.events.EventSponsor'>)",))

So, I tried changing the queryset in get_events_for_current_site() to Event.objects.filter(sponsors__id__in=[s.id for s in event_sponsor_settings.selected_event_sponsors])

This fixes the error... but I get no search results at all.
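For what it's worth, the SerializationError itself is easy to reproduce outside Wagtail: the query body ends up containing a model instance, and a JSON encoder has no idea what to do with it. Here's a minimal sketch using only the standard library. FakeSponsor, terms_filter and can_serialize are hypothetical stand-ins I made up for illustration, not Wagtail or elasticsearch-py code:

```python
import json


class FakeSponsor:
    """Hypothetical stand-in for an EventSponsor model instance."""

    def __init__(self, pk, name):
        self.id = pk
        self.name = name


sponsors = [FakeSponsor(9003, "Division of Geological and Planetary Sciences")]


def terms_filter(values):
    # Mirrors the shape of the 'terms' clause shown in the traceback above.
    return {"terms": {"eventsponsor_id_filter": values}}


def can_serialize(body):
    # json.dumps raises TypeError for objects it cannot encode.
    try:
        json.dumps(body)
        return True
    except TypeError:
        return False


# Passing the objects themselves fails; passing plain integer ids works.
with_objects = can_serialize(terms_filter(sponsors))
with_ids = can_serialize(terms_filter([s.id for s in sponsors]))
```

That explains why switching to a list of ids makes the exception go away, though not why the search then comes back empty.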

I'm entirely stumped on how to deal with this. :(

Upvotes: 2

Views: 3927

Answers (3)

ThijsG2P

Reputation: 11

I had a similar issue with combining text search and ManyToManyField filters in a form.

The way I solved it:

  1. Run the Elasticsearch search on the model.
  2. Convert the results (DatabaseSearchResults) to a queryset.
  3. Apply the ManyToMany filters to that queryset, using the data from the form.

e.g.,

results = MyModel.search(search_terms, fields=['title', 'body'], operator='or')
qs = results.get_queryset()

m2m_objects = self.cleaned_data.get('m2m_field')
qs = qs.filter(m2m_field__in=m2m_objects)

Upvotes: 1

coredumperror

Reputation: 9100

For those who run across this question in the future, here's what I ultimately did to solve this (the code has been cut down to the bare minimum for displaying the mechanism I used):

class Event(index.Indexed, ClusterableModel):

    title       = models.CharField(max_length=255)
    start_date  = models.DateTimeField()
    end_date    = models.DateTimeField(null=True, blank=True)
    description = RichTextField(blank=True)
    lecture_series = models.ForeignKey(
        'this_app.LectureSeries', null=True, blank=True, related_name='events', 
        on_delete=models.SET_NULL
    )

    search_fields = [
        ...
        # We use a FilterField on lecture_series here because we apparently can't do it
        # on lecture_series_id for whatever reason. This means we need to filter Events
        # on their lecture_series directly on all querysets that will get used as a 
        # search filter.
        index.FilterField('lecture_series'),
        # We can't filter directly on a ManyToMany relationship, so we need to be a bit
        # creative. This uses the sponsor_id() method defined below to add our 
        # EventSponsors' sponsor_ids to the search index.
        index.FilterField('sponsor_id'),
    ]

    def sponsor_id(self):
        """
        Adds all of our EventSponsors' sponsor_ids to the search filter list.
        """
        return list(self.sponsors.all().values_list('sponsor_id', flat=True))


class EventSponsor(index.Indexed, models.Model):

    sponsor_id = models.IntegerField()
    name = models.CharField(max_length=255)

    events = models.ManyToManyField(Event, related_name='sponsors')

    search_fields = [
        index.SearchField('name', partial_match=True),
    ]


class LectureSeries(index.Indexed, models.Model):

    lecture_series_id = models.IntegerField(unique=True)
    name = models.CharField(max_length=255)

    search_fields = [
        index.SearchField('name', partial_match=True),
    ]


def get_base_events_queryset_for_site(site):
    """
    Returns the base queryset object from which all Event listings for a specified Site
    must be derived.
    This function filters the list of Event objects down to just those that the Site's
    admins have chosen to display.
    """
    try:
        settings = EventListingSettings.objects.get(site=site)
    except EventListingSettings.DoesNotExist:
        # If there's no EventListingSettings for this Site, return an empty QuerySet.
        return Event.objects.none()

    # We need to do the sponsors via their sponsor_ids because searches can't be filtered
    # directly on a ManyToMany relationship.
    sponsor_ids = [sponsor.sponsor_id for sponsor in settings.event_sponsors.all()]

    # We need to split these up for Sites which import either no LectureSeries or no 
    # EventSponsors. Listings will get dupes, and searches will crash if we don't.
    if settings.lecture_series.exists() and sponsor_ids:
        queryset = Event.objects.filter(
            Q(sponsors__sponsor_id__in=sponsor_ids) | 
            Q(lecture_series__in=settings.lecture_series.all())
        )
    elif sponsor_ids:
        queryset = Event.objects.filter(sponsors__sponsor_id__in=sponsor_ids)
    else:
        queryset = Event.objects.filter(lecture_series__in=settings.lecture_series.all())

    return queryset

As you can see, regular ForeignKeys can be used for filtering searches as normal, but the ManyToMany relationships need some special ID listing code to make it possible to build a queryset that can be translated into an ElasticSearch query.
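To make the mechanism a bit more concrete, here is a toy model of what happens once each Event document carries a list of sponsor_ids. This is plain Python, not Wagtail or Elasticsearch code: terms_match is a hypothetical helper, the sample documents are made up, and the sponsor_id_filter field name is my assumption based on the '_filter' suffix visible in the question's traceback. The only behavior it relies on is that a terms filter matches a document when any value of a multi-valued field is in the requested set:

```python
def terms_match(doc, field, wanted):
    """Toy 'terms' filter: true if any indexed value of the field is in wanted."""
    values = doc.get(field, [])
    if not isinstance(values, list):
        values = [values]
    return any(value in wanted for value in values)


# Made-up documents, shaped like what Event.sponsor_id() would contribute.
index_docs = [
    {"title": "Seismology seminar", "sponsor_id_filter": [9003, 9010]},
    {"title": "Physics colloquium", "sponsor_id_filter": [9020]},
    {"title": "Unsponsored talk", "sponsor_id_filter": []},
]

# The sponsor_ids selected for the current Site.
site_sponsor_ids = {9003}

hits = [
    doc["title"]
    for doc in index_docs
    if terms_match(doc, "sponsor_id_filter", site_sponsor_ids)
]
```

An Event with several sponsors matches as soon as one of them is selected for the Site, which is the same result the sponsors__sponsor_id__in lookup gives on the database side.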

Upvotes: 1

amhen

Reputation: 33

For starters, this post helped me solve this issue, so thank you for that.

FilterFields are great for running filters on your search results. In this case, we only really need to build search results off of the filtered queryset.

My way around this is the following:

  1. Gather the IDs of the events you want to form search results off of.

    event_ids = get_events_for_current_site().values_list('id', flat=True)
    
  2. Build a new queryset based off of those ids.

    filtered_events = Event.objects.filter(id__in=event_ids)
    
  3. Pass the new queryset to your search

    backend.search(search_query, filtered_events)
    

Since the queryset passed into search is being filtered off of id, you will need to include index.FilterField('id') in Event.search_fields and update your index.

Do note that I have not tested the reported code specifically but rather my own variation.

In addition, this Wagtail Support post gave me some insight into solving this: https://groups.google.com/forum/#!msg/wagtail/k2-E4h2oLtI/uPOzbuwKBgAJ

This post does have a word of caution stating that using this approach "shouldn't hit performance too badly as long as you don't have 1000s of [items]".
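One way to see where that caution might come from: if the id list ends up as a terms clause in the search request (an assumption on my part about how id__in gets translated, though it matches the terms clause in the question's traceback), the request body grows with the number of ids. A rough standard-library sketch, where id_terms_filter, body_size and the id_filter field name are all hypothetical:

```python
import json


def id_terms_filter(event_ids):
    # Hypothetical shape of the filter clause built from an id__in lookup.
    return {"terms": {"id_filter": list(event_ids)}}


def body_size(count):
    # Length in characters of the serialized clause for ids 1..count.
    return len(json.dumps(id_terms_filter(range(1, count + 1))))


small = body_size(10)
large = body_size(10000)
```

Every id has to be serialized and sent on each search, so a Site with tens of thousands of matching Events pays for that on every request.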

Upvotes: 3
