Adam

Reputation: 3138

Main field name (document=True)

Django Haystack docs say:

**Warning**
When you choose a document=True field, it should be consistently named across all of your SearchIndex classes to avoid confusing the backend. The convention is to name this field text.

There is nothing special about the text field name used in all of the examples. It could be anything; you could call it pink_polka_dot and it won’t matter. It’s simply a convention to call it text.

But I don't get what it means. This is their example model:

import datetime
from haystack import indexes
from myapp.models import Note

class NoteIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, use_template=True)
    author = indexes.CharField(model_attr='user')
    pub_date = indexes.DateTimeField(model_attr='pub_date')

    def get_model(self):
        return Note

    def index_queryset(self, using=None):
        """Used when the entire index for model is updated."""
        return self.get_model().objects.filter(pub_date__lte=datetime.datetime.now())

Is the text I quoted referring to my model's main field, saying I should call it "text", or to the class defined in search_indexes.py?

If to the class in search_indexes.py, where is the field name it's attached to in the example above? It doesn't have model_attr!

text = indexes.CharField(document=True, use_template=True)

And if to my actual app models, how am I expected to refactor a project with many, many apps to call their main text field "text"?

Please advise. Thanks.

Upvotes: 1

Views: 698

Answers (1)

bennylope

Reputation: 1163

Your SearchIndex definition does not need to reflect your model definition; it needs to map data from different models to a common search document.

  1. Why does the text field need to be named consistently?
  2. How is the mapped content sourced? (Why is there no model_attr keyword?)

The Haystack documentation is advising that your SearchIndex field should be named consistently across your SearchIndex definitions - not that your model fields need to be named consistently. There's a major distinction between the search index definitions and model definitions. You do not need to and probably should not worry about a 1-1 mapping between model fields and search fields.

Step back from your models and think first about what you want to search. Will you be searching several different models through a common search view? Let's say you have two models:

class Note(models.Model):
    title = models.CharField(max_length=40)
    body = models.TextField()

class Memo(models.Model):
    subject = models.CharField(max_length=50)
    content = models.TextField()
    # on_delete is required on Django 2.0 and later
    author = models.ForeignKey(StaffMember, on_delete=models.CASCADE)

We want to create a simple search view that searches just two things: the primary content of each model and the title or name of the content object (name, title, subject, etc.).

Here's a bad example (do NOT do this):

class NoteIndex(indexes.SearchIndex, indexes.Indexable):
    body = indexes.CharField(document=True, use_template=True)
    title = indexes.CharField(model_attr='title')

    def get_model(self):
        return Note

class MemoIndex(indexes.SearchIndex, indexes.Indexable):
    content = indexes.CharField(document=True, use_template=True)
    subject = indexes.CharField(model_attr='subject')

    def get_model(self):
        return Memo

In this bad example, each search index does define a primary content field and a content name field (title or subject). But how do you search it now? If you query against content you'll miss every Note, and if you query against body you'll miss every Memo. The same problem applies to title versus subject.

Better example (do this):

class NoteIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, use_template=True)
    title = indexes.CharField(model_attr='title')

    def get_model(self):
        return Note

class MemoIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, use_template=True)
    title = indexes.CharField(model_attr='subject')

    def get_model(self):
        return Memo

Note that the search field names do not necessarily match the model field names. You just define the model attribute from which each SearchIndex field should source its data.

You search documents in the search engine, not rows in the database, so the SearchIndex definition maps content from the database (one table or a query over many) to a search document. The SearchIndex definition is a transformation, and each SearchField transforms data as you specify.
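To make the "transformation" idea concrete, here is a minimal plain-Python sketch. It does not use Haystack or Django at all: the dataclasses, the note_to_document / memo_to_document helpers, and the search function are all hypothetical stand-ins, meant only to show how two differently named models can be mapped to one document shape with shared text and title fields.

```python
from dataclasses import dataclass


# Plain stand-ins for the two Django models (no database involved).
@dataclass
class Note:
    title: str
    body: str


@dataclass
class Memo:
    subject: str
    content: str


def note_to_document(note):
    """Map a Note to the common search document shape."""
    return {"title": note.title, "text": f"{note.title}\n{note.body}"}


def memo_to_document(memo):
    """Map a Memo to the same shape, despite different model field names."""
    return {"title": memo.subject, "text": f"{memo.subject}\n{memo.content}"}


def search(documents, field, term):
    """Return documents whose named field contains the term (case-insensitive)."""
    term = term.lower()
    return [doc for doc in documents if term in doc.get(field, "").lower()]


documents = [
    note_to_document(Note(title="Budget ideas", body="Cut the travel budget.")),
    memo_to_document(Memo(subject="Travel policy", content="New budget rules apply.")),
]

# One query against the shared field reaches both kinds of content.
hits = search(documents, "text", "budget")

# A query against a model-specific name finds nothing: no document has that field.
misses = search(documents, "content", "budget")
```

The models keep their own field names (body, content, subject); only the document side is made uniform, which is all the naming convention asks for.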

As to your question about the missing model_attr, that's just one way to pull in the content. You can also render textual content from a template, which is what the text field above does (see the SearchField API documentation on that one). The model_attr source works well for simple character fields.
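As a rough analogy for what use_template=True does, the sketch below builds the document text from several attributes of an object instead of reading a single one. Haystack itself renders a Django template for this; the string.Template object, the render_text helper, and the sample note here are assumptions used purely for illustration, not Haystack's API.

```python
from string import Template
from types import SimpleNamespace

# Hypothetical stand-in for a template file; Haystack itself uses Django templates.
NOTE_TEXT_TEMPLATE = Template("$title\n$body")


def render_text(obj):
    """Build the document text from several attributes rather than one."""
    return NOTE_TEXT_TEMPLATE.substitute(title=obj.title, body=obj.body)


# A simple object standing in for a model instance.
note = SimpleNamespace(title="Budget ideas", body="Cut the travel budget.")
document_text = render_text(note)
```

This is why the text field has no model_attr: its content is assembled from whichever attributes the template names, so there is no single model field to point at.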

Upvotes: 7
