jassinm

Reputation: 7509

haystack multiple field search

Hi, I'm using Haystack with Whoosh as the search engine.

My models look as follows:

class Person(models.Model):
    personid = models.IntegerField(primary_key = True, db_column = 'PID')  
    firstname = models.CharField(max_length = 50, db_column = 'FIRSTNAME')  
    lastname = models.CharField(max_length = 50, db_column = 'LASTNAME') 
    class Meta:
        db_table = '"TEST"."PERSON"'
        managed = False


class TDoc(models.Model):
    tdocid = models.IntegerField(primary_key = True, db_column = 'TDOCID')  
    person = models.ForeignKey(Person, db_column = 'PID')
    content = models.TextField(db_column = 'CONTENT', blank = True) 
    filepath = models.TextField(db_column = 'FILEPATH', blank = True) 
    class Meta:
        db_table = '"TEST"."TDOC"'
        managed = False

The search_indexes.py is as follows:

class TDocIndex(SearchIndex):

    content = CharField(model_attr = 'content', document = True)
    filepath = CharField(model_attr = 'filepath')
    person = CharField(model_attr = 'person')

    def get_queryset(self):
        return TDoc.objects.all()

    def prepare_person(self, obj):
        # Index the person's last name so it can be used for filtering
        return obj.person.lastname

site.register(TDoc, TDocIndex)

My problem is that I would like to do multiple-field searches like

content:xxx AND person:SMITH

Haystack searches all the fields at once; I can't search on a specific field. I suspected that my index was corrupt, so I queried it directly with Whoosh:

from whoosh.index import open_dir
from whoosh.qparser import MultifieldParser

ix = open_dir("/testindex")

searcher = ix.searcher()

mparser = MultifieldParser(["content", "filepath", "person"], schema = ix.schema)
myquery = mparser.parse('content:xxx AND person:SMITH')
results = searcher.search(myquery)
for result in results:
    print(result)

This works and returns the correct result, so the index itself seems fine. I'm using the standard Haystack SearchView and search.html from the tutorial:

(r'^search/', include('haystack.urls')),

Upvotes: 5

Views: 3699

Answers (2)

Rufat

Reputation: 702

You can use the prepare method in your index class like this:

from apps.main.models import TDoc
from haystack import indexes


class TDocIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True)
    lastname = indexes.CharField(null=True)

    # The two date fields assume the model has date_insert and date_update
    # columns. The TDoc model in the question does not, so drop them (and
    # get_updated_field) if they do not apply.
    date_insert = indexes.DateTimeField(model_attr="date_insert", indexed=False)
    date_update = indexes.DateTimeField(model_attr="date_update", indexed=False)

    def get_model(self):
        return TDoc

    def get_updated_field(self):
        return "date_update"

    def index_queryset(self, using="default"):
        return self.get_model().objects.all()

    def prepare(self, obj: TDoc):
        data = super().prepare(obj)

        # Build the searchable document from several fields at once
        main_fields = [obj.content, obj.filepath, obj.person.lastname]
        data["text"] = "\n".join(f"{col}" for col in main_fields)
        # Store the lowercased last name separately so it can be filtered on
        data["lastname"] = obj.person.lastname.lower()

        return data
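
To see what this prepare logic ends up storing, the same steps can be run outside Haystack on stand-in objects. This is only a sketch: build_document is a hypothetical helper, and the SimpleNamespace values are made up for illustration.

```python
from types import SimpleNamespace


def build_document(obj):
    """Mirror the prepare() logic on a plain object (no Haystack needed)."""
    main_fields = [obj.content, obj.filepath, obj.person.lastname]
    return {
        # One newline-separated blob that the engine searches by default
        "text": "\n".join(str(col) for col in main_fields),
        # Separate lowercased field intended for filtering
        "lastname": obj.person.lastname.lower(),
    }


# Stand-in for a TDoc row; the values are invented for this example.
doc = SimpleNamespace(
    content="quarterly report",
    filepath="/docs/report.txt",
    person=SimpleNamespace(lastname="SMITH"),
)
data = build_document(doc)
print(data["text"])
print(data["lastname"])
```

Because the last name is lowercased at index time, it is probably safest to lowercase the value you filter with as well.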

Upvotes: 0

Facundo Olano

Reputation: 2609

In your index you should define one field with document=True, which is the document haystack will search on. By convention this field is named text. You add extra fields if you plan to do filtering or ordering on their values.

The way to take several fields into account when you perform a search is to define the document as a template and set use_template=True on your document field. Your index would look like:

class TDocIndex(SearchIndex):

    text = CharField(document=True, use_template=True)

    # if you plan to filter by person (Person's primary key is personid, not id)
    personid = IntegerField(model_attr='person__personid')

site.register(TDoc, TDocIndex)

And you'd have a search/indexes/<app_label>/tdoc_text.txt template (where <app_label> is the app that contains TDoc) like:

{{ object.content }}
{{ object.filepath }}
{{ object.person.lastname }}

See this answer.
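
Note that the default search form only queries the document field, so a string like content:xxx AND person:SMITH is not interpreted per field. One possible workaround, sketched below, is to split the string yourself and pass the pieces to SearchQuerySet.filter. The helper parse_field_query is hypothetical (not part of Haystack), it only handles terms joined by AND, and it assumes the index declares a field for every name used, e.g. the person field from the question.

```python
def parse_field_query(query, default_field="content"):
    """Split 'field:value AND field:value' into a dict of filter kwargs.

    Hypothetical helper: only understands terms joined by ' AND '.
    A bare term (no 'field:' prefix) is assigned to default_field;
    if several bare terms are given, the last one wins.
    """
    filters = {}
    for term in query.split(" AND "):
        term = term.strip()
        if not term:
            continue
        field, sep, value = term.partition(":")
        if sep:
            filters[field.strip()] = value.strip()
        else:
            filters[default_field] = term
    return filters


filters = parse_field_query("content:xxx AND person:SMITH")
print(filters)

# With Haystack installed and the index rebuilt, the dict could be used as:
#   from haystack.query import SearchQuerySet
#   results = SearchQuerySet().filter(**filters)
```

In Haystack, the content keyword in filter refers to the document field, whatever it is named, so content:xxx would search the templated text above.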

Upvotes: 5
