Rishabh Sharma

Reputation: 1

How to use a combination of models' data (many to many field) as a queryset and ingest using django-elasticsearch-dsl?

I have two models

  1. Author
  2. Book

I want to ingest my model data into Elasticsearch using django_elasticsearch_dsl, with one document per (Author, Book) pair.

This is my index class and its fields using elasticsearch_dsl:

from elasticsearch_dsl import Document, Text

class AuthorBookPair(Document):
    author = Text()
    book = Text()
    isbn = Text()

    class Index:
        name = "author-book-pair"
        settings = {"number_of_shards": 1, "number_of_replicas": 0}

If I used the above class for the mapping, my indexing queryset and code would look like this:

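    authors = Author.objects.all()
    
    for author in authors:
        for book in author.books.all():
            # One document per (author, book) pair.
            pair = AuthorBookPair()
            # str() assumes the models define a readable __str__.
            pair.author = str(author)
            pair.book = str(book)
            pair.isbn = book.isbn
            pair.save()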

However, I want to use django_elasticsearch_dsl, whose Document requires a model to be declared in its Django class, like the example below:

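from django_elasticsearch_dsl import Document, fields

class AuthorBookPair(Document):
    author = fields.TextField()
    book = fields.TextField()
    isbn = fields.TextField()

    class Index:
        name = "author-book-pair"
        settings = {"number_of_shards": 1, "number_of_replicas": 0}

    class Django:
        model = Author

    def get_queryset(self):
        # The queryset must be returned; it yields one row per Author.
        return Author.objects.all()

    def prepare_author(self, instance):
        # instance is an Author here, so it has no .author attribute.
        return str(instance)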

How can I reproduce the elasticsearch_dsl pairing logic above in django_elasticsearch_dsl?

I tried using a for loop outside of the AuthorBookPair class as a generator, to be called later while populating the fields, but the Author queryset count and the number of pairs the generator yields do not match, so indexing terminates early.
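A minimal sketch of why the counts diverge, using plain dataclasses as hypothetical stand-ins for the Django models (the names `AuthorStub`, `BookStub`, and `iter_pairs` are illustrative only and belong to neither library). The generator yields one item per (author, book) pair, while the queryset yields one item per author:

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class BookStub:
    # Hypothetical stand-in for the Book model.
    title: str
    isbn: str


@dataclass
class AuthorStub:
    # Hypothetical stand-in for the Author model and its related books.
    full_name: str
    books: List[BookStub] = field(default_factory=list)


def iter_pairs(authors: List[AuthorStub]) -> Iterator[dict]:
    """Yield one flat record per (author, book) combination."""
    for author in authors:
        for book in author.books:
            yield {"author": author.full_name, "book": book.title, "isbn": book.isbn}


authors = [
    AuthorStub("A. Writer", [BookStub("First", "111"), BookStub("Second", "222"), BookStub("Third", "333")]),
    AuthorStub("B. Writer", [BookStub("Second", "222")]),
    AuthorStub("C. Writer"),  # no books, so this author contributes no pairs
]
pairs = list(iter_pairs(authors))
print(len(authors), len(pairs))  # 3 authors, 4 pairs
```

Anything that advances the author rows and the pair generator in lockstep would stop at the shorter of the two, which is consistent with the early termination described above.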

Is it even possible?


Upvotes: 0

Views: 74

Answers (1)

Nijat Mammadov

Reputation: 84

My model classes:

from django.db import models

class Author(models.Model):
    full_name = models.CharField(max_length=50, unique=True)

class Book(models.Model):
    title = models.CharField(max_length=200)
    author = models.ManyToManyField(Author, related_name='books')

My Book document:

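from django_elasticsearch_dsl import Document, fields
from django_elasticsearch_dsl.registries import registry

@registry.register_document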
class BookDocument(Document):
    author = fields.ListField(fields.ObjectField(properties={
        "id": fields.KeywordField(),
        "full_name": fields.TextField()
    }))
    
    class Index:
        name = 'book'
        settings = {
            'number_of_shards': 1,
            'number_of_replicas': 0,
        }

    class Django:
        model = Book
        fields = [
            ...
        ]
        related_models = [Author]

    def get_instances_from_related(self, related_instance):
        """If related_models is set, define how to retrieve the Book instance(s) from the related model.
        The related_models option should be used with caution because it can lead in the index
        to the updating of a lot of items.
        """
        if isinstance(related_instance, Author):
            return related_instance.books.all()

This code worked for me. It stores the authors as a list on each book document, and when you update a value on an Author, the related BookDocument entries are updated as well.
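As a rough sketch of the resulting shape, assuming the indexed body mirrors the field definitions above and that `title` is among the indexed fields (the helpers `book_to_doc` and `doc_to_pairs` and the sample values are hypothetical; plain dicts stand in for whatever the library actually serializes). Each book carries a list of author objects, and author-book pairs can be derived from it at read time instead of being indexed as separate documents:

```python
from typing import Dict, List, Tuple


def book_to_doc(title: str, authors: List[Dict[str, str]]) -> dict:
    """Build a dict shaped like the assumed BookDocument body."""
    return {
        "title": title,
        "author": [{"id": a["id"], "full_name": a["full_name"]} for a in authors],
    }


def doc_to_pairs(doc: dict) -> List[Tuple[str, str]]:
    """Expand one book document into (author name, book title) pairs."""
    return [(a["full_name"], doc["title"]) for a in doc["author"]]


doc = book_to_doc(
    "Second",
    [{"id": "1", "full_name": "A. Writer"}, {"id": "2", "full_name": "B. Writer"}],
)
pairs = doc_to_pairs(doc)
print(pairs)
```

This keeps one document per Book, so the document count matches the queryset count.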

Upvotes: 0
