rrmerugu
rrmerugu

Reputation: 1896

How to improve 2 million data query speed in Django RESTful APIs

I have a scientific research publications data of 2 Million records. I used django restframework to write apis for searching the data in title and abstract. This is taking me 12 seconds while using postgres as db, but if I used MongoDB as db, it goes down to 6seconds.

But even 6 seconds sounds a lot of waiting for user to me. I indexed the title and abstract, but abstract indexing failed because some of the abstract texts are too lengthy.

Here is the django Model using MongoDB(MongoEngine as ODM):

class Journal(Document):
    title = StringField()
    journal_title = StringField()
    abstract = StringField()
    full_text = StringField()
    pub_year = IntField()
    pub_date = DateTimeField()
    pmid = IntField()
    link = StringField()

How do I improve the query performance, what stack makes the search and retrieval more faster?.

Upvotes: 3

Views: 1428

Answers (1)

fixmycode
fixmycode

Reputation: 8506

Some pointers about optimisation for the Django ORM with Postgres:

  • Use db_index=True on fields that will be search upon often and have some degree of repetition between entries, like "title".
  • Use values() and values_list() to select only the columns you want from a QuerySet.
  • If you're doing full text search in any of those columns (like a contains query), bear in mind that Django has support for full text search directly on a Postgres database.
  • Use print queryset.query to check what kind of SQL query is going into your database and if it can be improved upon.
  • Many Postgres optimisation techniques rely in custom SQL queries that can be made in Django by using RawSQL expressions.
  • Remember that there are many, many ways to search for data in a database, be it relational or not-relational in nature. In your case, MongoDB is not "faster" than Postgres, it's just doing a better job at querying what you really want.

Upvotes: 4

Related Questions