defcal

Reputation: 103

How can I speed up hybrid property queries in SQLAlchemy?

Is there a good way to speed up querying hybrid properties in SQLAlchemy that involve relationships? I have the following two models:

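from sqlalchemy import Boolean, Column, ForeignKey, Integer, func, select
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.ext.hybrid import hybrid_property
from sqlalchemy.orm import relationship

Base = declarative_base()


class Child(Base):
    __tablename__ = 'Child'
    id = Column(Integer, primary_key=True)
    is_boy = Column(Boolean, default=False)
    parent_id = Column(Integer, ForeignKey('Parent.id'))


class Parent(Base):
    __tablename__ = 'Parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child", backref="parent")

    @hybrid_property
    def children_count(self):
        # Instance level: count the children already loaded on this object.
        return len(self.children)

    @children_count.expression
    def children_count(cls):
        # Query level: correlated subquery counting this parent's Child rows.
        # (Legacy select([...]) form; SQLAlchemy 2.0 takes select(func.count(Child.id)).)
        return (select([func.count(Child.id)]).
                where(Child.parent_id == cls.id).
                label("children_count"))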

When I query Parent.children_count across 50,000 rows (each parent has on average roughly 2 children), it's pretty slow. Is there a good way through indexes or something else for me to speed these queries up?

Upvotes: 2

Views: 1405

Answers (1)

Jeff Widman

Reputation: 23482

By default, PostgreSQL doesn't create indexes on foreign keys.

So the first thing I'd do is add an index, which SQLAlchemy makes really easy:

parent_id = Column(Integer, ForeignKey('Parent.id'), index=True)

This will probably result in a fast enough retrieval time given the size of your current dataset--try it and see. Be sure to try the query a few times in a row to warm up the PostgreSQL cache.
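To check that the index is actually being used, you can compare the query plan before and after creating it. Here's a minimal sketch that uses the standard-library sqlite3 module purely as a stand-in for PostgreSQL (on PostgreSQL you'd run EXPLAIN ANALYZE on the real query instead, and the plan text will look different). The query_plan helper and the index name are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the real PostgreSQL one (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Parent (id INTEGER PRIMARY KEY);
    CREATE TABLE Child (
        id INTEGER PRIMARY KEY,
        is_boy BOOLEAN DEFAULT 0,
        parent_id INTEGER REFERENCES Parent(id)
    );
""")

# The per-parent count that the hybrid property's subquery boils down to.
COUNT_SQL = "SELECT COUNT(*) FROM Child WHERE parent_id = ?"


def query_plan(connection):
    """Return the planner's description of how COUNT_SQL would be executed."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + COUNT_SQL, (1,)).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " ".join(row[-1] for row in rows)


plan_without_index = query_plan(conn)  # full scan of Child

# Roughly what index=True on parent_id asks the database to create.
conn.execute("CREATE INDEX ix_Child_parent_id ON Child (parent_id)")
plan_with_index = query_plan(conn)  # lookup through the new index

print(plan_without_index)
print(plan_with_index)
```

If the second plan still shows a scan of the whole table on your real database, the index isn't helping that query and it's worth looking at the generated SQL.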

For a larger dataset, or if the queries still aren't fast enough, you could pre-calculate the counts and cache them. There are a number of ways to do that. The easiest hack is probably to add an extra column to your Parent table and write app logic that increments the count whenever a child is added (and decrements it when one is removed). Another option is caching the count in Redis/memcached, or even using a materialized view, which is a great solution if it's okay for the count to occasionally be out of date by a few minutes.
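As a rough sketch of the extra-column approach, the version below keeps a children_count column on Parent in step with the relationship using SQLAlchemy attribute events. The listener names are made up for illustration, the in-memory SQLite engine is only there so the snippet runs on its own, and it assumes SQLAlchemy 1.4 or newer.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine, event
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "Parent"
    id = Column(Integer, primary_key=True)
    # Denormalized counter column (an addition for this sketch).
    children_count = Column(Integer, nullable=False, default=0)
    children = relationship("Child", backref="parent")


class Child(Base):
    __tablename__ = "Child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("Parent.id"), index=True)


@event.listens_for(Parent.children, "append")
def bump_count(parent, child, initiator):
    # The column default is only applied at INSERT time,
    # so a brand-new Parent still has None here.
    parent.children_count = (parent.children_count or 0) + 1


@event.listens_for(Parent.children, "remove")
def drop_count(parent, child, initiator):
    parent.children_count = (parent.children_count or 0) - 1


# Stand-in engine so the example is self-contained (assumption).
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    parent = Parent()
    parent.children.append(Child())
    parent.children.append(Child())
    session.add(parent)
    session.commit()
    count_after_add = parent.children_count  # reloaded from the database

    parent.children.remove(parent.children[0])
    session.commit()
    count_after_remove = parent.children_count
```

Reading Parent.children_count is then a plain column read with no subquery. The catch is that this only sees changes made through Parent.children in the ORM. Bulk SQL bypasses the listeners, and with concurrent writers the read-then-write in Python can lose updates, so in those cases a database-side increment or a trigger is the safer route.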

Upvotes: 5
