Reputation: 103
Is there a good way to speed up querying hybrid properties in SQLALchemy that involve relationships? I have the following two tables:
class Child(Base):
__tablename__ = 'Child'
id = Column(Integer, primary_key=True)
is_boy = Column(Boolean, default=False)
parent_id = Column(Integer, ForeignKey('Parent.id'))
class Parent(Base):
__tablename__ = 'Parent'
id = Column(Integer, primary_key=True)
children = relationship("Child", backref="parent")
@hybrid_property
def children_count(self):
return self.children_count.count()
@children_count.expression
def children_count(cls):
return (select([func.count(Children.id)]).
where(Children.parent_id == cls.id).
label("children_count")
)
When I query Parent.children_count across 50,000 rows (each parent has on average roughly 2 children), it's pretty slow. Is there a good way through indexes or something else for me to speed these queries up?
Upvotes: 2
Views: 1405
Reputation: 23482
By default, PostgreSQL doesn't create indexes on foreign keys.
So the first thing I'd do is add an index, which SQLAlchemy makes really easy:
parent_id = Column(Integer, ForeignKey('Parent.id'), index=True)
This will probably result in a fast enough retrieval time given the size of your current dataset--try it and see. Be sure to try the query a few times in a row to warm up the PostgreSQL cache.
For a larger dataset, or if the queries still aren't fast enough, you could look into pre-calculating the counts and caching them... A number of ways to cache, the easiest hack is probably throw an extra column in your Parent table and just make sure whenever a new child is added that you write app logic to increment the count. It's a little hacky that way. Another option is caching the count in Redis/memcache, or even using a Materialized View (this is a great solution if it's okay for the count to occasionally be out of date by a few minutes).
Upvotes: 5