culix
culix

Reputation: 10616

SqlAlchemy: Is there a performance difference building queries over multiple lines or functions?

How efficient is sqlalchemy if I use several query objects, lines of code, or functions to build my final query? Is there a performance hit as it executes each step, or is it smart enough to figure out what's going on and only execute the final query?

e.g. say I have code that does this:

get_shiniest_robots(self, polish_date):
    """Get robot_ids for robots with recently polished spikes."""
    return session.query(Robots.robot_id).filter_by(spike_polish_date > polish_date)

Then I have another function that does

get_dullest_robots(self, polish_date):
    """Get robot_ids for robots that have not been polished lately."""
    return session.query(Robots.robot_id).filter_by(~Robots.robot_id.in_(get_shiniest_robots(polish_date))).values(Robots.robot_id)

Will get_dullest_robots() send two queries to the database instead of one?

Upvotes: 1

Views: 870

Answers (2)

zzzeek
zzzeek

Reputation: 75137

Is there a performance hit as it executes each step, or is it smart enough to figure out what's going on and only execute the final query?

Query has well defined points at which SQL is emitted:

  1. when you iterate over it, that is, "for item in query:"

  2. When you call .all(), .one(). .first(), .scalar(), .values(), .count(), and sometimes .get().

  3. most other methods are what we call "generative", which is also known as "method chaining". The methods listed out at http://docs.sqlalchemy.org/en/latest/orm/query.html should give a hint, if not explicitly, if you're getting back a new Query with additional state, or if you're getting SQL emitted results back.

Upvotes: 3

culix
culix

Reputation: 10616

You can check this for yourself by turning on sqlalchemy echoing and running the queries. When you connect to your database pass the echo parameter with something like create_engine(dbstring, echo=1).

Then run your queries.

get_shiniest_robots:

INFO sqlalchemy.engine.base.Engine SELECT t_robots.robot_id AS t_robots_id FROM t_robots JOIN t_spike_maintenance ON t_robots.robot_id = t_spike_maintenance.robot_id WHERE t_spike_maintenance."spike_polish_date" > ?

get_dullest_robots

INFO sqlalchemy.engine.base.Engine SELECT t_robots.robot_id AS t_robots_id FROM t_robots JOIN t_spike_maintenance ON t_robots.robot_id = t_spike_maintenance.robot_id WHERE t_robots.robot_id NOT IN (SELECT t_robots.robot_id AS t_robots_robot_id FROM t_robots JOIN t_spike_maintenance ON t_robots.robot_id = t_spike_maintenance.robot_id WHERE t_spike_maintenance."pike_polish_date" > ?)

You can see sqlalchemy is creating a subquery rather than executing two separate queries, which seems normal. It is only executing the SQL once.

Upvotes: 1

Related Questions