Ben G
Ben G

Reputation: 26771

Why is Rails ActiveRecord hitting the database twice?

@integration = Integration.first(:conditions=> {:integration_name => params[:integration_name]}, :joins => :broker, :select => ['`integrations`.*, `brokers`.*'])
$stderr.puts @integration.broker.id # This line causes Brokers to be queried again

Results in:

Integration Load (0.4ms)   SELECT `integrations`.*, `brokers`.* FROM `integrations` INNER JOIN `brokers` ON `brokers`.id = `integrations`.broker_id WHERE (`integrations`.`integration_name` = 'chicke') LIMIT 1
Integration Columns (1.5ms)   SHOW FIELDS FROM `integrations`
Broker Columns (1.6ms)   SHOW FIELDS FROM `brokers`
Broker Load (0.3ms)   SELECT * FROM `brokers` WHERE (`brokers`.`id` = 1) 

Any ideas why Rails would hit the databse again for brokers even though I already joined/selected them?

Here are the models (Broker -> Integration is a 1-to-many relationship). Note that this is incomplete, and I have only included the lines that establish their relationship

class Broker < ActiveRecord::Base

  # ActiveRecord Associations
  has_many :integrations

class Integration < ActiveRecord::Base

  belongs_to :broker

I'm using Rails/ActiveRecord 2.3.14, so keep that in mind.

When I do Integration.first(:conditions=> {:integration_name => params[:integration_name]}, :include => :broker) that line causes two SELECTs

Integration Load (0.6ms)   SELECT * FROM `integrations` WHERE (`integrations`.`integration_name` = 'chicke') LIMIT 1
  Integration Columns (2.4ms)   SHOW FIELDS FROM `integrations`
  Broker Columns (1.9ms)   SHOW FIELDS FROM `brokers`
  Broker Load (0.3ms)   SELECT * FROM `brokers` WHERE (`brokers`.`id` = 1) 

Upvotes: 2

Views: 1838

Answers (2)

Harish Shetty
Harish Shetty

Reputation: 64363

Use include instead of joins to avoid reloading Broker object.

Integration.first(:conditions=>{:integration_name => params[:integration_name]}, 
  :include => :broker)

There is no need to give the select clause as you are not trying to normalize the brokers table columns.

Note 1:

While eager loading dependencies, AR executes one SQL per dependency. In your case AR will execute main sql + broker sql. Since you are trying to get one row there isn't much gain. When you are trying to access N rows you will avoid the N+1 problem if you eager-load the dependencies.

Note 2:

In some cases it might be beneficial to use custom eager loading strategies. Let us assume that you just want to get the associated broker name for the integration. You can optimize your sql as follows:

integration = Integration.first(
  :select => "integrations.*, brokers.name broker_name",
  :conditions=>{:integration_name => params[:integration_name]}, 
  :joins => :broker)

integration.broker_name # prints the broker name

The object returned by the query will have all the aliased columns in the select clause.

Above solution will not work when you want to return the Integration object even when there is no corresponding Broker object. You have to use OUTER JOIN.

integration = Integration.first(
  :select => "integrations.*, brokers.name broker_name",
  :conditions=>{:integration_name => params[:integration_name]}, 
  :joins => "LEFT OUTER JOIN brokers ON brokers.integration_id = integrations.id")

Upvotes: 8

Frederick Cheung
Frederick Cheung

Reputation: 84114

The :joins options just makes active record add a join clause to the query. It doesn't actually make active record do anything with the rows that have been returned. The association isn't loaded and so accessing it triggers a query

The :include option is all about loading associations ahead of time. Active record has two strategies for doing this. One is via a big join query and one is by firing one query per association. The default is the latter, which is why you see two queries.

On rails 3.x you can decide which of those strategies you want by doing Integration.preload(:broker) or Integration.eager_graph(:broker).

There is no such facility in rails 2.x, so the only thing you can do is trick the heuristics used to determine the strategy. Whenever rails thinks that the order clause, select clause or conditions refer to columns on the included association it switches to the joins strategy (because it is the only one that works in that case).

For example doing something like

Integration.first(:conditions => {...}, :include => :broker, :select => 'brokers.id as ignored')

should force the alternate strategy (and active record actually ignores the select option in this case).

Upvotes: 2

Related Questions