Reputation: 97
I'm using an includes
instead of a join
because it runs faster but the statement is returning an association that doesn't include all of the data I'm looking for. It returns all of the left data, but only the right data that matches the query. Hopefully the examples below help clarify the problem and what I'm trying to achieve.
The join
does seem to do what I'm after from a data and rails association perspective but executes a ton of queries and is much slower.
class Species < ActiveRecord::Base
has_many :species_types, foreign_key: 'species_id', primary_key: "id"
end
class SpeciesTypes < ActiveRecord::Base
belongs_to :species, :foreign_key => "id", :primary_key => "species_id"
end
create_table "species", force: :cascade do |t|
t.bigint "id"
t.string "identifier"
end
create_table "species_types", force: :cascade do |t|
t.bigint "species_id"
t.bigint "type_id"
t.string "name"
end
Table data to help visualize the queries below
id | identifier |
---|---|
1 | furry |
2 | sleek |
3 | hairy |
4 | shiny |
5 | reflective |
6 | rough |
7 | rubbery |
species_id | type_id | identifier |
---|---|---|
1 | 1 | hairy |
1 | 2 | metalic |
2 | 3 | skin |
3 | 1 | hairy |
4 | 2 | metalic |
4 | 3 | skin |
5 | 3 | skin |
5 | 3 | skin |
6 | 2 | metalic |
7 | 2 | metalic |
I know the SpeciesTypes.type_id
, and I'm looking to get all Species
that have that type, including all of their SpeciesTypes.
includes
`species = Species.includes(:species_types).where(:species_types => {:type_id => 1})`
This does return all Species with a matching SpeciesType. However, instead of returning all Species with all SpeciesType it return all Species with only the SpeciesType that match the :type_id
parameter. So, in this case you cannot reference all SpeciesTypes from the Species object (species[0].species_types
). Does not return what was expected, although it makes sense why it does limit to the matched type_id.
irb()$ species = Species.includes(:species_types).where(:species_types => {:type_id => 1})
irb()$ species[0].species_types
[#<SpeciesTypes:0x0000ffff9ad73490
species_id: 1,
type_id: 1,
identifier: hairy>]
irb()$ species = Species.includes(:species_types).where(:species_types => {:type_id => 1})
irb()$ species[0].species_types
[#<SpeciesTypes:0x0000ffff9ad73490
species_id: 1,
type_id: 1,
identifier: hairy>,
<SpeciesTypes:0x0000ffff9ad73490
species_id: 1,
type_id: 2,
identifier: metalic>,
]
joins
This is returning what I'm after (using join
instead of includes
) however the query is much much slower. I think I'm missing something obvious (or not obvious but fundamental)
species = Species.joins(:species_types).where(:species_types => {:type_id => 3})
The above returns the values that I expect but is a much slower query.
Can the includes
query be updated to return all Species with all types that match the known :type_id
?
Upvotes: 0
Views: 356
Reputation: 102036
While its pretty natural to think that Species.includes(:species_types).where(:species_types => {:type_id => 3})
would load all the species and just eager load the species_types that match the where clause thats not how ActiveRecord and SQL actually works.
What this generates in terms of a query something like:
SELECT species.name AS t0_c1, species_types.id AS t1_c1 ...
LEFT OUTER JOIN species_types, t1
ON species_types.specie_id = species.id
WHERE species_types.type_id = ?
When you use includes
and reference the other table it delegates to .eager_load
which loads both tables in a single database query.
The where clause here applies to the entire query and not just the joined assocation. Remember that this query returns a row for every species_types
row (with duplicate data for the species table).
If you wanted to load just the records that match the condition you would need to put the restriction into the JOIN clause:
SELECT species.name AS t0_c1, ...
LEFT OUTER JOIN species_types, t1
ON species_types.specie_id = species.id AND species_types.type_id = ?
Unfortunately ActiveRecord associations do not provide a way to do that.
The easiest solution to the problem is most likely to just query from the other end:
Type.find(1)
.specie_types.where(specie: specie)
.joins
is not the answerYou can't just replace includes
with joins
as they do very things.
joins
just adds an INNER LEFT JOIN
to the query but doesn't actually select any columns from the joined table. Its used to filter the assocation based on the joined table or to select aggregates. Not to prevent N+1 queries.
In this case it's most likely not the first query itself thats slower - rather you're creating a N+1 query when you iterate through specie_types
as the assocation is not eager loaded / preloaded.
includes
does an OUTER LEFT JOIN
and will load the assocatiated records either in one or two queries depending on how its used.
Upvotes: 1