dthree
dthree

Reputation: 20730

Efficient alternative to subqueries in SQL Server

Take the following two mock tables:

dbo.animal

id  name
1   dog
2   cat

dbo.animal_food

id  animal_id  food_id  active
1   1          4        1
2   1          5        1

I need to query against animal with multiple subqueries of animal_food based on different sorted results. Something like this:

 select name, 
        (   select top 1 food_id 
              from animal_food 
             where animal_id = animal.id 
             order by food_id desc) as max_food_id,
        (   select top 1 food_id 
              from animal_food 
             where animal_id = animal.id 
               and active = 1
             order by food_id desc) as max_active_food_id,
  from animal

This is obviously very inefficient---I need to speed it up. However, I can't figure out how to refactor this into a join that would improve performance.

Upvotes: 0

Views: 2892

Answers (4)

AdrianChodkowski
AdrianChodkowski

Reputation: 114

You can also use cross apply function and check which way performs better

 select name, 
        max_food_id.food_id AS max_food_id,
        max_active_food_id.food_id AS max_active_food_id,
  from animal
  cross apply (
   select top 1 food_id 
              from animal_food 
             where animal_id = animal.id 
             order by food_id desc
) AS max_food_id
cross apply
(   select top 1 food_id 
              from animal_food 
             where animal_id = animal.id 
               and active = 1
             order by food_id desc) as max_active_food_id

Upvotes: 1

Ed Gibbs
Ed Gibbs

Reputation: 26333

This looks like a simple aggregate query, with the only twist being that the third column only considers the max food_id where active = 1. If that's the case, this will do the job without subqueries:

SELECT
  name,
  MAX(food_id) AS max_food_id,
  MAX(CASE WHEN active = 1 THEN food_id END) AS max_active_food_id
FROM animal
JOIN animal_food = animal.id = animal_food.id
GROUP BY name

The MAX(CASE WHEN active = 1 THEN food_id END) will return NULL if active doesn't equal 1, and nulls are ignored for aggregates like MAX.

Upvotes: 1

Igor
Igor

Reputation: 62213

I am not sure about performance but here is the query using joins and aggregates. Is this what you are looking for or have you already tried this?

select animal.name
    , max(animal_food_all.food_id) as max_food_id
    , max(animal_food_active.food_id) as max_active_food_id
from animal
    left outer join animal_food as animal_food_all on animal.id = animal_food_all.animal_id
    left outer join animal_food as animal_food_active on animal.id = animal_food_active.animal_id and animal_food_active.active = 1
GROUP BY animal.name

I used outer joins in case there are animals that do not have a food listed, if you want those omitted you can change it inner join but either way it will probably have little (if any) impact on performance.

Upvotes: 0

SlimsGhost
SlimsGhost

Reputation: 2909

If you just want the two food_id values you specified, you can use join with group by and what's called conditional aggregation, like this:

select 
    name, 
    max(animal_food.food_id) as max_food_id,
    max(case when animal_food.active = 1 then animal_food.food_id else null end) as max_active_food_id,
from animal
inner join animal_food on animal.animal_id = animal_food.animal_id
group by animal.animal_id, animal.name

Upvotes: 1

Related Questions