Reputation: 39

Changing not in clause to inner join

How can I convert the following query to inner join and avoid using distinct to optimise performance.

select distinct(GroupId) 
from   BatchQuotaIndividualQuotas 
where  BatchQuotaCommonSettingsID = 58 
       and 
       GroupId not in 
       (
           select distinct(groupid) 
           from   BatchQuotaIndividualQuotas 
           where  BatchQuotaCommonSettingsID = 58 
                  and ObjectiveFunctionTotalResultID is null 
       )

GroupId is not primary key. There are multiple rows corresponding to a single GroupId. I want to select the GroupIds for which none of the ObjectiveFunctionTotalResultID is null. –

Upvotes: 1

Answers (4)

t-clausen.dk

Reputation: 44316

if you remove distinct from the where clause, your query will run faster.

Here are 2 better ways of writing it

count(ObjectiveFunctionTotalResultID) will count not null values

SELECT GroupId
FROM   BatchQuotaIndividualQuotas 
WHERE  BatchQuotaCommonSettingsID = 58 
GROUP BY GroupId
HAVING count(ObjectiveFunctionTotalResultID) = count(*)

or:

EXCEPT will include distinct

SELECT GroupId
FROM   BatchQuotaIndividualQuotas 
WHERE  BatchQuotaCommonSettingsID = 58 
EXCEPT
SELECT GroupId
FROM   BatchQuotaIndividualQuotas 
WHERE  BatchQuotaCommonSettingsID = 58 
AND ObjectiveFunctionTotalResultID is null

Upvotes: 1

Dan

Reputation: 4502

You can achieve a similar result using a LEFT JOIN and then setting the join result to NULL.

select distinct GroupId
from   BatchQuotaIndividualQuotas A
left join BatchQuotaIndividualQuotas B ON B.ObjectiveFunctionTotalResultID IS NULL 
                                      AND B.GroupId = A.GroupId
                                      AND B.BatchQuotaCommonSettingsID = 58
where  A.BatchQuotaCommonSettingsID = 58
   AND B.GroupId IS NULL

You can also achieve the same result with GROUP BY and HAVING, although I doubt this approach would be faster.

select GroupId
from   BatchQuotaIndividualQuotas 
where  BatchQuotaCommonSettingsID = 58
group by GroupId
having SUM(CASE WHEN ObjectiveFunctionTotalResultID IS NULL THEN 1 ELSE 0 END) = 0

Upvotes: 2

GolezTrol

Reputation: 116100

First of all, distinct is not a function, but a modifier of the select statement. You select a distinct combination of field values rather than a distinct single field. The parentheses will actually call a syntax error if you have more fields. So it's not

select
  distinct(groupid)

but

select distinct
  groupid

Secondly, you don't need the inner distinct at all. Duplicates may exist in an in list, and the dbms is likely to optimize this for you already. By adding distinct yourself in the inner query, you are actively preventing the optimizer to do its job.

Thirdly, you will still need distinct on the outer query if you want to have distinct group ids. Inner join doesn't change that. Besides, since you want to have records that must not match, an inner join wouldn't do the trick. A left join could do it, but if I interpret your query correctly, you could just write it as:

select distinct
  GroupId
from 
  BatchQuotaIndividualQuotas 
where 
  BatchQuotaCommonSettingsID = 58 and 
  ObjectiveFunctionTotalResultID is not null

You could change it to group by like I do below, but the effect is the same. Group by just creates groups, and has to remove duplicates as well. Group by is mostly used for aggregations. I'm not sure about the inner workings, but group by is theoretically a harder process, because the database needs to create actual groups for these aggregations, instead of just filtering out duplicates. Probably this will be optimized as well, so in the end, the query below will perform and behave the same as the one above.

select
  GroupId
from 
  BatchQuotaIndividualQuotas 
where 
  BatchQuotaCommonSettingsID = 58 and 
  ObjectiveFunctionTotalResultID is not null
group by
  GroupId

Upvotes: 2

Alex Szabo

Reputation: 3266

Could you try this please?

select GroupId
from   BatchQuotaIndividualQuotas 
where  BatchQuotaCommonSettingsID = 58 
and ObjectiveFunctionTotalResultID  IS NOT NULL
GROUP BY GroupId

Same query (If I am correct) with some revised flow. Instead of filtering out the IDs by a subquery, it does it in place.

Upvotes: 0

Changing not in clause to inner join

Answers (4)

Related Questions