Reputation: 39
How can I convert the following query to inner join and avoid using distinct to optimise performance.
select distinct(GroupId)
from BatchQuotaIndividualQuotas
where BatchQuotaCommonSettingsID = 58
and
GroupId not in
(
select distinct(groupid)
from BatchQuotaIndividualQuotas
where BatchQuotaCommonSettingsID = 58
and ObjectiveFunctionTotalResultID is null
)
GroupId is not primary key. There are multiple rows corresponding to a single GroupId. I want to select the GroupIds for which none of the ObjectiveFunctionTotalResultID is null. –
Upvotes: 1
Views: 173
Reputation: 44316
if you remove distinct from the where clause, your query will run faster.
Here are 2 better ways of writing it
count(ObjectiveFunctionTotalResultID) will count not null values
SELECT GroupId
FROM BatchQuotaIndividualQuotas
WHERE BatchQuotaCommonSettingsID = 58
GROUP BY GroupId
HAVING count(ObjectiveFunctionTotalResultID) = count(*)
or:
EXCEPT will include distinct
SELECT GroupId
FROM BatchQuotaIndividualQuotas
WHERE BatchQuotaCommonSettingsID = 58
EXCEPT
SELECT GroupId
FROM BatchQuotaIndividualQuotas
WHERE BatchQuotaCommonSettingsID = 58
AND ObjectiveFunctionTotalResultID is null
Upvotes: 1
Reputation: 4502
You can achieve a similar result using a LEFT JOIN
and then setting the join result to NULL
.
select distinct GroupId
from BatchQuotaIndividualQuotas A
left join BatchQuotaIndividualQuotas B ON B.ObjectiveFunctionTotalResultID IS NULL
AND B.GroupId = A.GroupId
AND B.BatchQuotaCommonSettingsID = 58
where A.BatchQuotaCommonSettingsID = 58
AND B.GroupId IS NULL
You can also achieve the same result with GROUP BY
and HAVING
, although I doubt this approach would be faster.
select GroupId
from BatchQuotaIndividualQuotas
where BatchQuotaCommonSettingsID = 58
group by GroupId
having SUM(CASE WHEN ObjectiveFunctionTotalResultID IS NULL THEN 1 ELSE 0 END) = 0
Upvotes: 2
Reputation: 116100
First of all, distinct is not a function, but a modifier of the select statement. You select a distinct combination of field values rather than a distinct single field. The parentheses will actually call a syntax error if you have more fields. So it's not
select
distinct(groupid)
but
select distinct
groupid
Secondly, you don't need the inner distinct at all. Duplicates may exist in an in
list, and the dbms is likely to optimize this for you already. By adding distinct
yourself in the inner query, you are actively preventing the optimizer to do its job.
Thirdly, you will still need distinct
on the outer query if you want to have distinct group ids. Inner join doesn't change that. Besides, since you want to have records that must not match, an inner join wouldn't do the trick. A left join could do it, but if I interpret your query correctly, you could just write it as:
select distinct
GroupId
from
BatchQuotaIndividualQuotas
where
BatchQuotaCommonSettingsID = 58 and
ObjectiveFunctionTotalResultID is not null
You could change it to group by
like I do below, but the effect is the same. Group by just creates groups, and has to remove duplicates as well. Group by is mostly used for aggregations. I'm not sure about the inner workings, but group by is theoretically a harder process, because the database needs to create actual groups for these aggregations, instead of just filtering out duplicates. Probably this will be optimized as well, so in the end, the query below will perform and behave the same as the one above.
select
GroupId
from
BatchQuotaIndividualQuotas
where
BatchQuotaCommonSettingsID = 58 and
ObjectiveFunctionTotalResultID is not null
group by
GroupId
Upvotes: 2
Reputation: 3266
Could you try this please?
select GroupId
from BatchQuotaIndividualQuotas
where BatchQuotaCommonSettingsID = 58
and ObjectiveFunctionTotalResultID IS NOT NULL
GROUP BY GroupId
Same query (If I am correct) with some revised flow. Instead of filtering out the IDs by a subquery, it does it in place.
Upvotes: 0