Reputation:
A colleague wrote this piece of SQL (SQL Server 2012):
SELECT
a.account_id
,(SELECT SUM(e.amount)
FROM event e
WHERE e.event_type_id <> 47
AND e.master_comm_id = (SELECT c.comm_id
FROM comm c
WHERE c.item_id = a.item_id
AND c.comp_type_id = 20
AND c.comm_type_id = 485))
FROM account a
However, there are cases where there are multiple master_comm_ids against an event, and so the query fails (Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.). I only want the first master_comm_id, ie. the MIN one.
I have looked at various similar questions and attempted various things to achieve this (using MIN or ROW_NUMBER and rearranging the query to use joins etc) but I must be missing something obvious as everything has either resulted in SQL errors or the wrong data or not fixed the issue.
Can anyone help me just find the min master_comm_id to then use in the subquery?
Upvotes: 0
Views: 2363
Reputation: 13949
SELECT a.account_id,
oa.[amount]
FROM account a
OUTER APPLY (
SELECT SUM(e.amount) [amount]
FROM event e
WHERE e.event_type_id <> 47
AND EXISTS (
SELECT 1
FROM comm c
WHERE e.master_comm_id = c.comm_id
AND c.item_id = a.item_id
AND c.comp_type_id IN (20, 485)
)
) oa
if what you're saying about I only want the first master_comm_id, ie. the MIN one.
means you only want the sum from the min(master_comm_id)
then you can try something like this.
SELECT a.account_id,
oa.[amount]
FROM account a
OUTER APPLY (
SELECT MIN(e.master_comm_id) master_comm_id
FROM event e
WHERE e.event_type_id <> 47
AND EXISTS (
SELECT 1
FROM comm c
WHERE e.master_comm_id = c.comm_id
AND c.item_id = a.item_id
AND c.comp_type_id IN (20, 485)
)
) mmci
OUTER APPLY (
SELECT SUM(e.amount) [amount]
FROM event e
WHERE e.master_comm_id = mmci.master_comm_id
AND e.event_type_id <> 47
) oa
Upvotes: 0
Reputation: 1269823
The "normal" solution is to use IN
or = ANY
:
SELECT a.account_id,
(SELECT SUM(e.amount)
FROM event e
WHERE e.event_type_id <> 47 AND
e.master_comm_id IN (SELECT c.comm_id
FROM comm c
WHERE c.item_id = a.item_id AND
c.comp_type_id = 20 AND
c.comm_type_id = 485
)
) as sumamount
FROM account a;
There are other alternatives, such as an explicit JOIN
and GROUP BY
.
I actually notice that the subquery does not have a correlation clause. Without one, it will always returns the same value. It is clearer to put such queries in the FROM
clause:
SELECT a.account_id, e.sumamount
FROM account a CROSS JOIN
(SELECT SUM(e.amount) as sumamount
FROM event e
WHERE e.event_type_id <> 47 AND
e.master_comm_id IN (SELECT c.comm_id
FROM comm c
WHERE c.item_id = a.item_id AND
c.comp_type_id = 20 AND
c.comm_type_id = 485
)
) e;
Some databases will execute the subquery in the SELECT
for every row. SQL Server is probably smart enough to optimize that away. But, putting it in the FROM
clause guarantees that it is executed only once.
Upvotes: 2
Reputation: 5482
Top 1 works, but only if the data is ordered (per below SO)
You mentioned you tried MIN, but where? This may work (you are very close):
SELECT
a.account_id
,(SELECT SUM(e.amount)
FROM event e
WHERE e.event_type_id <> 47
AND e.master_comm_id = (SELECT **MIN**(c.comm_id)
FROM comm c
WHERE c.item_id = a.item_id
AND c.comp_type_id = 20
AND c.comm_type_id = 485))
FROM account
MAX vs Top 1 - which is better?
Upvotes: 0
Reputation: 761
Try this one. Since I don't have sample data, I cannot check, but it looks obvious - select only top 1, while the select list is ordered by the ID ascending...
SELECT
a.account_id
,(SELECT SUM(e.amount)
FROM event e
WHERE e.event_type_id <> 47
AND e.master_comm_id = (SELECT top 1 c.comm_id
FROM comm c
WHERE c.item_id = a.item_id
AND c.comp_type_id = 20
AND c.comm_type_id = 485
ORDER BY C.COMM_ID ASC))
FROM account a
Upvotes: 2