Steve
Steve

Reputation: 3095

Mysql select by condition unless another condition is met

I have a table that records transactions. I want to select all of the IDs that have exactly 4 transactions of value 2000, but then exclude those that have a single transaction of 2500.

SELECT t.tuserid, COUNT(*) AS CNT
  FROM transactions t
 WHERE t.amt = 2000
 GROUP BY t.tuserid
 HAVING CNT = 4;

That part is simple enough, but I'm not sure how to efficiently exclude anyone who has any transaction with a t.amt = 2500. Would a simple sub query on the where clause be the most efficient?

SELECT t.tuserid, COUNT(*) AS CNT
  FROM transactions t
 WHERE t.amt = 2000
 AND t.tuserid NOT IN (SELECT x.tuserid FROM transactions x WHERE x.amt=2500)
 GROUP BY t.tuserid
 HAVING CNT = 4;

The transactions table is large and I'm not sure that a sub-query is the most efficient way to run this process.

Upvotes: 4

Views: 76

Answers (2)

Fleury26
Fleury26

Reputation: 585

I don't have the resources to test this, but an alternative without a subquery could be to use only SUM

SELECT 
    t.tuserid, 
    SUM(CASE WHEN t.amt = 2000 THEN 1 ELSE 0 END CASE) AS CNT1, 
    SUM(CASE WHEN t.amt = 2500 THEN 1 ELSE 0 END CASE) AS CNT2
FROM 
    transactions t
GROUP BY 
    t.tuserid
HAVING 
    CNT1 = 4 AND CNT2 = 0;

Upvotes: 0

snap
snap

Reputation: 1925

Yes a sub query is fine here. Your subquery does not use any "variables" from the outer query so it's not a correlated subquery. A correlated is usually "executed" for each row of the outer query and are often a performance issue.

Your subquery would only be called one time per outer query call.

In addition it's allways a good idea to think about your indexes, so it would be a good idea to put one index on your used field inside the where clause. Then both queries should be in a complexity class of O(log(n)) which should be really fast even for big data tables.

Upvotes: 3

Related Questions