Reputation: 351
I have been learning more about making my sql queries more efficient, but I don't know if I'm doing this query properly.
I'm trying to find a set in one group, but not the other group. This is the original query which works:
SELECT
cNumber
FROM
[dataset.table]
WHERE
DATE(date) >= '2016-01-01'
AND cNumber NOT IN (
SELECT
cNumber
FROM
[dataset.table]
WHERE
date BETWEEN DATEDIFF(CURRENT_DATE(),-5, "QUARTER")
AND DATEDIFF(CURRENT_DATE(), -1, "QUARTER"))
GROUP BY
1
When I change my query to the one below, I get a larger number of results:
SELECT
cNumber
FROM
dataset.table
WHERE
DATE(date) >= '2016-01-01'
AND date NOT BETWEEN DATEDIFF(CURRENT_DATE(), -5, "QUARTER")
AND DATEDIFF(CURRENT_DATE(), -1, "QUARTER")
GROUP BY
1
For this type of query, do I need to include a NOT IN
subquery or should I could I just use the second query?
I think I am confused in the second query by trying to include the AND orderCreationDate NOT BETWEEN
. I thought that since I'm doing a subquery on the same table, that I would be able to wrap it up into one query.
Upvotes: 0
Views: 65
Reputation: 172994
for BigQuery (Legacy SQL) try below
SELECT cNumber
FROM [dataset.table]
GROUP BY cNumber
HAVING MAX(DATE(date)) >= '2016-01-01'
AND SUM(CASE
WHEN DATE(date) BETWEEN DATE(DATE_ADD(CURRENT_DATE(), -5, "QUARTER"))
AND DATE(DATE_ADD(CURRENT_DATE(), -1, "QUARTER"))
THEN 1 ELSE 0
END) = 0
Upvotes: 1
Reputation: 1269623
I think this may do what you want:
SELECT cNumber
FROM [dataset.table]
GROUP BY cnumber
HAVING MAX(DATE(date)) >= '2016-01-01' AND
SUM(CASE WHEN date BETWEEN DATEDIFF(CURRENT_DATE(), -5, "QUARTER") AND
DATEDIFF(CURRENT_DATE(), -1, "QUARTER"))
THEN 1 ELSE 0
END) = 0;
That is, choose cnumber
s that have a record sometime since 2016-01-01 and don't have one in the other period specified.
Whether this is faster or not, I do not know. You have to try it out.
Upvotes: 1