Reputation: 4643
Fairly new to SQL and it would be great to have a slicker take on this problem.
Here's an example table:
col1 | col2 | col3 | date |
---|---|---|---|
1 | 1 | 2 | 2011-01-22 |
1 | 1 | 4 | 2011-01-27 |
3 | 3 | 2 | 2011-01-20 |
3 | 8 | 4 | 2011-01-12 |
3 | 8 | 2 | 2011-01-30 |
4 | 1 | 3 | 2011-01-09 |
5 | 3 | 3 | 2011-01-27 |
5 | 4 | 2 | 2011-01-22 |
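and I'd like to return all distinct col1, col2 values where the pair occurs more than once and the col3 value on the later date is greater than the col3 value on the earlier date.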
In this case, the result should be
col1 | col2 |
---|---|
1 | 1 |
I can do
SELECT col1, col2
FROM table
GROUP BY col1, col2
HAVING count(*) > 1
to get the duplicate col1, col2 tuples but from there my solution just gets ugly, using multiple nested subqueries and sometimes repeating the same subquery.
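For anyone who wants to reproduce this, here is a minimal sketch that loads the example rows into an in-memory SQLite database and runs the duplicate-finding query. It uses Python's standard sqlite3 module purely as a harness. The table name `t` and the function name `find_duplicate_pairs` are placeholders of mine, not anything from the real schema.

```python
import sqlite3

# Sample rows copied from the example table in the question.
ROWS = [
    (1, 1, 2, "2011-01-22"),
    (1, 1, 4, "2011-01-27"),
    (3, 3, 2, "2011-01-20"),
    (3, 8, 4, "2011-01-12"),
    (3, 8, 2, "2011-01-30"),
    (4, 1, 3, "2011-01-09"),
    (5, 3, 3, "2011-01-27"),
    (5, 4, 2, "2011-01-22"),
]


def find_duplicate_pairs(rows):
    """Return the (col1, col2) pairs that occur more than once."""
    conn = sqlite3.connect(":memory:")
    # "t" is a stand-in name: "table" itself is a reserved word in SQLite.
    conn.execute(
        "CREATE TABLE t (col1 INTEGER, col2 INTEGER, col3 INTEGER, date TEXT)"
    )
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT col1, col2
        FROM t
        GROUP BY col1, col2
        HAVING count(*) > 1
        ORDER BY col1, col2
        """
    ).fetchall()
    conn.close()
    return result


print(find_duplicate_pairs(ROWS))  # [(1, 1), (3, 8)]
```

Both duplicated pairs come back here; the remaining step is to keep only the pairs whose col3 went up over time.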
What's a clean way to do this? It's in SQLite.
Upvotes: 0
Views: 61
Reputation: 521419
I would use ROW_NUMBER here with pivoting logic:
WITH cte AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY col1, col2
                              ORDER BY date) AS rn
    FROM yourTable
)
SELECT col1, col2
FROM cte
GROUP BY col1, col2
HAVING MAX(CASE WHEN rn = 2 THEN col3 END) >
       MAX(CASE WHEN rn = 1 THEN col3 END);
This approach is very close to what you are currently doing. The only addition is a HAVING clause which compares the more recent col3 value against the older one. Note that this answer assumes that your "duplicates" would only be occurring as pairs, and not triplets, etc.
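If a (col1, col2) pair can occur three or more times, one possible generalization is to number the rows in both date directions and compare the earliest col3 with the latest one. The sketch below assumes that "more recent" should mean the latest row of the group versus the earliest row, which the question does not confirm. It uses Python's sqlite3 module only as a harness, needs SQLite 3.25 or newer for window functions, and the rows for pairs (7, 7) and (9, 9) are made up for illustration.

```python
import sqlite3

# Sample rows from the question.
ROWS = [
    (1, 1, 2, "2011-01-22"),
    (1, 1, 4, "2011-01-27"),
    (3, 3, 2, "2011-01-20"),
    (3, 8, 4, "2011-01-12"),
    (3, 8, 2, "2011-01-30"),
    (4, 1, 3, "2011-01-09"),
    (5, 3, 3, "2011-01-27"),
    (5, 4, 2, "2011-01-22"),
]

# Hypothetical triplets, not part of the original data.
EXTRA = [
    (7, 7, 5, "2011-02-01"),
    (7, 7, 9, "2011-02-02"),
    (7, 7, 1, "2011-02-03"),  # latest (1) is below earliest (5)
    (9, 9, 1, "2011-02-01"),
    (9, 9, 0, "2011-02-02"),
    (9, 9, 3, "2011-02-03"),  # latest (3) is above earliest (1)
]

QUERY = """
WITH cte AS (
    SELECT col1, col2, col3,
           ROW_NUMBER() OVER (PARTITION BY col1, col2 ORDER BY date)      AS rn_first,
           ROW_NUMBER() OVER (PARTITION BY col1, col2 ORDER BY date DESC) AS rn_last
    FROM yourTable
)
SELECT col1, col2
FROM cte
GROUP BY col1, col2
HAVING MAX(CASE WHEN rn_last = 1 THEN col3 END) >
       MAX(CASE WHEN rn_first = 1 THEN col3 END)
ORDER BY col1, col2
"""


def latest_exceeds_earliest(rows):
    """Return pairs whose latest col3 is greater than their earliest col3."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE yourTable (col1 INTEGER, col2 INTEGER, col3 INTEGER, date TEXT)"
    )
    conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


print(latest_exceeds_earliest(ROWS + EXTRA))  # [(1, 1), (9, 9)]
```

Groups with a single row drop out on their own, because their earliest and latest row are the same and the strict comparison fails.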
Edit:
If, for some reason, you really can't use window functions, e.g. because you are using a version of SQLite older than 3.25 (the release that introduced them), we can still use roughly the above approach with a join:
SELECT t1.col1, t1.col2
FROM yourTable t1
INNER JOIN
(
    SELECT col1, col2, MIN(date) AS min_date, MAX(date) AS max_date
    FROM yourTable
    GROUP BY col1, col2
) t2
    ON t2.col1 = t1.col1 AND t2.col2 = t1.col2
GROUP BY t1.col1, t1.col2
HAVING
    MAX(CASE WHEN t1.date = t2.max_date THEN t1.col3 END) >
    MAX(CASE WHEN t1.date = t2.min_date THEN t1.col3 END);
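To decide whether the join fallback is needed at all, you can check the version of the SQLite library that is actually linked in. Below is a rough sketch using Python's sqlite3 module as the client, which is my assumption since the question does not say how the database is accessed. The helper names `supports_window_functions` and `run_join_fallback` are hypothetical.

```python
import sqlite3

# Sample rows from the question.
ROWS = [
    (1, 1, 2, "2011-01-22"),
    (1, 1, 4, "2011-01-27"),
    (3, 3, 2, "2011-01-20"),
    (3, 8, 4, "2011-01-12"),
    (3, 8, 2, "2011-01-30"),
    (4, 1, 3, "2011-01-09"),
    (5, 3, 3, "2011-01-27"),
    (5, 4, 2, "2011-01-22"),
]

# The join-based query from this answer; it uses no window functions.
JOIN_QUERY = """
SELECT t1.col1, t1.col2
FROM yourTable t1
INNER JOIN (
    SELECT col1, col2, MIN(date) AS min_date, MAX(date) AS max_date
    FROM yourTable
    GROUP BY col1, col2
) t2
    ON t2.col1 = t1.col1 AND t2.col2 = t1.col2
GROUP BY t1.col1, t1.col2
HAVING MAX(CASE WHEN t1.date = t2.max_date THEN t1.col3 END) >
       MAX(CASE WHEN t1.date = t2.min_date THEN t1.col3 END)
ORDER BY t1.col1, t1.col2
"""


def supports_window_functions():
    """Window functions were added in SQLite 3.25.0."""
    return sqlite3.sqlite_version_info >= (3, 25, 0)


def run_join_fallback(rows):
    """Load the rows into an in-memory table and run the join-based query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE yourTable (col1 INTEGER, col2 INTEGER, col3 INTEGER, date TEXT)"
    )
    conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(JOIN_QUERY).fetchall()
    conn.close()
    return result


print("SQLite version:", sqlite3.sqlite_version)
print("window functions available:", supports_window_functions())
print(run_join_fallback(ROWS))  # [(1, 1)]
```

Because the join version compares the rows at MIN(date) and MAX(date), it compares the earliest and latest row of each group no matter how many rows the group has. The date comparison works here because the dates are stored as ISO-8601 text, which sorts chronologically.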
Upvotes: 1
Reputation: 164099
You can do it with the FIRST_VALUE() window function:
SELECT col1, col2
FROM (
    SELECT DISTINCT col1, col2,
           FIRST_VALUE(col3) OVER (PARTITION BY col1, col2 ORDER BY date) earlier_col3,
           FIRST_VALUE(col3) OVER (PARTITION BY col1, col2 ORDER BY date DESC) latest_col3
    FROM tablename
)
WHERE latest_col3 > earlier_col3
See the demo.
Results:
> col1 | col2
> ---: | ---:
> 1 | 1
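To see why only (1, 1) survives, it can help to look at what the inner subquery produces before the WHERE filter is applied. Here is a small sketch that does this on the question's sample rows. It uses Python's sqlite3 module as a harness (my choice, not part of the original demo) and needs SQLite 3.25 or newer. The function names are hypothetical.

```python
import sqlite3

# Sample rows from the question.
ROWS = [
    (1, 1, 2, "2011-01-22"),
    (1, 1, 4, "2011-01-27"),
    (3, 3, 2, "2011-01-20"),
    (3, 8, 4, "2011-01-12"),
    (3, 8, 2, "2011-01-30"),
    (4, 1, 3, "2011-01-09"),
    (5, 3, 3, "2011-01-27"),
    (5, 4, 2, "2011-01-22"),
]

# Inner subquery of the answer, with an ORDER BY added for stable output.
INNER_QUERY = """
SELECT DISTINCT col1, col2,
       FIRST_VALUE(col3) OVER (PARTITION BY col1, col2 ORDER BY date) earlier_col3,
       FIRST_VALUE(col3) OVER (PARTITION BY col1, col2 ORDER BY date DESC) latest_col3
FROM tablename
ORDER BY col1, col2
"""


def earliest_and_latest(rows):
    """Return (col1, col2, earlier_col3, latest_col3), one row per pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tablename (col1 INTEGER, col2 INTEGER, col3 INTEGER, date TEXT)"
    )
    conn.executemany("INSERT INTO tablename VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(INNER_QUERY).fetchall()
    conn.close()
    return result


def increased_pairs(rows):
    """Apply the outer filter latest_col3 > earlier_col3 in Python."""
    return [(c1, c2) for c1, c2, early, late in earliest_and_latest(rows) if late > early]


for row in earliest_and_latest(ROWS):
    print(row)
print(increased_pairs(ROWS))  # [(1, 1)]
```

For pairs that occur only once, earlier_col3 and latest_col3 are the same value, so the strict comparison removes them without a separate count(*) check.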
Upvotes: 1