Finding all two column duplicates in SQL table where later value of third column is greater than earlier value

Question

Fairly new to SQL and it would be great to have a slicker take on this problem.

Here's an example table:

col1	col2	col3	date
1	1	2	2011-01-22
1	1	4	2011-01-27
3	3	2	2011-01-20
3	8	4	2011-01-12
3	8	2	2011-01-30
4	1	3	2011-01-09
5	3	3	2011-01-27
5	4	2	2011-01-22

and I'd like to return all distinct col1, col2 values where

The col1 and col2 values are the same (duplicated) and
The latest col3 value is greater than the earliest one

In this case, the result should be

col1	col2
1	1

I can do

SELECT col1, col2
FROM yourTable
GROUP BY col1, col2
HAVING count(*) > 1

to get the duplicate col1, col2 tuples but from there my solution just gets ugly, using multiple nested subqueries and sometimes repeating the same subquery.

What's a clean way to do this? It's in SQLite.

Tim Biegeleisen · Accepted Answer

I would use ROW_NUMBER here with pivoting logic:

WITH cte AS (
    SELECT *,
        ROW_NUMBER() OVER (PARTITION BY col1, col2
                           ORDER BY date) rn
    FROM yourTable
)

SELECT col1, col2
FROM cte
GROUP BY col1, col2
HAVING MAX(CASE WHEN rn = 2 THEN col3 END) >
       MAX(CASE WHEN rn = 1 THEN col3 END);


This approach is very close to what you are currently doing. The only addition is a HAVING clause which compares the more recent col3 value against the older one. Note that this answer assumes that your "duplicates" would only be occurring as pairs, and not triplets, etc.
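If the duplicates can show up as triplets or larger groups, one possible adjustment is to number the rows from both ends of each group and compare the earliest row against the latest one. This is only a sketch and not part of the query above: the column aliases rn_first and rn_last, the helper first_vs_last, and the extra (3, 8, 5, 2011-01-31) row are made up for illustration, and Python's built-in sqlite3 module is used purely to make the example runnable. It assumes an SQLite build with window function support (3.25 or later) and no tied dates within a group.

```python
import sqlite3

# Rows from the question, plus ONE hypothetical extra row (the last one)
# that turns the (3, 8) pair into a triplet.
ROWS = [
    (1, 1, 2, "2011-01-22"),
    (1, 1, 4, "2011-01-27"),
    (3, 3, 2, "2011-01-20"),
    (3, 8, 4, "2011-01-12"),
    (3, 8, 2, "2011-01-30"),
    (4, 1, 3, "2011-01-09"),
    (5, 3, 3, "2011-01-27"),
    (5, 4, 2, "2011-01-22"),
    (3, 8, 5, "2011-01-31"),  # hypothetical, not in the question
]

# rn_first = 1 marks the earliest row of each (col1, col2) group,
# rn_last = 1 marks the latest row.
QUERY = """
WITH cte AS (
    SELECT *,
        ROW_NUMBER() OVER (PARTITION BY col1, col2 ORDER BY date)      AS rn_first,
        ROW_NUMBER() OVER (PARTITION BY col1, col2 ORDER BY date DESC) AS rn_last
    FROM yourTable
)
SELECT col1, col2
FROM cte
GROUP BY col1, col2
HAVING COUNT(*) > 1
   AND MAX(CASE WHEN rn_last = 1 THEN col3 END) >
       MAX(CASE WHEN rn_first = 1 THEN col3 END)
ORDER BY col1, col2;
"""


def first_vs_last(rows):
    """Load rows into an in-memory table and run the first-vs-last query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE yourTable (col1 INTEGER, col2 INTEGER, col3 INTEGER, date TEXT)"
    )
    conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(first_vs_last(ROWS))
```

With the hypothetical extra row, (3, 8) qualifies because its latest col3 (5) exceeds its earliest (4), even though its second row does not. The rn = 2 versus rn = 1 comparison would miss that group.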

Edit:

If, for some reason, you really can't use window functions, e.g. because you are using a version of SQLite older than 3.25 (the release that added them), we can still use roughly the above approach with a join:

SELECT t1.col1, t1.col2
FROM yourTable t1
INNER JOIN
(
    SELECT col1, col2, MIN(date) AS min_date, MAX(date) AS max_date
    FROM yourTable
    GROUP BY col1, col2
) t2
    ON t2.col1 = t1.col1 AND t2.col2 = t1.col2
GROUP BY t1.col1, t1.col2
HAVING
    MAX(CASE WHEN t1.date = t2.max_date THEN t1.col3 END) >
    MAX(CASE WHEN t1.date = t2.min_date THEN t1.col3 END);
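To sanity-check the join version against the sample table from the question, here is a minimal sketch using Python's built-in sqlite3 module. The helper name run_join_query is hypothetical and the column types are an assumption (the question doesn't give a schema). Dates are stored as ISO-8601 text, which sorts correctly with MIN and MAX.

```python
import sqlite3

# The eight sample rows from the question.
ROWS = [
    (1, 1, 2, "2011-01-22"),
    (1, 1, 4, "2011-01-27"),
    (3, 3, 2, "2011-01-20"),
    (3, 8, 4, "2011-01-12"),
    (3, 8, 2, "2011-01-30"),
    (4, 1, 3, "2011-01-09"),
    (5, 3, 3, "2011-01-27"),
    (5, 4, 2, "2011-01-22"),
]

# The join-based query from the answer; no window functions needed.
JOIN_QUERY = """
SELECT t1.col1, t1.col2
FROM yourTable t1
INNER JOIN
(
    SELECT col1, col2, MIN(date) AS min_date, MAX(date) AS max_date
    FROM yourTable
    GROUP BY col1, col2
) t2
    ON t2.col1 = t1.col1 AND t2.col2 = t1.col2
GROUP BY t1.col1, t1.col2
HAVING
    MAX(CASE WHEN t1.date = t2.max_date THEN t1.col3 END) >
    MAX(CASE WHEN t1.date = t2.min_date THEN t1.col3 END);
"""


def run_join_query(rows):
    """Create an in-memory copy of the table and return the matching pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE yourTable (col1 INTEGER, col2 INTEGER, col3 INTEGER, date TEXT)"
    )
    conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(JOIN_QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(run_join_query(ROWS))
```

Groups with a single row drop out on their own: their earliest and latest dates coincide, so the HAVING comparison is col3 > col3, which is never true. That is why no separate COUNT(*) > 1 filter is needed here.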
