Reputation: 3527
Suppose I have a single table actions
like so:
+----+------------+--------+-------+
| id | rdate | clicks | epoch |
+----+------------+--------+-------+
| 1 | 2020-01-01 | 100 | 1200 |
| 1 | 2020-01-01 | 95 | 1100 |
| 1 | 2020-10-12 | 42 | 1000 |
| 1 | 2020-10-12 | 66 | 900 |
+----+------------+--------+-------+
I am trying to write a query that gives me the number of clicks based on the MAX(epoch)
for records with identical values in the id
and rdate
columns - the end result should be: (note, I do not need the epoch
column in the result)
+----+------------+--------+
| id | date | clicks |
+----+------------+--------+
| 1 | 2020-01-01 | 100 |
| 1 | 2020-10-12 | 42 |
+----+------------+--------+
I have tried the following query but the duplicates are still present in the result. The group by
query does remove duplicates when run by itself, but the inner join to get the clicks
does not work as intended.
SELECT
id,
rdate,
clicks
FROM actions a
INNER JOIN (
SELECT
id,
rdate,
MAX(epoch)
from actions
group by
id,
rdate
) b
on a.id = b.id
and a.rdate = b.rdate;
Upvotes: 1
Views: 65
Reputation: 175556
You could use QUALIFY and windowed function:
SELECT *
FROM actions
QUALIFY ROW_NUMBER() OVER(PARTITION BY id, rdate ORDER BY epoch DESC) = 1
Upvotes: 2
Reputation: 520878
I suggest using ROW_NUMBER
here:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY id, rdate ORDER BY epoch DESC) rn
FROM actions
)
SELECT id, rdate, clicks, epoch
FROM cte
WHERE rn = 1;
If you want to stick with your current join approach, then you need to fix the logic such that the join to the subquery also restricts the number of clicks:
SELECT a1.*
FROM actions a1
INNER JOIN
(
SELECT id, rdate, MAX(epoch) AS max_epoch
FROM actions
GROUP BY id, rdate
) a2
ON a2.id = a1.id AND a2.rdate = a1.rdate AND a2.max_epoch = a1.epoch;
Upvotes: 1