Reputation: 83
I have the below table:
The dates will run until Jan of next year for both periods and it has more countries, cat1, 2, 3. This is just showing a few combinations
yyyymmdd | period | country | cat1 | cat2 | cat3 | score |
---|---|---|---|---|---|---|
20191201 | A | US | Sports | Cycle | long | 33 |
20191201 | A | US | Sports | Cycle | Short | 35 |
20191201 | A | US | Sports | Running | long | 28 |
20191201 | A | US | Sports | Running | short | 29 |
20191201 | A | US | Arts | |||
20191202 | A | US | Sports | Cycle | long | 43 |
20191202 | A | US | Sports | Cycle | Short | 44 |
20191202 | A | US | Sports | Running | long | 35 |
20191202 | A | US | Sports | Running | short | 37 |
20191202 | A | US | Arts | |||
20201201 | B | US | Sports | Cycle | long | 33 |
20201201 | B | US | Sports | Cycle | Short | 35 |
20201201 | B | US | Sports | Running | long | 28 |
20201201 | B | US | Sports | Running | short | 29 |
20201201 | B | US | Arts |
I like to find the score from day 1 for each groupings (period, country, cat1, cat2, cat3) and add this day 1 score as a additional column so it looks like this:
yyyymmdd | period | country | cat1 | cat2 | cat3 | score | score_day1 |
---|---|---|---|---|---|---|---|
20191201 | A | US | Sports | Cycle | long | 33 | 33 |
20191201 | A | US | Sports | Cycle | Short | 35 | 35 |
20191201 | A | US | Sports | Running | long | 28 | 28 |
20191201 | A | US | Sports | Running | short | 29 | 29 |
20191201 | A | US | Arts | ||||
20191202 | A | US | Sports | Cycle | long | 43 | 33 |
20191202 | A | US | Sports | Cycle | Short | 44 | 35 |
20191202 | A | US | Sports | Running | long | 35 | 28 |
20191202 | A | US | Sports | Running | short | 37 | 29 |
20191202 | A | US | Arts | ||||
20201201 | B | US | Sports | Cycle | long | 41 | 41 |
20201201 | B | US | Sports | Cycle | Short | 38 | 38 |
20201201 | B | US | Sports | Running | long | 50 | 50 |
20201201 | B | US | Sports | Running | short | 51 | 51 |
20201201 | B | US | Arts | ||||
20201202 | B | US | Sports | Cycle | long | 44 | 41 |
20201202 | B | US | Sports | Cycle | Short | 51 | 38 |
20201202 | B | US | Sports | Running | long | 60 | 50 |
20201202 | B | US | Sports | Running | short | 38 | 51 |
20201202 | B | US | Arts |
Will appreciate any help. Thank you!!
Upvotes: 0
Views: 164
Reputation: 2051
Does this answer?
SELECT yyyymmdd, period, country, cat1, cat2, cat3, score, (select score from tbl t2 where yyyymmdd=(select min(yyyymmdd) from tbl where period=t1.period) and t1.period=t2.period and t1.country=t2.country and t1.cat1=t2.cat1 and t1.cat2=t2.cat2 and t1.cat3=t2.cat3 limit 1) score_day1 from tbl t1
Here is a fiddle: https://www.db-fiddle.com/f/njLJnbNkFEdjSNUfopX7Ye/0
Upvotes: 1
Reputation: 1106
You may also try using subquery
and JOIN
.
Query:
SELECT t.yyyymmdd, t.period, t.country, t.cat1, t.cat2, t.cat3, t.score,
sq.score AS score_day1
FROM t
LEFT JOIN
(
SELECT yyyymmdd, period, country, cat1, cat2, cat3, score
FROM
(
SELECT yyyymmdd, period, country, cat1, cat2, cat3, score,
RANK() OVER(PARTITION BY period ORDER BY yyyymmdd ASC) AS rnk
FROM t
) AS r
WHERE rnk=1
) AS sq
ON t.period=sq.period
AND t.country=sq.country
AND t.cat1=sq.cat1
AND t.cat2=sq.cat2
AND t.cat3=sq.cat3
ORDER BY t.period, t.yyyymmdd
Explanation:
period
1
original
tableSee demo in db<>fiddle
Upvotes: 1
Reputation: 13006
You can use left join
but get the latest score first using rownum()
select b.*, a.score
from (select *, row_number() over (partition by period, country, cat1, cat2, cat3
order by yyyymmmdd asc) as rn from test) a
left join test b
on b.period = a.period and b.country=a.country and a.cat1 = b.cat1
and a.cat2 = b.cat2 and a.cat3 = b.cat3 and a.rn=1
see dbfiddle
Upvotes: 1