Reputation: 162
I have a simple Postgres table with three attributes (f1, f2 and f3) plus a date column, datum; the rows are ordered by date (db<>fiddle).
I want to transform (group?) the result into the following table:
f1 | f2 | f3 | beginn | end |
---|---|---|---|---|
A | A | A | 31.12.2017 | 05.01.2018 |
B | B | B | 06.01.2018 | 10.01.2018 |
A | A | A | 11.01.2018 | 16.01.2018 |
I tried (but obviously failed) with:
SELECT
a.f1
,a.f2
,a.f3
,MIN(datum) AS beginn
,MAX(datum) AS end
FROM
test1 a
GROUP BY 1,2,3
I experimented a bit with window functions, but that did not lead anywhere.
Do you have any ideas how I can get to the desired result?
Upvotes: 3
Views: 1725
Reputation: 15893
Thanks a lot for adding the db<>fiddle link to the question.
I have grouped the rows by consecutive occurrences of f1, f2 and f3. That means that if the same combination of f1, f2 and f3 appears again after a different combination, it is treated as a new group when calculating beginn and ends.
Because the groups are defined by row order rather than by date arithmetic, this also works when dates are missing between rows.
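To see why a running sum yields one number per consecutive run, it can help to look at the intermediate columns. The following is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for Postgres (window functions need SQLite 3.25 or newer), a shortened copy of the sample data with ISO-formatted date strings and deliberate date gaps, and the hypothetical column names changed and grp.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for Postgres,
# and dates are stored as ISO strings so they sort chronologically.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test1 (datum TEXT, f1 TEXT, f2 TEXT, f3 TEXT)")
rows = [
    ("2017-12-31", "A", "A", "A"),
    ("2018-01-01", "A", "A", "A"),
    ("2018-01-06", "B", "B", "B"),  # date gap before this row
    ("2018-01-07", "B", "B", "B"),
    ("2018-01-11", "A", "A", "A"),  # same values as the first run, but a new run
    ("2018-01-12", "A", "A", "A"),
]
conn.executemany("INSERT INTO test1 VALUES (?, ?, ?, ?)", rows)

query = """
WITH flagged AS (
    SELECT datum, f1, f2, f3,
           -- 1 when the row differs from the previous row (or has none), else 0
           CASE WHEN f1 = LAG(f1) OVER (ORDER BY datum)
                 AND f2 = LAG(f2) OVER (ORDER BY datum)
                 AND f3 = LAG(f3) OVER (ORDER BY datum)
                THEN 0 ELSE 1 END AS changed
    FROM test1
)
SELECT datum, f1, changed,
       -- running total of the change flags: one number per consecutive run
       SUM(changed) OVER (ORDER BY datum) AS grp
FROM flagged
ORDER BY datum
"""
steps = conn.execute(query).fetchall()
for step in steps:
    print(step)
```

In this sketch the two A runs end up with different grp values (1 and 3), which is what keeps them apart once grp is added to the GROUP BY.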
Schema:
DROP TABLE IF EXISTS test1;
CREATE TABLE test1 (
datum DATE
,f1 VARCHAR
,f2 VARCHAR
,f3 VARCHAR)
;
INSERT INTO test1 VALUES
('31.12.2017', 'A', 'A', 'A')
,('01.01.2018', 'A', 'A', 'A')
,('02.01.2018', 'A', 'A', 'A')
,('03.01.2018', 'A', 'A', 'A')
,('04.01.2018', 'A', 'A', 'A')
,('05.01.2018', 'A', 'A', 'A')
,('06.01.2018', 'B', 'B', 'B')
,('07.01.2018', 'B', 'B', 'B')
,('08.01.2018', 'B', 'B', 'B')
,('09.01.2018', 'B', 'B', 'B')
,('10.01.2018', 'B', 'B', 'B')
,('11.01.2018', 'A', 'A', 'A')
,('12.01.2018', 'A', 'A', 'A')
,('13.01.2018', 'A', 'A', 'A')
,('14.01.2018', 'A', 'A', 'A')
,('15.01.2018', 'A', 'A', 'A')
,('16.01.2018', 'A', 'A', 'A');
Query:
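WITH cte AS (
SELECT *
-- 1 when any of f1, f2, f3 differs from the previous row (or there is none), else 0
, CASE WHEN f1 = LAG(f1) OVER w
AND f2 = LAG(f2) OVER w
AND f3 = LAG(f3) OVER w
THEN 0 ELSE 1 END AS changed
FROM test1
WINDOW w AS (ORDER BY datum)
),
cte2 AS
(
-- the running sum of the change flags gives one number per consecutive run
SELECT *, SUM(changed) OVER (ORDER BY datum) AS rnk
FROM cte
)
SELECT f1, f2, f3, MIN(datum) AS beginn, MAX(datum) AS ends
FROM cte2
GROUP BY f1, f2, f3, rnk
ORDER BY beginn;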
Output:
f1 | f2 | f3 | beginn | ends |
---|---|---|---|---|
A | A | A | 2017-12-31 | 2018-01-05 |
B | B | B | 2018-01-06 | 2018-01-10 |
A | A | A | 2018-01-11 | 2018-01-16 |
db<>fiddle here
Upvotes: 2
Reputation: 1269583
If you enumerate the rows for each combination of f1, f2, and f3, then subtract that number of days from datum, then the value will be constant on adjacent days where the three columns are the same.
The rest is just aggregation:
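select f1, f2, f3, min(datum) as beginn, max(datum) as ends
from (select t1.*,
-- position of each row within its (f1, f2, f3) combination
row_number() over (partition by f1, f2, f3 order by datum) as seqnum
from test1 t1
) t1
-- datum minus seqnum days is constant within a run of adjacent days
group by f1, f2, f3, datum - seqnum * interval '1 day'
order by beginn;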
Here is a db<>fiddle.
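The constant-difference idea can be checked by printing the row number next to the shifted date. This is a rough sketch, not the answer's own code: it assumes Python's built-in sqlite3 module as a stand-in for Postgres (SQLite 3.25 or newer), ISO date strings, a shortened data set without date gaps, SQLite's date() modifier in place of Postgres interval arithmetic, and the hypothetical column name anchor.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Postgres; dates are ISO strings.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test1 (datum TEXT, f1 TEXT, f2 TEXT, f3 TEXT)")
rows = [
    ("2018-01-04", "A", "A", "A"),
    ("2018-01-05", "A", "A", "A"),
    ("2018-01-06", "B", "B", "B"),
    ("2018-01-07", "B", "B", "B"),
    ("2018-01-08", "A", "A", "A"),
    ("2018-01-09", "A", "A", "A"),
]
conn.executemany("INSERT INTO test1 VALUES (?, ?, ?, ?)", rows)

query = """
SELECT datum, f1, seqnum,
       -- SQLite spelling of "datum minus seqnum days"
       date(datum, '-' || seqnum || ' days') AS anchor
FROM (SELECT t1.*,
             ROW_NUMBER() OVER (PARTITION BY f1, f2, f3 ORDER BY datum) AS seqnum
      FROM test1 t1) AS t1
ORDER BY datum
"""
steps = conn.execute(query).fetchall()
for step in steps:
    print(step)
```

In this sketch the first A run shares the anchor 2018-01-03 and the second A run shares 2018-01-05. The B run also lands on 2018-01-05, which is why f1, f2, and f3 stay in the GROUP BY alongside the shifted date. Unlike an approach based on row order alone, a missing day inside a run would change the anchor and split that run in two.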
Upvotes: 3