Reputation: 163
I have a table named Acount has following structure
acc_id name value phase date
1 acc_type trial start t1
1 name1 value1 phase1 t2
1 acc_type trial end t3
1 name2 value2 phase2 t4
1 acc_type trial start t5
1 name3 value3 phase2 t6
1 name4 value4 phase4 t7
1 acc_type trial end t8
Table is sorted based on the date
What I want is to add two column start_date
and end_date
and update the column for a particular pair of phase value start and end
where name
is acc_type
and value
is trial
.
So output will look like:
acc_id name value phase date start_date end_date
1 acc_type trial start t1 t1 t3
1 name1 value1 phase1 t2 t1 t3
1 acc_type trial end t3 t1 t3
1 name2 value2 phase2 t4
1 acc_type trial start t5 t5 t8
1 name3 value3 phase2 t6 t5 t8
1 name4 value4 phase4 t7 t5 t8
1 acc_type trial end t8 t5 t8
I am using PostgreSQL.
Upvotes: 0
Views: 389
Reputation: 1269753
This is complicated, because you need to partition by the starts and ends. If I assume that the proper ordering is the dates, then you can use cumulative max()
:
select t.*,
max(case when phase = 'start' then date end) over
(partition by acc_id order by date) as start_date,
max(case when phase = 'end' then date end) over
(partition by acc_id order by date) as end_date
from t;
EDIT:
I notice that there are values "in-between". This requires more complex logic. The idea is to count the net number of starts and ends, and to use this for filtering:
select t.*,
(case when netstartend > 0 or phase in ('start', 'end')
then max(case when phase = 'start' then date end) over
(partition by acc_id order by date)
end) as start_date,
(case when netstartend > 0 or phase in ('start', 'end')
then max(case when phase = 'end' then date end) over
(partition by acc_id order by date)
end) as end_date
from (select t.*,
sum(case when phase = 'start' then 1 when phase = 'end' then -1 else 0 end) over
(order by date) as netstartend
from t
) t;
Upvotes: 2