Reputation: 11
I have got in the DB data like you can see below (additional info about dates: date in valid_from is included, date in valid_to is excluded)
obj_number | status | valid_from | valid_to |
---|---|---|---|
A1001 | active | 01.01.2018 | 01.01.2019 |
A1001 | pending | 01.01.2019 | 31.03.2019 |
A1001 | pending | 31.03.2019 | 30.06.2019 |
A1001 | pending | 30.06.2019 | 31.12.2019 |
A1001 | active | 31.12.2019 | 31.12.2020 |
A1001 | active | 31.12.2020 |
I have to merge the data, when status changes, but valid_from should be from the oldest record and valid_to should be from the newest record. The result from the data you can see above should be like you can see below
obj_number | status | valid_from | valid_to |
---|---|---|---|
A1001 | active | 01.01.2018 | 01.01.2019 |
A1001 | pending | 01.01.2019 | 31.12.2019 |
A1001 | active | 31.12.2019 |
how to prepare a sql query to achieve it?
thanks in advance
Upvotes: 1
Views: 140
Reputation: 1269843
This is a type of gaps-and-islands problem. Assuming there are no gaps on adjacent, the difference of row_number()
s is the simplest approach:
select obj_number, status, min(valid_from),
(case when count(*) = count(valid_to)
then max(valid_to)
end)
from (select t.*,
row_number() over (partition by obj_number) order by valid_from) as seqnum_2,
row_number() over (partition by obj_number, status order by valid_from) as seqnum
from t
) t
group by obj_number, status, (seqnum - seqnum_2);
Note the extra logic to handle NULL
values for valid_to
.
Upvotes: 1
Reputation: 35910
You need to check status
and also dates (if not overlapping then different row should be generated for same consecutive status) so you can use the lag
and sum
analytical function along with the group by as follows:
select obj_number, status, min(valid_from) as valid_from,
case when count(case when valid_to is null then 1 end) = 0
then max(valid_to)
end as valid_to
From
(select t.*,
sum(case when lgtodt >= valid_from then 0 else 1 end)
over (partition by obj_number, status order by valid_from) as sm
from
(select t.*,
lag(valid_to) over (partition by obj_number, status order by valid_from) as lgtodt
from your_table t) t ) t
group by obj_number, status, sm
Upvotes: 1
Reputation: 540
Here is a fun way to do it (assuming you are on oracle 12c+):
select *
from tt
match_recognize (
partition by
obj_number
order by
valid_from
measures
init.status as status,
first(valid_from) as valid_from,
last(valid_to) as valid_to
one row per match
pattern (init same_status*)
define
same_status as status = prev(status)
)
;
Using MATCH_RECOGNIZE we can
Logically partition and order the data that is used in the MATCH_RECOGNIZE clause with its PARTITION BY and ORDER BY clauses.
Define patterns of rows to seek using the PATTERN clause of the MATCH_RECOGNIZE clause. These patterns use regular expression syntax, a powerful and expressive feature, applied to the pattern variables you define.
Specify the logical conditions required to map a row to a row pattern variable in the DEFINE clause.
Define measures, which are expressions usable in other parts of the SQL query, in the MEASURES clause.
So how does it work?
partition by
obj_numberorder by
rows inside the partitions using valid_frommeasures
pattern
, in this case we take a row and then we try to match as many same_status rows as possible (*
is the 0-many repetition symbol as in regular expressions)define
what same_status means, in this case the row is same_status if it has the same status as the previous rowUpvotes: 2