Reputation: 47
Consider the following data (data will be sorted by date/time) : -
Date Plan Value
2021-06-01 11111 10
2021-06-02 22222 20
2021-06-03 22222 30
2021-06-04 11111 40
I would like to increment a grouping column, based on "Plan" column so that I get the following output : -
2021-06-01 11111 10 1
2021-06-02 22222 20 2
2021-06-03 22222 30 2
2021-06-04 11111 40 3
Note that column 2 repeats itself, and is a different group when sorted by date.
I'm doing this to achieve the following result : -
Min(Date) Max(Date) Plan Sum(Value)
2021-06-01 2021-06-01 11111 10
2021-06-02 2021-06-03 22222 50
2021-06-04 2021-06-04 11111 40
I've tried various rank combinations, but can't get what I want. I can GROUP BY "Plan", but that obviously combines all data, and the Min/Max dates will overlap.
Thanks in advance.
Edit: -
I've written 2 SQL's, one with MATCH_RECOGNIZE and one without. The results are slightly different. Note the differences in the "End" column.
SQL 1 : -
SELECT sql_id "SQL Id",
MIN(begin_interval_time) "Begin",
MAX(end_interval_time) "End",
plan_hash_value "Plan Hash Value",
SUM(executions_delta) "Execs",
AVG(avg_exec_time) "Avg Exec Time",
AVG(avg_l_io) "Avg L I/O",
SUM(rows_processed_delta) "Rows Processed"
FROM (
SELECT sql_id,
begin_interval_time,
end_interval_time,
plan_hash_value,
NVL(executions_delta,0) executions_delta,
(elapsed_time_delta/DECODE(NVL(executions_delta,0),0,1,executions_delta))/1000000 avg_exec_time,
(buffer_gets_delta/DECODE(NVL(buffer_gets_delta,0),0,1,executions_delta)) avg_l_io,
rows_processed_delta,
ROW_NUMBER() OVER (ORDER BY begin_interval_time) seqnum1,
ROW_NUMBER() OVER (PARTITION BY plan_hash_value ORDER BY begin_interval_time) seqnum2
FROM dba_hist_sqlstat s1, dba_hist_snapshot s2
WHERE sql_id = '5wg8n52rynpvj'
AND s2.snap_id = s1.snap_id
AND s2.instance_number = s1.instance_number
AND executions_delta > 0
)
GROUP BY sql_id, plan_hash_value, seqnum1-seqnum2
ORDER BY 1,2
Output 1: -
SQL Id Begin End Plan Hash Value Execs Avg Exec Time Avg L I/O Rows Processed
------------- ------------------------------ ------------------------------ --------------- ---------- ------------- ---------- --------------
5wg8n52rynpvj 04-MAY-21 12.00.09.132 AM 09-MAY-21 06.30.58.395 AM 748165266 156094 1.30704831 110135.57 1826328853
5wg8n52rynpvj 09-MAY-21 08.30.12.526 AM 11-MAY-21 04.30.31.848 AM 3874464562 42102 16.9870354 1481441.73 505289699
5wg8n52rynpvj 11-MAY-21 04.30.31.848 AM 23-MAY-21 06.30.03.959 AM 748165266 351686 .521392859 99585.4432 4102214059
5wg8n52rynpvj 23-MAY-21 08.30.13.248 PM 26-MAY-21 10.00.25.290 PM 3693159187 113123 16.0489399 1416706.63 1360240951
5wg8n52rynpvj 26-MAY-21 10.00.25.290 PM 26-MAY-21 10.15.36.640 PM 748165266 193 .652365228 158979.565 3477270
5wg8n52rynpvj 26-MAY-21 10.00.25.290 PM 26-MAY-21 10.15.36.640 PM 3693159187 172 15.2876871 1658063.17 1679077
5wg8n52rynpvj 26-MAY-21 10.15.36.640 PM 12-JUN-21 02.30.01.255 PM 748165266 472571 .84639311 110483.743 5755450592
5wg8n52rynpvj 12-JUN-21 08.15.39.473 PM 14-JUN-21 09.15.14.253 AM 2942889555 8613 5.45223841 418713.443 118219750
5wg8n52rynpvj 14-JUN-21 09.15.14.253 AM 18-JUN-21 11.30.22.019 AM 748165266 165678 .483631221 69569.0093 1915023703
SQL 2 : -
WITH x AS (
SELECT sql_id,
begin_interval_time,
end_interval_time,
plan_hash_value,
NVL(executions_delta,0) executions_delta,
(elapsed_time_delta/DECODE(NVL(executions_delta,0),0,1,executions_delta))/1000000 avg_exec_time,
(buffer_gets_delta/DECODE(NVL(buffer_gets_delta,0),0,1,executions_delta)) avg_l_io,
rows_processed_delta
FROM dba_hist_sqlstat s1, dba_hist_snapshot s2
WHERE sql_id = '5wg8n52rynpvj'
AND s2.snap_id = s1.snap_id
AND s2.instance_number = s1.instance_number
AND executions_delta > 0
)
SELECT sql_id "SQL Id",
min_interval_time "Begin",
max_interval_time "End",
plan_hash_value "Plan Hash Value",
sum_executions_delta "Execs",
avg_exec_time "Avg Exec Time",
avg_l_io "Avg L I/O",
sum_rows_processed_delta "Rows Processed"
FROM x
MATCH_RECOGNIZE (
ORDER BY begin_interval_time
MEASURES sql_id AS sql_id,
f.plan_hash_value AS plan_hash_value,
f.begin_interval_time AS min_interval_time,
LAST(begin_interval_time) AS max_interval_time,
SUM(executions_delta) AS sum_executions_delta,
AVG(avg_exec_time) AS avg_exec_time,
AVG(avg_l_io) AS avg_l_io,
SUM(rows_processed_delta) AS sum_rows_processed_delta
PATTERN (f r*)
DEFINE r AS plan_hash_value = f.plan_hash_value
)
ORDER BY 1,2
Output 2 : -
SQL Id Begin End Plan Hash Value Execs Avg Exec Time Avg L I/O Rows Processed
------------- ------------------------------ ------------------------------ --------------- ---------- ------------- ---------- --------------
5wg8n52rynpvj 04-MAY-21 12.00.09.132 AM 09-MAY-21 06.15.48.898 AM 748165266 156094 1.30704831 110135.57 1826328853
5wg8n52rynpvj 09-MAY-21 08.30.12.526 AM 11-MAY-21 04.15.21.707 AM 3874464562 42102 16.9870354 1481441.73 505289699
5wg8n52rynpvj 11-MAY-21 04.30.31.848 AM 23-MAY-21 06.15.55.101 AM 748165266 351686 .521392859 99585.4432 4102214059
5wg8n52rynpvj 23-MAY-21 08.30.13.248 PM 26-MAY-21 09.45.15.070 PM 3693159187 113123 16.0489399 1416706.63 1360240951
5wg8n52rynpvj 26-MAY-21 10.00.25.290 PM 26-MAY-21 10.00.25.290 PM 748165266 193 .652365228 158979.565 3477270
5wg8n52rynpvj 26-MAY-21 10.00.25.290 PM 26-MAY-21 10.00.25.290 PM 3693159187 172 15.2876871 1658063.17 1679077
5wg8n52rynpvj 26-MAY-21 10.15.36.640 PM 12-JUN-21 02.15.48.069 PM 748165266 472571 .84639311 110483.743 5755450592
5wg8n52rynpvj 12-JUN-21 08.15.39.473 PM 14-JUN-21 09.00.44.608 AM 2942889555 8613 5.45223841 418713.443 118219750
5wg8n52rynpvj 14-JUN-21 09.15.14.253 AM 18-JUN-21 11.15.08.574 AM 748165266 165678 .483631221 69569.0093 1915023703
Upvotes: 1
Views: 328
Reputation:
with
sample_data (dt, plan, value) as (
select date '2021-06-01', 11111, 10 from dual union all
select date '2021-06-02', 22222, 20 from dual union all
select date '2021-06-03', 22222, 30 from dual union all
select date '2021-06-04', 11111, 40 from dual
)
select min_date, max_date, plan, sum_value
from sample_data
match_recognize(
order by dt
measures f.plan as plan,
f.dt as min_date,
last(dt) as max_date,
sum(value) as sum_value
pattern (f r*)
define r as plan = f.plan
)
;
MIN_DATE MAX_DATE PLAN SUM_VALUE
---------- ---------- ---------- ----------
2021-06-01 2021-06-01 11111 10
2021-06-02 2021-06-03 22222 50
2021-06-04 2021-06-04 11111 40
The with
clause is not part of the query; I included it for testing purposes only. Remove it, and use your actual table and column names in the select
statement (main query).
Limitation - the match_recognize
clause is only available since Oracle 12.1.
Upvotes: 1
Reputation: 1270793
This is a type of gaps-and-islands problem. The simplest method for your final result is the difference of row numbers:
select plan, min(date), max(date), sum(value)
from (select t.*,
row_number() over (order by date) as seqnum,
row_number() over (partition by plan order by date) as seqnum_2
from t
) t
group by plan, (seqnum - seqnum_2);
Actually, if the dates are sequential with no gaps, a simpler version is:
select plan, min(date), max(date), sum(value)
from (select t.*,
row_number() over (partition by plan order by date) as seqnum_2
from t
) t
group by plan, (date - seqnum_2);
Upvotes: 1