BenM
BenM

Reputation: 47

Auto Increment For Change In Value in Oracle SQL

Consider the following data (data will be sorted by date/time) : -

Date       Plan  Value
2021-06-01 11111 10
2021-06-02 22222 20
2021-06-03 22222 30
2021-06-04 11111 40

I would like to increment a grouping column, based on "Plan" column so that I get the following output : -

2021-06-01 11111 10 1
2021-06-02 22222 20 2
2021-06-03 22222 30 2
2021-06-04 11111 40 3

Note that column 2 repeats itself, and is a different group when sorted by date.

I'm doing this to achieve the following result : -

Min(Date)  Max(Date)   Plan Sum(Value)
2021-06-01 2021-06-01 11111 10
2021-06-02 2021-06-03 22222 50
2021-06-04 2021-06-04 11111 40

I've tried various rank combinations, but can't get what I want. I can GROUP BY "Plan", but that obviously combines all data, and the Min/Max dates will overlap.

Thanks in advance.

Edit: -

I've written 2 SQL's, one with MATCH_RECOGNIZE and one without. The results are slightly different. Note the differences in the "End" column.

SQL 1 : -

SELECT sql_id                    "SQL Id",
       MIN(begin_interval_time)  "Begin",
       MAX(end_interval_time)    "End",
       plan_hash_value           "Plan Hash Value",
       SUM(executions_delta)     "Execs",
       AVG(avg_exec_time)        "Avg Exec Time",
       AVG(avg_l_io)             "Avg L I/O",
       SUM(rows_processed_delta) "Rows Processed"
FROM (
        SELECT sql_id,
               begin_interval_time,
               end_interval_time,
               plan_hash_value,
               NVL(executions_delta,0) executions_delta,
               (elapsed_time_delta/DECODE(NVL(executions_delta,0),0,1,executions_delta))/1000000 avg_exec_time,
               (buffer_gets_delta/DECODE(NVL(buffer_gets_delta,0),0,1,executions_delta))         avg_l_io,
               rows_processed_delta,
               ROW_NUMBER() OVER (ORDER BY begin_interval_time) seqnum1,
               ROW_NUMBER() OVER (PARTITION BY plan_hash_value ORDER BY begin_interval_time) seqnum2
        FROM  dba_hist_sqlstat s1, dba_hist_snapshot s2
        WHERE sql_id           = '5wg8n52rynpvj'
        AND s2.snap_id         = s1.snap_id
        AND s2.instance_number = s1.instance_number
        AND executions_delta   > 0
)
GROUP BY sql_id, plan_hash_value, seqnum1-seqnum2
ORDER BY 1,2

Output 1: -

SQL Id        Begin                          End                            Plan Hash Value      Execs Avg Exec Time  Avg L I/O Rows Processed
------------- ------------------------------ ------------------------------ --------------- ---------- ------------- ---------- --------------
5wg8n52rynpvj 04-MAY-21 12.00.09.132 AM      09-MAY-21 06.30.58.395 AM            748165266     156094    1.30704831  110135.57     1826328853
5wg8n52rynpvj 09-MAY-21 08.30.12.526 AM      11-MAY-21 04.30.31.848 AM           3874464562      42102    16.9870354 1481441.73      505289699
5wg8n52rynpvj 11-MAY-21 04.30.31.848 AM      23-MAY-21 06.30.03.959 AM            748165266     351686    .521392859 99585.4432     4102214059
5wg8n52rynpvj 23-MAY-21 08.30.13.248 PM      26-MAY-21 10.00.25.290 PM           3693159187     113123    16.0489399 1416706.63     1360240951
5wg8n52rynpvj 26-MAY-21 10.00.25.290 PM      26-MAY-21 10.15.36.640 PM            748165266        193    .652365228 158979.565        3477270
5wg8n52rynpvj 26-MAY-21 10.00.25.290 PM      26-MAY-21 10.15.36.640 PM           3693159187        172    15.2876871 1658063.17        1679077
5wg8n52rynpvj 26-MAY-21 10.15.36.640 PM      12-JUN-21 02.30.01.255 PM            748165266     472571     .84639311 110483.743     5755450592
5wg8n52rynpvj 12-JUN-21 08.15.39.473 PM      14-JUN-21 09.15.14.253 AM           2942889555       8613    5.45223841 418713.443      118219750
5wg8n52rynpvj 14-JUN-21 09.15.14.253 AM      18-JUN-21 11.30.22.019 AM            748165266     165678    .483631221 69569.0093     1915023703

SQL 2 : -

WITH x AS (
        SELECT sql_id,
               begin_interval_time,
               end_interval_time,
               plan_hash_value,
               NVL(executions_delta,0) executions_delta,
               (elapsed_time_delta/DECODE(NVL(executions_delta,0),0,1,executions_delta))/1000000 avg_exec_time,
               (buffer_gets_delta/DECODE(NVL(buffer_gets_delta,0),0,1,executions_delta))         avg_l_io,
               rows_processed_delta
        FROM  dba_hist_sqlstat s1, dba_hist_snapshot s2
        WHERE sql_id           = '5wg8n52rynpvj'
        AND s2.snap_id         = s1.snap_id
        AND s2.instance_number = s1.instance_number
        AND executions_delta   > 0
)
SELECT sql_id                   "SQL Id",
       min_interval_time        "Begin",
       max_interval_time        "End",
       plan_hash_value          "Plan Hash Value",
       sum_executions_delta     "Execs",
       avg_exec_time            "Avg Exec Time",
       avg_l_io                 "Avg L I/O",
       sum_rows_processed_delta "Rows Processed"
FROM x
MATCH_RECOGNIZE (
    ORDER BY begin_interval_time
    MEASURES sql_id                    AS sql_id,
             f.plan_hash_value         AS plan_hash_value,
             f.begin_interval_time     AS min_interval_time,
             LAST(begin_interval_time) AS max_interval_time,
             SUM(executions_delta)     AS sum_executions_delta,
             AVG(avg_exec_time)        AS avg_exec_time,
             AVG(avg_l_io)             AS avg_l_io,
             SUM(rows_processed_delta) AS sum_rows_processed_delta
    PATTERN  (f r*)
    DEFINE   r AS plan_hash_value = f.plan_hash_value
)
ORDER BY 1,2

Output 2 : -

SQL Id        Begin                          End                            Plan Hash Value      Execs Avg Exec Time  Avg L I/O Rows Processed
------------- ------------------------------ ------------------------------ --------------- ---------- ------------- ---------- --------------
5wg8n52rynpvj 04-MAY-21 12.00.09.132 AM      09-MAY-21 06.15.48.898 AM            748165266     156094    1.30704831  110135.57     1826328853
5wg8n52rynpvj 09-MAY-21 08.30.12.526 AM      11-MAY-21 04.15.21.707 AM           3874464562      42102    16.9870354 1481441.73      505289699
5wg8n52rynpvj 11-MAY-21 04.30.31.848 AM      23-MAY-21 06.15.55.101 AM            748165266     351686    .521392859 99585.4432     4102214059
5wg8n52rynpvj 23-MAY-21 08.30.13.248 PM      26-MAY-21 09.45.15.070 PM           3693159187     113123    16.0489399 1416706.63     1360240951
5wg8n52rynpvj 26-MAY-21 10.00.25.290 PM      26-MAY-21 10.00.25.290 PM            748165266        193    .652365228 158979.565        3477270
5wg8n52rynpvj 26-MAY-21 10.00.25.290 PM      26-MAY-21 10.00.25.290 PM           3693159187        172    15.2876871 1658063.17        1679077
5wg8n52rynpvj 26-MAY-21 10.15.36.640 PM      12-JUN-21 02.15.48.069 PM            748165266     472571     .84639311 110483.743     5755450592
5wg8n52rynpvj 12-JUN-21 08.15.39.473 PM      14-JUN-21 09.00.44.608 AM           2942889555       8613    5.45223841 418713.443      118219750
5wg8n52rynpvj 14-JUN-21 09.15.14.253 AM      18-JUN-21 11.15.08.574 AM            748165266     165678    .483631221 69569.0093     1915023703

Upvotes: 1

Views: 328

Answers (2)

user5683823
user5683823

Reputation:

with
  sample_data (dt, plan, value) as (
    select date '2021-06-01', 11111, 10 from dual union all
    select date '2021-06-02', 22222, 20 from dual union all
    select date '2021-06-03', 22222, 30 from dual union all
    select date '2021-06-04', 11111, 40 from dual
  )
select min_date, max_date, plan, sum_value
from   sample_data
match_recognize(
                 order by dt
                 measures f.plan     as plan,
                          f.dt       as min_date,
                          last(dt)   as max_date, 
                          sum(value) as sum_value
                 pattern  (f r*)
                 define   r as plan = f.plan
               )
;

MIN_DATE   MAX_DATE         PLAN  SUM_VALUE
---------- ---------- ---------- ----------
2021-06-01 2021-06-01      11111         10
2021-06-02 2021-06-03      22222         50
2021-06-04 2021-06-04      11111         40

The with clause is not part of the query; I included it for testing purposes only. Remove it, and use your actual table and column names in the select statement (main query).

Limitation - the match_recognize clause is only available since Oracle 12.1.

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1270793

This is a type of gaps-and-islands problem. The simplest method for your final result is the difference of row numbers:

select plan, min(date), max(date), sum(value)
from (select t.*,
             row_number() over (order by date) as seqnum,
             row_number() over (partition by plan order by date) as seqnum_2
      from t
     ) t
group by plan, (seqnum - seqnum_2);

Actually, if the dates are sequential with no gaps, a simpler version is:

select plan, min(date), max(date), sum(value)
from (select t.*,
             row_number() over (partition by plan order by date) as seqnum_2
      from t
     ) t
group by plan, (date - seqnum_2);

Upvotes: 1

Related Questions