Bryce
Bryce

Reputation: 447

How to get # of times in a row an item occurred in SQL

I am looking to figure out the correct way to build this statement in SQL (oracle). What I have is a columns in a table that consists of "Date" and "Direction". In the direction column it either is a "North" or "South". I am trying to figure out how to output the # of times in a row before the row occurred (like a Win or Loss Streak) did "North" or "South appear". For example:

SELECT date, direction, ? As Streak FROM temptable order by date ASC;

Sample dataset of 8 rows:

  1. 2022-01-01, North
  2. 2022-01-02, North
  3. 2022-01-03, North
  4. 2022-01-04, North
  5. 2022-01-05, South
  6. 2022-01-06, South
  7. 2022-01-07, South
  8. 2022-01-08, North

The results would be:

date, Direction, Streak

  1. 2022-01-01, North, 0;
  2. 2022-01-02, North, 1;
  3. 2022-01-03, North, 2;
  4. 2022-01-04, North, 3;
  5. 2022-01-05, South, 0;
  6. 2022-01-06, South, 1;
  7. 2022-01-07, South, 2;
  8. 2022-01-08, North, 0;

Upvotes: 0

Views: 116

Answers (2)

Chris Saxon
Chris Saxon

Reputation: 9825

You can do this with SQL Pattern Matching:

with rws as (
  select date'2022-01-01' dt, 'North' dir from dual union all
  select date'2022-01-02' dt, 'North' dir from dual union all
  select date'2022-01-03' dt, 'North' dir from dual union all
  select date'2022-01-04' dt, 'North' dir from dual union all
  select date'2022-01-05' dt, 'South' dir from dual union all
  select date'2022-01-06' dt, 'South' dir from dual union all
  select date'2022-01-07' dt, 'South' dir from dual union all
  select date'2022-01-08' dt, 'North' dir from dual
)
  select * from rws match_recognize (
    order by dt 
    measures 
      count(*) - 1 as streak
    all rows per match
    pattern ( init same* )
    define
      same as dir = init.dir
  );

DT              STREAK DIR  
----------- ---------- -----
01-JAN-2022          0 North
02-JAN-2022          1 North
03-JAN-2022          2 North
04-JAN-2022          3 North
05-JAN-2022          0 South
06-JAN-2022          1 South
07-JAN-2022          2 South
08-JAN-2022          0 North

The pattern is a regular expression using the variables listed in the define clause.

So it's searching for:

  • Any row (init) - this is undefined so "always true"
  • Followed by zero or more same variables
  • These are rows where the value for dir matches that of the init "always true" row

The measures clause defines output columns. count(*) returns the running total of rows matched in the group so far.

The all rows per match clause is necessary because by default match_recognize returns one row per group. So you'd only get one row in the output for each streak.

Upvotes: 1

Stu
Stu

Reputation: 32609

Not specifically Oracle but this is a standard approach using row_number to identify sequential groups, then to count within those groups:

with g as (
    select *, 
      Row_Number() over(order by date) 
       - Row_Number() over(partition by direction order by date) gn
    from t
)
select *, 
 Row_Number() over(partition by gn order by date) - 1 as Streak
from g
order by date;

Upvotes: 2

Related Questions