How to get # of times in a row an item occurred in SQL

Question

I am looking to figure out the correct way to build this statement in SQL (oracle). What I have is a columns in a table that consists of "Date" and "Direction". In the direction column it either is a "North" or "South". I am trying to figure out how to output the # of times in a row before the row occurred (like a Win or Loss Streak) did "North" or "South appear". For example:

SELECT date, direction, ? As Streak FROM temptable order by date ASC;

Sample dataset of 8 rows:

2022-01-01, North
2022-01-02, North
2022-01-03, North
2022-01-04, North
2022-01-05, South
2022-01-06, South
2022-01-07, South
2022-01-08, North

The results would be:

date, Direction, Streak

2022-01-01, North, 0;
2022-01-02, North, 1;
2022-01-03, North, 2;
2022-01-04, North, 3;
2022-01-05, South, 0;
2022-01-06, South, 1;
2022-01-07, South, 2;
2022-01-08, North, 0;

Chris Saxon · Accepted Answer

You can do this with SQL Pattern Matching:

with rws as (
  select date'2022-01-01' dt, 'North' dir from dual union all
  select date'2022-01-02' dt, 'North' dir from dual union all
  select date'2022-01-03' dt, 'North' dir from dual union all
  select date'2022-01-04' dt, 'North' dir from dual union all
  select date'2022-01-05' dt, 'South' dir from dual union all
  select date'2022-01-06' dt, 'South' dir from dual union all
  select date'2022-01-07' dt, 'South' dir from dual union all
  select date'2022-01-08' dt, 'North' dir from dual
)
  select * from rws match_recognize (
    order by dt 
    measures 
      count(*) - 1 as streak
    all rows per match
    pattern ( init same* )
    define
      same as dir = init.dir
  );

DT              STREAK DIR  
----------- ---------- -----
01-JAN-2022          0 North
02-JAN-2022          1 North
03-JAN-2022          2 North
04-JAN-2022          3 North
05-JAN-2022          0 South
06-JAN-2022          1 South
07-JAN-2022          2 South
08-JAN-2022          0 North

The pattern is a regular expression using the variables listed in the define clause.

So it's searching for:

Any row (init) - this is undefined so "always true"
Followed by zero or more same variables
These are rows where the value for dir matches that of the init "always true" row

The measures clause defines output columns. count(*) returns the running total of rows matched in the group so far.

The all rows per match clause is necessary because by default match_recognize returns one row per group. So you'd only get one row in the output for each streak.

How to get # of times in a row an item occurred in SQL

Answers (2)

Related Questions