Reputation: 1804
I have a table of matchups in different games, and I would like to calculate how dense the matchup space is for each game. Example table:
id | game  | start_dt
---+-------+-----------------
 1 | dota2 | 2020-01-01 15:00
 2 | dota2 | 2020-01-01 15:05
 3 | dota2 | 2020-01-01 18:00
 4 | cs-go | 2020-01-01 13:05
 5 | cs-go | 2020-01-01 13:15
 6 | dota2 | 2020-01-01 12:00
 7 | cs-go | 2020-01-01 14:45
Would ideally yield:
id | game  | start_dt         | time_group_id
---+-------+------------------+---------------
 6 | dota2 | 2020-01-01 12:00 | 1
 1 | dota2 | 2020-01-01 15:00 | 2
 2 | dota2 | 2020-01-01 15:05 | 2
 3 | dota2 | 2020-01-01 18:00 | 3
 4 | cs-go | 2020-01-01 13:05 | 4
 5 | cs-go | 2020-01-01 13:15 | 4
 7 | cs-go | 2020-01-01 14:45 | 5
This basically means that if the gap between a match and the previous match of the same game is less than or equal to 10 minutes, they are considered part of the same time group; otherwise a new time group starts. For example, matches 1 and 2 are only 5 minutes apart and share a group, while match 3 starts almost 3 hours later and opens a new one.
Those time_group_ids are then used to map useful information about matches and their time frequency.
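For context, this is roughly the kind of aggregation I want to run on top of those ids (just a sketch; grouped_fights is a hypothetical relation that already carries id, game, start_dt and time_group_id):
-- Sketch only: grouped_fights stands for the result of the grouping query,
-- e.g. wrapped in a CTE or a view.
SELECT
    game,
    time_group_id,
    COUNT(*)                      AS matches_in_group,
    MAX(start_dt) - MIN(start_dt) AS group_span
FROM grouped_fights
GROUP BY game, time_group_id
ORDER BY game, time_group_id;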
My code is below, and it does the job, but it doesn't produce evenly spaced ids, so I have to use a composite of the game VARCHAR and the group id for the field to uniquely represent a group. Please run it in the dbfiddle to see what I mean.
CREATE TABLE fight(
    id BIGSERIAL PRIMARY KEY,
    date TIMESTAMP NOT NULL,
    game VARCHAR NOT NULL
);
INSERT INTO fight(date, game)
VALUES
('2020-01-01 15:00'::TIMESTAMP, 'dota2'),
('2020-01-01 15:05'::TIMESTAMP, 'dota2'),
('2020-01-01 18:00'::TIMESTAMP, 'dota2'),
('2020-01-01 13:05'::TIMESTAMP, 'cs-go'),
('2020-01-01 13:15'::TIMESTAMP, 'cs-go'),
('2020-01-01 12:00'::TIMESTAMP, 'dota2'),
('2020-01-01 14:45'::TIMESTAMP, 'cs-go');
SELECT * FROM fight;
CREATE SEQUENCE seq START 1 CACHE 1;

SELECT
    a.id,
    a.game,
    a.start_dt,
    -- Reuse the current sequence value while the match is within 10 minutes of
    -- the previous one, otherwise draw a new value. The game name is appended
    -- because the same sequence value can end up shared by different games.
    (CASE WHEN (a.start_dt - INTERVAL '10 min' <= a.prev_start_dt) THEN currval('seq')
          ELSE nextval('seq')
     END)::VARCHAR || '|' || a.game AS time_group_id
FROM
(
    SELECT
        fight.id,
        fight.game,
        fight.date AS start_dt,
        LAG(fight.date, 1, fight.date) OVER (PARTITION BY fight.game ORDER BY fight.date) AS prev_start_dt
    FROM fight CROSS JOIN (SELECT setval('seq', 1)) s
) a
ORDER BY a.game, a.start_dt;
The question is: is there an ideal way to do this, or should I stick with what I have?
Upvotes: 1
Views: 122
Reputation: 1804
SELECT
    b.id,
    b.game,
    b.start_dt,
    SUM(b.time_group_count) OVER (ORDER BY b.game, b.start_dt) AS time_group_id
FROM
    (SELECT
         a.id,
         a.game,
         a.start_dt,
         -- 1 marks the start of a new group (first match of a game, or a gap
         -- longer than 10 minutes); 0 means the match joins the previous group.
         -- The running sum of these flags then yields the group id.
         CASE WHEN a.prev_start_dt IS NULL THEN 1
              WHEN (a.start_dt - INTERVAL '10 min' <= a.prev_start_dt) THEN 0
              ELSE 1
         END AS time_group_count
     FROM
         (
          SELECT
              fight.id,
              fight.game,
              fight.date AS start_dt,
              LAG(fight.date, 1) OVER (PARTITION BY fight.game ORDER BY fight.date) AS prev_start_dt
          FROM fight
         ) a
     ORDER BY a.game, a.start_dt) b;
This query is what gave me the results I really wanted. Really grateful for the cumulative sum idea by @Gordon Linoff, thank you!
Upvotes: 0
Reputation: 1270301
You don't need a sequence for this, just a cumulative sum:
SELECT f.*,
       COUNT(*) FILTER (WHERE prev_date < date - interval '10 min') OVER (ORDER BY date) as time_group_id
FROM (SELECT f.*,
             LAG(f.date) OVER (PARTITION BY f.game ORDER BY f.date) AS prev_date
      FROM fight f
     ) f;
Notes:
- This might start at 0 rather than 1. If that makes a difference, use 1 +.
- This produces a number, not a string. You can convert to a string (using ::text) if that is what you really need.
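Putting both notes together, a sketch of the adjusted query might look like this (same logic, just offset by 1 and cast to text):
-- Sketch: the same query, starting the group ids at 1 and returning text.
SELECT f.*,
       (1 + COUNT(*) FILTER (WHERE prev_date < date - interval '10 min') OVER (ORDER BY date))::text AS time_group_id
FROM (SELECT f.*,
             LAG(f.date) OVER (PARTITION BY f.game ORDER BY f.date) AS prev_date
      FROM fight f
     ) f;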
Here is a db<>fiddle
Upvotes: 1