gudvinr
gudvinr

Reputation: 124

Count number of sequential records based on condition

I have a table that has number of rows that originally sorted by timestamp:

+----+------------+-----+
| id | date       | foo |
+----+------------+-----+
| 1  | 2017-12-28 | abc |
+----+------------+-----+
| 1  | 2017-12-27 | abc |
+----+------------+-----+
| 2  | 2017-12-27 | xyz |
+----+------------+-----+
| 2  | 2017-12-26 | xyz |
+----+------------+-----+
| 2  | 2017-12-25 | abc |
+----+------------+-----+
| 1  | 2017-12-25 | abc |
+----+------------+-----+
| 2  | 2017-12-25 | abc |
+----+------------+-----+

And I want to get number of sequential records with same foo for different id like that:

+----+-----+-------+
| id | foo | count |
+----+-----+-------+
| 1  | abc | 2     |
+----+-----+-------+
| 2  | xyz | 2     |
+----+-----+-------+
| 2  | abc | 1     |
+----+-----+-------+
| 1  | abc | 1     |
+----+-----+-------+
| 2  | abc | 1     |
+----+-----+-------+

So, here is sqlfiddle with built schema.

Window functions look like a key for such kind of problems but it didn't worked out well in the way I used.

I'll be glad to get any help or at least some hints that can be useful. There is some questions on MySQL related to this but they aren't very helpful though.

Upvotes: 0

Views: 178

Answers (1)

Kaushik Nayak
Kaushik Nayak

Reputation: 31648

First of all, thanks a lot for your sqlfiddle.

Use the standard approach(Tabibitosan) for gaps and islands problem using row_number()

SQL Fiddle

PostgreSQL 9.6 Schema Setup:

create table bar (
  id   bigint not null,
  date timestamp without time zone,
  foo  text
);

insert into bar (id, date, foo) values
  (1, '2017-12-28 17:54:02', 'abc'),
  (1, '2017-12-28 17:53:30', 'abc'),
  (2, '2017-12-28 17:50:13', 'xyz'),
  (2, '2017-12-28 17:44:35', 'xyz'),
  (2, '2017-12-28 17:30:00', 'abc'),
  (1, '2017-12-28 17:25:15', 'abc'),
  (2, '2017-12-28 17:21:20', 'abc');

Query 1:

SELECT MAX (id) AS id,
         foo,
         COUNT (*) AS "count"
    FROM (SELECT b.*,
                   ROW_NUMBER () OVER (ORDER BY date DESC)
                 - ROW_NUMBER () OVER (PARTITION BY id ORDER BY date DESC)
                    seq
            FROM bar b) t
GROUP BY foo, seq, id
ORDER BY MAX(DATE) DESC

Results:

| id | foo | count |
|----|-----|-------|
|  1 | abc |     2 |
|  2 | xyz |     2 |
|  2 | abc |     1 |
|  1 | abc |     1 |
|  2 | abc |     1 |

Upvotes: 1

Related Questions