Dominik
Dominik

Reputation: 307

Time window in PostgreSQL

I am new to PostgreSQL (specifically, I use Timescale db) and have a question regarding time windows.

Data:

date      |customerid|names   
2014-01-01|1         |Andrew 
2014-01-02|2         |Pete   
2014-01-03|2         |Andrew 
2014-01-04|2         |Steve  
2014-01-05|2         |Stef   
2014-01-06|3         |Stef  
2014-01-07|1         |Jason 
2014-01-08|1         |Jason 

The question is: Going back in time x days (viewed from every single row), how many distinct names are there which share the same id?

For x=2 days, the result should look like this:

date      |customerid|names  |count 
2014-01-01|1         |Andrew |1 
2014-01-02|2         |Pete   |1 
2014-01-03|2         |Andrew |2 
2014-01-04|2         |Steve  |3 
2014-01-05|2         |Stef   |3 
2014-01-06|3         |Stef   |1
2014-01-07|1         |Jason  |1
2014-01-08|1         |Jason  |1  

Is this possible in PostgreSQL without using a loop over each single row?

Additional information: The time intervals of the data are not equidistant in reality.

Thank you very much!

Upvotes: 4

Views: 2339

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269445

It would be nice if you could use window functions:

select t.*,
       count(distinct name) over (partition by id
                                  order by date
                                  range between interval 'x day' preceding and current row
                                 ) as cnt_x
from t;

Alas, that is not possible. So you can use a lateral join:

select t.*, tt.cnt_x
from t left join lateral
     (select count(distinct t2.name) as cnt_x
      from t t2
      where t2.id = t.id and
             t2.date >= t.date - interval 'x day' and t2.date <= t.date
     ) tt
     on true;

For performance, you want an index on (id, date, name).

Upvotes: 6

Related Questions