Reputation: 75
I'm using PostgreSQL 8.0 and I have a table with the columns user_id, timestamp, and event_id.
How can I return the row (or rows) that come after the 4th occurrence of event_id = someID for each user?
| user_id | timestamp        | event_id |
|---------|------------------|----------|
| 1       | 2020-04-02 12:00 | 11       |
| 2       | 2020-04-02 13:00 | 11       |
| 2       | 2020-04-02 14:00 | 99       |
| 2       | 2020-04-02 15:00 | 11       |
| 2       | 2020-04-02 16:00 | 11       |
| 2       | 2020-04-02 17:00 | 11       |
| 2       | 2020-04-02 17:00 | 11       |
I.e., if event_id = 11, I would only want the last row in the table above.
Upvotes: 1
Views: 548
Reputation: 75
Sorry to be asking about such an old version of Postgres. Here is an answer that worked:
WITH EventOrdered AS (
    SELECT
        EventTypeId
      , UserId
      , Timestamp
      , ROW_NUMBER() OVER (PARTITION BY EventTypeId, UserId ORDER BY Timestamp ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) ROW_NO
    FROM Event
),
FourthEvent AS (
    SELECT DISTINCT
        UserId
      , FIRST_VALUE(Timestamp) OVER (PARTITION BY UserId ORDER BY Timestamp) FirstFourthEventTimestamp
    FROM EventOrdered
    WHERE ROW_NO = 4
)
SELECT e.*
FROM Event e
JOIN FourthEvent ffe
  ON e.UserId = ffe.UserId
 AND e.Timestamp > ffe.FirstFourthEventTimestamp
ORDER BY e.UserId, e.Timestamp
Upvotes: 0
Reputation: 1269873
You can use a cumulative count. This version includes the 4th occurrence:
select t.*
from (select t.*,
             count(*) filter (where event_id = 11) over (partition by user_id order by timestamp) as event_11_cnt
      from t
     ) t
where event_11_cnt >= 4;
The filter clause has been valid Postgres syntax for a long time (since 9.4), but if your version does not support it, you can use this instead:
select t.*
from (select t.*,
             sum( (event_id = 11)::int ) over (partition by user_id order by timestamp) as event_11_cnt
      from t
     ) t
where event_11_cnt >= 4;
To exclude the 4th occurrence itself, replace the where clause with:
where event_11_cnt > 4 or (event_11_cnt = 4 and event_id <> 11)
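To see how the running count behaves on the sample data from the question, here is a minimal self-contained sketch. It assumes a server with CTE and window-function support (PostgreSQL 8.4 or later, which the queries above need anyway). The inline values list and the column names cnt_range and cnt_rows are made up for illustration only.

```sql
-- Sample rows copied from the question; the inline VALUES list is an
-- assumption so that the snippet runs without an existing table.
with t (user_id, timestamp, event_id) as (
    values
        (1, timestamp '2020-04-02 12:00', 11),
        (2, timestamp '2020-04-02 13:00', 11),
        (2, timestamp '2020-04-02 14:00', 99),
        (2, timestamp '2020-04-02 15:00', 11),
        (2, timestamp '2020-04-02 16:00', 11),
        (2, timestamp '2020-04-02 17:00', 11),
        (2, timestamp '2020-04-02 17:00', 11)
)
select t.*,
       -- Default frame: rows with the same timestamp are peers and are
       -- counted together.
       sum(case when event_id = 11 then 1 else 0 end)
           over (partition by user_id order by timestamp) as cnt_range,
       -- Explicit ROWS frame: the count advances one row at a time.
       sum(case when event_id = 11 then 1 else 0 end)
           over (partition by user_id order by timestamp
                 rows between unbounded preceding and current row) as cnt_rows
from t
order by user_id, timestamp;
```

With the default frame, both 17:00 rows for user 2 get cnt_range = 5, so either where clause above returns both of them. With the explicit rows frame, one of those rows gets 4 and the other 5. If timestamps can tie, the rows frame (ideally with a tiebreaker column in the order by) may be closer to what the question asks for.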
An alternative method:
select t.*
from t
where t.timestamp > (select t2.timestamp
                     from t t2
                     where t2.user_id = t.user_id and
                           t2.event_id = 11
                     order by t2.timestamp
                     limit 1 offset 3
                    );
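To check what the subquery picks as the cut-off for each user, it can be run on its own. This is only a sketch: the inline values list (copied from the question) and the column name fourth_event_11_at are assumptions for illustration, and it assumes a server with CTE support.

```sql
-- Sample rows copied from the question, inlined so the snippet is
-- self-contained.
with t (user_id, timestamp, event_id) as (
    values
        (1, timestamp '2020-04-02 12:00', 11),
        (2, timestamp '2020-04-02 13:00', 11),
        (2, timestamp '2020-04-02 14:00', 99),
        (2, timestamp '2020-04-02 15:00', 11),
        (2, timestamp '2020-04-02 16:00', 11),
        (2, timestamp '2020-04-02 17:00', 11),
        (2, timestamp '2020-04-02 17:00', 11)
)
select u.user_id,
       -- Skip the first three matching rows and take the fourth;
       -- yields NULL when a user has fewer than four matches.
       (select t2.timestamp
        from t t2
        where t2.user_id = u.user_id
          and t2.event_id = 11
        order by t2.timestamp
        limit 1 offset 3) as fourth_event_11_at
from (select distinct user_id from t) u
order by u.user_id;
```

On the question's sample data this gives NULL for user 1 and 2020-04-02 17:00 for user 2. Because the outer query compares with a strict >, a row that shares the 4th occurrence's timestamp is not returned, and users with a NULL cut-off return no rows at all.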
Upvotes: 0
Reputation: 222482
You can use window functions:
select *
from (
    select t.*, row_number() over(partition by user_id, event_id order by timestamp) rn
    from mytable t
) t
where rn > 4
Here is a little trick that removes the row number from the result:
select (t).*
from (
    select t, row_number() over(partition by user_id, event_id order by timestamp) rn
    from mytable t
) x
where rn > 4
Upvotes: 0