Reputation: 353
Subway data Columns:
Occupancy at current station = Number of people already in the train at previous station + Number of people boarded - Number of people deboarded
I am trying to fill the occupancy column. The issue is that the dataset is for multiple subway trains so station_id changes back to 1 and on that station number of people deboarded is always 0 since it is the station from which train journey begins. I have got no clue on how to do this in postgresql. The occupancy column in the sample image below is empty and needs to be filled
The train journeys are sorted and grouped.
Upvotes: 0
Views: 939
Reputation: 1269483
You can do this with the difference of the cumulative sums. The trick is identifying the groups, which I'll do by counting the number of times that station_id
has been 1
up to that record.
select s.*,
(sum(boarded) over (partition by grp order by id) -
sum(deboarded) over (partition by grp order by id)
) as occupants
from (select s.*,
count(*) filter (where station_id = 1) over (order by id) as grp
from subwaydata s
) s;
Upvotes: 3