Reputation: 85
I have a timetable in SQL Server that has the [SERV_ID]
(service-id), [STATION]
(station), [ARR]
(arrivaltime), [DEP]
(departuretime) of a public transport vehicle. Every Service can be present every day [SERV_DAY]
.
Target is to summarize Serviceday, Service-line, First-station, Last-station, and the corresponding timestamps. --> One row per service per day.
For [SERV_ID]
N170 this would be:
SERV_DAY SERV_ID FIRST_STATION MIN_DEP LAST_STATION MAX_ARR
2019-08-14 00:00:00 N170 Downtown 2019-08-14 06:06:00 CentralStation 2019-08-14 07:11:00
I tried to do this by partinioning thru ([SERV_DAY], [SERV_ID])
an then get MAX([ARR])
and MIN([DEP])
for each partition. This works so long, but now I want to get the corresponding Station to each Min and Max.
SELECT
[SERV_DAY],[SERV_ID],
MAX([ARR]) OVER(PARTITION BY [SERV_DAY],[SERV_ID]) AS MAX_ARR,
MIN([DEP]) OVER(PARTITION BY [SERV_DAY],[SERV_ID]) AS MIN_DEP
FROM #demo
Later I need to add the delay at the last station, which is available in an extended version of the dataset as [ARR_EFFECTIVE]
and [DEP_EFFECTIVE]
. Hopefully I will be able to do add these attributes as soon as I know how to summarize the daily lines as described above.
This topic is close but I do not get how to adapt the "gap & island problem" Min() and Max() based on partition in sql server
I have set up a demo dataset in dbfiddle https://dbfiddle.uk/?rdbms=sqlserver_2016&fiddle=52e53d43a49ddb8f67454e576bfa7d74
Can anyone help me to finalize the query?
Upvotes: 1
Views: 565
Reputation: 24593
SELECT
[SERV_DAY]
,[SERV_ID],
FIRST_VALUE(STATION) over (Partition by [SERV_DAY],[SERV_ID] Order by ARR DESC) Station1
, FIRST_VALUE(STATION) over (Partition by [SERV_DAY],[SERV_ID] Order by DEP ASC) Station2
FROM #demo
Upvotes: 1
Reputation: 2300
I think I would use a temp table instead of a CTE if you have a large amount of data, but here is a quick idea on how that should work:
WITH CTE AS
(
SELECT *
, ROW_NUMBER() OVER(PARTITION BY [SERV_DAY],[SERV_ID] ORDER BY ARR ) RN
, ROW_NUMBER() OVER(PARTITION BY [SERV_DAY],[SERV_ID] ORDER BY DEP ) RN2
from #demo
)
SELECT t1.[SERV_DAY],t1.[SERV_ID],t1.[STATION] FIRST_STATION, t1.[DEP] MIN_DEP, t2.STATION LAST_STATION
FROM CTE t1
INNER JOIN CTE t2 on t1.SERV_DAY = t2.SERV_DAY and t1.SERV_ID = t2.SERV_ID and t2.RN2 = 1
WHERE t1.RN = 1
Upvotes: 0
Reputation: 85
As reply to @casenonsensitive it works using his code and a little modification!
WITH T AS (
SELECT
[SERV_DAY], [SERV_ID], [STATION],
MAX([ARR]) OVER(PARTITION BY [SERV_DAY],[SERV_ID]) AS MAX_ARR,
MIN([DEP]) OVER(PARTITION BY [SERV_DAY],[SERV_ID]) AS MIN_DEP,
ROW_NUMBER() OVER(PARTITION BY [SERV_DAY],[SERV_ID] ORDER BY [ARR] ) AS RN_ARR,
ROW_NUMBER() OVER(PARTITION BY [SERV_DAY],[SERV_ID] ORDER BY [DEP] ) AS RN_DEP
FROM #demo
)
SELECT MAX(CASE WHEN RN_ARR = 1 THEN [STATION] END) MIN_DEP_STATION,
MAX(CASE WHEN RN_DEP = 1 THEN [STATION] END) MAX_ARR_STATION, [SERV_DAY], [SERV_ID], MAX_ARR, MIN_DEP from T
group by [SERV_DAY], [SERV_ID], MIN_DEP, MAX_ARR
Upvotes: 0
Reputation: 950
You can do that in two steps:
first add a row_number sorted by ARR
descending and another row_number sorted by dep
. Then you're able to filter on the rows with row_number
= 1 in order to select other columns.
Here's an example how to retrieve the station of the max_arr and the min_dep:
WITH T AS (
SELECT
[SERV_DAY], [SERV_ID],
MAX([ARR]) OVER(PARTITION BY [SERV_DAY],[SERV_ID]) AS MAX_ARR,
MIN([DEP]) OVER(PARTITION BY [SERV_DAY],[SERV_ID]) AS MIN_DEP,
ROW_NUMBER() OVER(PARTITION BY [SERV_DAY],[SERV_ID] ORDER BY [ARR] DESC) AS RN_ARR,
ROW_NUMBER() OVER(PARTITION BY [SERV_DAY],[SERV_ID] ORDER BY [DEP]) AS RN_DEP,
*
FROM #demo
)
SELECT MAX(CASE WHEN RN_ARR = 1 THEN [STATION] END) MAX_ARR_STATION,
MAX(CASE WHEN RN_DEP = 1 THEN [STATION] END) MIN_DEP_STATION,
*
FROM T
Upvotes: 0