Reputation: 453
I have a table with many IDs and many dates associated with each ID, and even a few IDs with no date. For each ID and date combination, I want to select the ID, date, and the next largest date also associated with that same ID, or null as next date if none exists.
Sample Table:
ID Date
1 5/1/10
1 6/1/10
1 7/1/10
2 6/15/10
3 8/15/10
3 8/15/10
4 4/1/10
4 4/15/10
4
Desired Output:
ID Date Next_Date
1 5/1/10 6/1/10
1 6/1/10 7/1/10
1 7/1/10
2 6/15/10
3 8/15/10
3 8/15/10
4 4/1/10 4/15/10
4 4/15/10
Upvotes: 13
Views: 33959
Reputation: 1637
I think self JOIN
would be faster than subselect.
WITH dates AS (
SELECT 1 AS ID, '2010-05-01' AS Date
UNION ALL SELECT 1, '2010-06-01'
UNION ALL SELECT 1, '2010-07-01'
UNION ALL SELECT 2, '2010-06-15'
UNION ALL SELECT 3, '2010-08-15'
UNION ALL SELECT 3, '2010-08-15'
UNION ALL SELECT 4, '2010-04-01'
UNION ALL SELECT 4, '2010-04-15'
UNION ALL SELECT 4, ''
)
SELECT
dates.ID,
dates.Date,
nextDates.Date AS Next_Date
FROM
dates
LEFT JOIN
dates nextDates
ON nextDates.ID = dates.ID
AND nextDates.Date > dates.Date
LEFT JOIN
dates noLower
ON noLower.ID = nextDates.ID
AND noLower.Date < nextDates.Date
AND noLower.Date > dates.Date
WHERE
dates.Date > 0
AND noLower.ID IS NULL
https://www.db-fiddle.com/f/4sWRLt2hxjik5HqiJ21ez8/1
Upvotes: 0
Reputation: 34187
SELECT
mytable.id,
mytable.date,
(
SELECT
MIN(mytablemin.date)
FROM mytable AS mytablemin
WHERE mytablemin.date > mytable.date
AND mytable.id = mytablemin.id
) AS NextDate
FROM mytable
This has been tested on SQL Server 2008 R2 (but it should work on other DBMSs) and produces the following output:
id date NextDate ----------- ----------------------- ----------------------- 1 2010-05-01 00:00:00.000 2010-06-01 00:00:00.000 1 2010-06-01 00:00:00.000 2010-06-15 00:00:00.000 1 2010-07-01 00:00:00.000 2010-08-15 00:00:00.000 2 2010-06-15 00:00:00.000 2010-07-01 00:00:00.000 3 2010-08-15 00:00:00.000 NULL 3 2010-08-15 00:00:00.000 NULL 4 2010-04-01 00:00:00.000 2010-04-15 00:00:00.000 4 2010-04-15 00:00:00.000 2010-05-01 00:00:00.000 4 NULL NULL
Update 1: For those that are interested, I've compared the performance of the two variants in SQL Server 2008 R2 (one uses MIN aggregate and the other uses TOP 1 with an ORDER BY):
Without an index on the date column, the MIN version had a cost of 0.0187916 and the TOP/ORDER BY version had a cost of 0.115073 so the MIN version was "better".
With an index on the date column, they performed identically.
Note that this was testing with just these 9 records so the results could be (very) spurious...
Update 2: The results hold for 10,000 uniformly distributed random records. The TOP/ORDER BY query takes so long to run at 100,000 records I had to cancel it and give up.
Upvotes: 21
Reputation: 1042
SELECT
id,
date,
( SELECT date
FROM table t1
WHERE t1.date > t2.date
ORDER BY t1.date LIMIT 1 )
FROM table t2
Upvotes: 1
Reputation: 6866
If your db is oracle, you can use lead() and lag()
functions.
SELECT id, date,
LEAD(date, 1, 0) OVER (PARTITION BY ID ORDER BY Date DESC NULLS LAST) NEXT_DATE,
FROM Your_table
ORDER BY ID;
Upvotes: 1