Reputation: 504
I have a table with data such as below
Group Start Date End Date
A 01/01/01 01/03/01
A 01/01/01 01/02/01
A 01/03/01 01/04/01
B 01/01/01 01/01/01
ETC
I am looking to produce a view that gives a count for each day, like this
Group Date Count
A 01/01/01 2
A 01/02/01 2
A 01/03/01 2
A 01/04/01 1
B 01/01/01 1
I am using Oracle 9 and am at a total loss on what how to handle this and am looking for any idea to get me started.
Note: Generating a table to hold the dates is not practical because I final product has to break down to the minute.
Upvotes: 2
Views: 9802
Reputation: 425803
WITH q AS
(
SELECT (
SELECT MIN(start_date)
FROM mytable
) + level - 1 AS mydate
FROM dual
CONNECT BY
level <= (
SELECT MAX(end_date) - MIN(start_date)
FROM mytable
)
)
SELECT group, mydate,
(
SELECT COUNT(*)
FROM mytable mi
WHERE mi.group = mo.group
AND q BETWEEN mi.start_date AND mi.end_date
)
FROM q
CROSS JOIN
(
SELECT DISTINCT group
FROM mytable
) mo
Update:
A better and faster query making use of analytic functions.
The main idea is that the number of ranges containing each date is the difference before the count of ranges started before that date and the count of ranges that ended before it.
SELECT cur_date,
grouper,
SUM(COALESCE(scnt, 0) - COALESCE(ecnt, 0)) OVER (PARTITION BY grouper ORDER BY cur_date) AS ranges
FROM (
SELECT (
SELECT MIN(start_date)
FROM t_range
) + level - 1 AS cur_date
FROM dual
CONNECT BY
level <=
(
SELECT MAX(end_date)
FROM t_range
) -
(
SELECT MIN(start_date)
FROM t_range
) + 1
) dates
CROSS JOIN
(
SELECT DISTINCT grouper AS grouper
FROM t_range
) groups
LEFT JOIN
(
SELECT grouper AS sgrp, start_date, COUNT(*) AS scnt
FROM t_range
GROUP BY
grouper, start_date
) starts
ON sgrp = grouper
AND start_date = cur_date
LEFT JOIN
(
SELECT grouper AS egrp, end_date, COUNT(*) AS ecnt
FROM t_range
GROUP BY
grouper, end_date
) ends
ON egrp = grouper
AND end_date = cur_date - 1
ORDER BY
grouper, cur_date
This query completes in 1
second on 1,000,000
rows.
See this entry in my blog for more detail:
Upvotes: 3
Reputation: 67792
You could use the method described in these SO:
Basically: join with a generated calendar and GROUP BY your subset of columns.
SQL> WITH DATA AS (
2 SELECT 'A' grp, to_date('01/01/01') start_date, to_date('01/03/01') end_date FROM DUAL
3 UNION ALL SELECT 'A', to_date('01/01/01'), to_date('01/02/01') FROM DUAL
4 UNION ALL SELECT 'A', to_date('01/03/01'), to_date('01/04/01') FROM DUAL
5 UNION ALL SELECT 'B', to_date('01/01/01'), to_date('01/01/01') FROM DUAL
6 ), calendar AS (
7 SELECT to_date('01/01/01') + ROWNUM - 1 d
8 FROM dual
9 CONNECT BY LEVEL <= to_date('01/04/01') - to_date('01/01/01') + 1
10 )
11 SELECT data.grp, calendar.d, COUNT(*) cnt
12 FROM data
13 JOIN calendar ON calendar.d BETWEEN data.start_date AND data.end_date
14 GROUP BY data.grp, calendar.d;
GRP D CNT
--- ----------- ----------
A 04/01/2001 1
A 02/01/2001 2
B 01/01/2001 1
A 03/01/2001 2
A 01/01/2001 2
Upvotes: 2
Reputation: 89741
Typically I solve this kind of problem with a numbers table:
WITH Dates AS (
SELECT DateAdd(d, Numbers.Number - 1, '1/1/2001') AS Date
FROM Numbers
WHERE Numbers.Number BETWEEN 1 AND 100000 -- Arbitrary date range
)
SELECT GroupData.Group, Dates.Date, COUNT(*)
FROM Dates
LEFT JOIN GroupData
ON Dates.Date BETWEEN GroupData.StartDate AND GroupData.EndDate
GROUP BY GroupData.Group, Dates.Date
ORDER BY GroupData.Group, Dates.Date
Upvotes: 1