Reputation: 4405
I have an Oracle a table that has 3 columns. FILE_NUM, START_DATE, END_DATE. FILE_NUM is a unique identifier and so each row is unique.
FILE_NUM | START_DATE | END_DATE |
---|---|---|
1111 | Apr. 04, 1977 | Jun. 30, 1984 |
1112 | Jan. 03, 1987 | Sep. 05, 2010 |
1113 | May. 14, 1962 | Jan. 30, 1998 |
1114 | Oct. 09, 2001 | Aug. 22, 2025 |
1113 | May. 14, 1962 | Jan. 30, 1998 |
1114 | Nov. 18, 2003 | Aug. 22, 2028 |
etc |
I want to query a count of files that existed per decade. Something like:
Decade | Count |
---|---|
1960-1969 | 10 |
1970-1979 | 25 |
1980-1989 | 42 |
The logic is, if the file existed at any point in any decade, count it in each of those decades. So for file_num 1112, it should get counted in the 1980's, 1990's, 2000's, and 2010's.
I don't even know if this is possible. I've tinkered around in my query a few times, but I'm not sure I know enough functions or technique to get this sort of result.
Thanks for any help.
EDIT:
I should clarify how the counts will be made. I need to approach the date ranges in 3 ways in order to capture which file_nums existed within each decade.
I crafted up the following SQL. I am still trying to figure out if it's doing what I want it to do. If anyone has suggestion
with decades as
(
select
concat(concat(floor(extract(year from t.start_date)/10)*10, '-'), floor(extract(year from t.start_date)/10)*10 + 9) DECADE
from
table t
group by
concat(concat(floor(extract(year from t.start_date)/10)*10, '-'), floor(extract(year from t.start_date)/10)*10 + 9)
)
select
count(a.file_num),
decades.decade
from
table a,
decades
where
extract(year from a.start_date) between extract(year from to_date(SUBSTR(decades.decade,0,4), 'YYYY')) and extract(year from to_date(SUBSTR(decades.decade,6,9), 'YYYY'))
or
extract(year from a.end_date) between extract(year from to_date(SUBSTR(decades.decade,0,4), 'YYYY')) and extract(year from to_date(SUBSTR(decades.decade,6,9), 'YYYY'))
or
(extract(year from a.start_date) < extract(year from to_date(SUBSTR(decades.decade,0,4), 'YYYY'))
and
extract(year from a.end_date) > extract(year from to_date(SUBSTR(decades.decade,6,9), 'YYYY'))
)
group by
decades.decade
order by
decades.decade
Upvotes: 1
Views: 622
Reputation: 1555
You can also take advantage of the connect by clause to do that. I assume you don't have any duplicate rows (these columns FILE_NUM, START_DATE, END_DATE should be unique) in your real data.
SELECT dec_start, dec_end, COUNT(*) nb
FROM (
SELECT t.*, level
, 10 * trunc( extract ( year from (START_DATE) ) / 10 ) + 10 * LEVEL - 10 dec_start
, 10 * trunc( extract ( year from (START_DATE) ) / 10 ) + 10 * LEVEL - 1 dec_end
FROM YourTable T
CONNECT BY
10 * TRUNC( EXTRACT ( YEAR FROM (START_DATE) ) / 10 ) + 10 * LEVEL - 10
< 10 * CEIL( EXTRACT ( YEAR FROM (END_DATE) ) / 10 )
AND PRIOR FILE_NUM = FILE_NUM
AND PRIOR START_DATE = START_DATE
AND PRIOR END_DATE = END_DATE
AND PRIOR SYS_GUID() IS NOT NULL
)
group by dec_start, dec_end
order by dec_start, dec_end
;
Upvotes: 2
Reputation: 48139
You should be able to pull something together by casting the year of the date divided by 10 to an integer, then multiplying that by 10. Example...
2019 / 10 = 201.9 cast as integer = 201 * 10 = 2010
Then, add 9 years to above for the ending year
2019 + 10 = 2029 / 10 = 202.9 cast as integer = 202 * 10 = 2020 + 9 = 2029
So, all you should care about is the beginning date in question... something like
select
concat( cast( EXTRACT(YEAR FROM m.start_date ) / 10 as int ) * 10,
concat( '-',
cast( EXTRACT(YEAR FROM m.start_date ) / 10 as int ) * 10 + 9)),
count(*) as NumRecs
from
table m
group by 1
Upvotes: 1
Reputation: 212
best option could be have a decade dimension table:
And do join with main table:
select d.decade,
count(file_num) as count
from table m
join decade d
on m.start_date >= d.decade_start_date
join decade d2
on m.end_date <= d2.decade_end_date
where 1 = 1
and d.decade = d2.decade
group by 1
Upvotes: 0