Mike
Mike

Reputation: 4405

SQL to Count Records that Existed by Decade

I have an Oracle a table that has 3 columns. FILE_NUM, START_DATE, END_DATE. FILE_NUM is a unique identifier and so each row is unique.

FILE_NUM START_DATE END_DATE
1111 Apr. 04, 1977 Jun. 30, 1984
1112 Jan. 03, 1987 Sep. 05, 2010
1113 May. 14, 1962 Jan. 30, 1998
1114 Oct. 09, 2001 Aug. 22, 2025
1113 May. 14, 1962 Jan. 30, 1998
1114 Nov. 18, 2003 Aug. 22, 2028
etc

I want to query a count of files that existed per decade. Something like:

Decade Count
1960-1969 10
1970-1979 25
1980-1989 42

The logic is, if the file existed at any point in any decade, count it in each of those decades. So for file_num 1112, it should get counted in the 1980's, 1990's, 2000's, and 2010's.

I don't even know if this is possible. I've tinkered around in my query a few times, but I'm not sure I know enough functions or technique to get this sort of result.

Thanks for any help.

EDIT:

I should clarify how the counts will be made. I need to approach the date ranges in 3 ways in order to capture which file_nums existed within each decade.

  1. If it started between the decade range, count it.
  2. If it ended between the decade range, count it.
  3. If it started before the decade range and ended after the decade range, count it.

I crafted up the following SQL. I am still trying to figure out if it's doing what I want it to do. If anyone has suggestion

with decades as
  (
  select
    concat(concat(floor(extract(year from t.start_date)/10)*10, '-'), floor(extract(year from t.start_date)/10)*10 + 9) DECADE
  from 
    table t
  group by
    concat(concat(floor(extract(year from t.start_date)/10)*10, '-'), floor(extract(year from t.start_date)/10)*10 + 9)
  )

select
  count(a.file_num),
  decades.decade
from
  table a,
  decades
where 
  extract(year from a.start_date) between extract(year from to_date(SUBSTR(decades.decade,0,4), 'YYYY')) and extract(year from to_date(SUBSTR(decades.decade,6,9), 'YYYY'))
  or
  extract(year from a.end_date) between extract(year from to_date(SUBSTR(decades.decade,0,4), 'YYYY')) and extract(year from to_date(SUBSTR(decades.decade,6,9), 'YYYY'))
  or
  (extract(year from a.start_date) < extract(year from to_date(SUBSTR(decades.decade,0,4), 'YYYY'))
   and
   extract(year from a.end_date) > extract(year from to_date(SUBSTR(decades.decade,6,9), 'YYYY'))
  )  
group by
  decades.decade
order by
  decades.decade

Upvotes: 1

Views: 622

Answers (3)

Mahamoutou
Mahamoutou

Reputation: 1555

You can also take advantage of the connect by clause to do that. I assume you don't have any duplicate rows (these columns FILE_NUM, START_DATE, END_DATE should be unique) in your real data.

SELECT dec_start, dec_end, COUNT(*) nb
FROM (
    SELECT t.*, level
      , 10 * trunc( extract ( year from (START_DATE) ) / 10 ) + 10 * LEVEL - 10 dec_start
      , 10 * trunc( extract ( year from (START_DATE) ) / 10 ) + 10 * LEVEL - 1 dec_end
    FROM YourTable T
    CONNECT BY 
        10 * TRUNC( EXTRACT ( YEAR FROM (START_DATE) ) / 10 ) + 10 * LEVEL - 10 
            <   10 * CEIL( EXTRACT ( YEAR FROM (END_DATE) ) / 10 )
    AND PRIOR FILE_NUM = FILE_NUM
    AND PRIOR START_DATE = START_DATE
    AND PRIOR END_DATE = END_DATE
    AND PRIOR SYS_GUID() IS NOT NULL
)
group by dec_start, dec_end 
order by dec_start, dec_end
;

demo on db<>fiddle

Upvotes: 2

DRapp
DRapp

Reputation: 48139

You should be able to pull something together by casting the year of the date divided by 10 to an integer, then multiplying that by 10. Example...

2019 / 10 = 201.9 cast as integer = 201 * 10 = 2010

Then, add 9 years to above for the ending year

2019 + 10 = 2029 / 10 = 202.9 cast as integer = 202 * 10 = 2020 + 9 = 2029

So, all you should care about is the beginning date in question... something like

select
      concat( cast( EXTRACT(YEAR FROM m.start_date ) / 10 as int ) * 10, 
             concat( '-', 
             cast( EXTRACT(YEAR FROM m.start_date ) / 10 as int ) * 10 + 9)),
      count(*) as NumRecs
   from
      table m
   group by 1

Upvotes: 1

Yusuf
Yusuf

Reputation: 212

best option could be have a decade dimension table:

enter image description here

And do join with main table:

select d.decade, 
       count(file_num) as count 
  from table m
  join decade d
    on m.start_date >= d.decade_start_date
  join decade d2
    on m.end_date <= d2.decade_end_date
 where 1 = 1
   and d.decade = d2.decade 
 group by 1

Upvotes: 0

Related Questions