I have an Oracle a table that has 3 columns. FILE_NUM, START_DATE, END_DATE. FILE_NUM is a unique identifier and so each row is unique. FILE_NUM START_DATE END_DATE 1111 Apr. 04, 1977 Jun. 30, 1984 1112 Jan. 03, 1987 Sep. 05, 2010 1113 May. 14, 1962 Jan. 30, 1998 1114 Oct. 09, 2001 Aug. 22, 2025 1113 May. 14, 1962 Jan. 30, 1998 1114 Nov. 18, 2003 Aug. 22, 2028 etc I want to query a count of files that existed per decade. Something like: Decade Count 1960-1969 10 1970-1979 25 1980-1989 42 The logic is, if the file existed at any point in any decade, count it in each of those decades. So for file_num 1112, it should get counted in the 1980's, 1990's, 2000's, and 2010's. I don't even know if this is possible. I've tinkered around in my query a few times, but I'm not sure I know enough functions or technique to get this sort of result. Thanks for any help. EDIT: I should clarify how the counts will be made. I need to approach the date ranges in 3 ways in order to capture which file_nums existed within each decade. If it started between the decade range, count it. If it ended between the decade range, count it. If it started before the decade range and ended after the decade range, count it. I crafted up the following SQL. I am still trying to figure out if it's doing what I want it to do. If anyone has suggestion with decades as ( select concat(concat(floor(extract(year from t.start_date)/10)*10, '-'), floor(extract(year from t.start_date)/10)*10 + 9) DECADE from table t group by concat(concat(floor(extract(year from t.start_date)/10)*10, '-'), floor(extract(year from t.start_date)/10)*10 + 9) ) select count(a.file_num), decades.decade from table a, decades where extract(year from a.start_date) between extract(year from to_date(SUBSTR(decades.decade,0,4), 'YYYY')) and extract(year from to_date(SUBSTR(decades.decade,6,9), 'YYYY')) or extract(year from a.end_date) between extract(year from to_date(SUBSTR(decades.decade,0,4), 'YYYY')) and extract(year from to_date(SUBSTR(decades.decade,6,9), 'YYYY')) or (extract(year from a.start_date) < extract(year from to_date(SUBSTR(decades.decade,0,4), 'YYYY')) and extract(year from a.end_date) > extract(year from to_date(SUBSTR(decades.decade,6,9), 'YYYY')) ) group by decades.decade order by decades.decade

Reputation: 4435

SQL to Count Records that Existed by Decade

I have an Oracle a table that has 3 columns. FILE_NUM, START_DATE, END_DATE. FILE_NUM is a unique identifier and so each row is unique.

FILE_NUM	START_DATE	END_DATE
1111	Apr. 04, 1977	Jun. 30, 1984
1112	Jan. 03, 1987	Sep. 05, 2010
1113	May. 14, 1962	Jan. 30, 1998
1114	Oct. 09, 2001	Aug. 22, 2025
1113	May. 14, 1962	Jan. 30, 1998
1114	Nov. 18, 2003	Aug. 22, 2028
etc

I want to query a count of files that existed per decade. Something like:

Decade	Count
1960-1969	10
1970-1979	25
1980-1989	42

The logic is, if the file existed at any point in any decade, count it in each of those decades. So for file_num 1112, it should get counted in the 1980's, 1990's, 2000's, and 2010's.

I don't even know if this is possible. I've tinkered around in my query a few times, but I'm not sure I know enough functions or technique to get this sort of result.

Thanks for any help.

EDIT:

I should clarify how the counts will be made. I need to approach the date ranges in 3 ways in order to capture which file_nums existed within each decade.

If it started between the decade range, count it.
If it ended between the decade range, count it.
If it started before the decade range and ended after the decade range, count it.

I crafted up the following SQL. I am still trying to figure out if it's doing what I want it to do. If anyone has suggestion

with decades as
  (
  select
    concat(concat(floor(extract(year from t.start_date)/10)*10, '-'), floor(extract(year from t.start_date)/10)*10 + 9) DECADE
  from 
    table t
  group by
    concat(concat(floor(extract(year from t.start_date)/10)*10, '-'), floor(extract(year from t.start_date)/10)*10 + 9)
  )

select
  count(a.file_num),
  decades.decade
from
  table a,
  decades
where 
  extract(year from a.start_date) between extract(year from to_date(SUBSTR(decades.decade,0,4), 'YYYY')) and extract(year from to_date(SUBSTR(decades.decade,6,9), 'YYYY'))
  or
  extract(year from a.end_date) between extract(year from to_date(SUBSTR(decades.decade,0,4), 'YYYY')) and extract(year from to_date(SUBSTR(decades.decade,6,9), 'YYYY'))
  or
  (extract(year from a.start_date) < extract(year from to_date(SUBSTR(decades.decade,0,4), 'YYYY'))
   and
   extract(year from a.end_date) > extract(year from to_date(SUBSTR(decades.decade,6,9), 'YYYY'))
  )  
group by
  decades.decade
order by
  decades.decade

Upvotes: 1

Answers (3)

Mahamoutou

Reputation: 1555

You can also take advantage of the connect by clause to do that. I assume you don't have any duplicate rows (these columns FILE_NUM, START_DATE, END_DATE should be unique) in your real data.

SELECT dec_start, dec_end, COUNT(*) nb
FROM (
    SELECT t.*, level
      , 10 * trunc( extract ( year from (START_DATE) ) / 10 ) + 10 * LEVEL - 10 dec_start
      , 10 * trunc( extract ( year from (START_DATE) ) / 10 ) + 10 * LEVEL - 1 dec_end
    FROM YourTable T
    CONNECT BY 
        10 * TRUNC( EXTRACT ( YEAR FROM (START_DATE) ) / 10 ) + 10 * LEVEL - 10 
            <   10 * CEIL( EXTRACT ( YEAR FROM (END_DATE) ) / 10 )
    AND PRIOR FILE_NUM = FILE_NUM
    AND PRIOR START_DATE = START_DATE
    AND PRIOR END_DATE = END_DATE
    AND PRIOR SYS_GUID() IS NOT NULL
)
group by dec_start, dec_end 
order by dec_start, dec_end
;

demo on db<>fiddle

Upvotes: 2

DRapp

Reputation: 48179

You should be able to pull something together by casting the year of the date divided by 10 to an integer, then multiplying that by 10. Example...

2019 / 10 = 201.9 cast as integer = 201 * 10 = 2010

Then, add 9 years to above for the ending year

2019 + 10 = 2029 / 10 = 202.9 cast as integer = 202 * 10 = 2020 + 9 = 2029

So, all you should care about is the beginning date in question... something like

select
      concat( cast( EXTRACT(YEAR FROM m.start_date ) / 10 as int ) * 10, 
             concat( '-', 
             cast( EXTRACT(YEAR FROM m.start_date ) / 10 as int ) * 10 + 9)),
      count(*) as NumRecs
   from
      table m
   group by 1

Upvotes: 1

Yusuf

Reputation: 212

best option could be have a decade dimension table:

And do join with main table:

select d.decade, 
       count(file_num) as count 
  from table m
  join decade d
    on m.start_date >= d.decade_start_date
  join decade d2
    on m.end_date <= d2.decade_end_date
 where 1 = 1
   and d.decade = d2.decade 
 group by 1

Upvotes: 0

SQL to Count Records that Existed by Decade

Answers (3)

Related Questions