Calibre2010
Calibre2010

Reputation: 3849

SQL - I am trying to insert months of a calendar year that are missing from a dataset

The dataset is structured as

    id,
    year,
    month 
    )

and sample data is as below

1, 2020, 01
1, 2020, 02
1, 2020, 11
2, 2020, 01
2, 2020, 02
2, 2020, 03
2, 2021, 01

I want to insert the missing months into the dataset per id and year, so the first

1, 2020, 01
1, 2020, 02
1, 2020, 11

is missing months

1, 2020,04 
1, 2020,05 
1, 2020,06
1, 2020,07 
1, 2020,08
1, 2020,09 and 
1, 2020,12 

I'm trying to insert these values

I have tried a few queries to check where the month and year are not in the set but am having no luck.

Upvotes: 0

Views: 117

Answers (1)

qaziqarta
qaziqarta

Reputation: 1944

Generate months for a full year for each id, left join with input and take what's empty on the right side of join.

  1. Redshift. Redshift's generate_series doesn't generate dates, and it's leader-node-only function, so can't use it while joining or inserting into tables. But we can generate months with a recursive query instead:
-- Generate months between min(year)-01-01 and max(year)-12-01 for each id
with recursive all_months(id, dt, max_dt) as (
  select 
    id, 
    to_date(min(year)::text || '-01', 'YYYY-MM') dt,
    to_date(max(year)::text || '-12', 'YYYY-MM') max_dt
  from test 
  group by id 
  union all 
  select 
    id,
    add_months(dt, 1)::date dt,
    max_dt
  from all_months m
  where dt < max_dt)
-- left join generated months with our data, 
-- and return months for which there's no corresponding joined month
select 
  a.id, 
  extract(year from a.dt) "year", 
  lpad(extract(month from a.dt)::text, 2, '0') "month"
from all_months a
left join test t 
on a.id = t.id and to_date(t.year::text || '-' || t.month::text, 'YYYY-MM') = a.dt
where t.id is null
order by id, year, month
  1. For Postgres 9.4 and above:
    with all_months as (
      select 
        id,
        generate_series(
          make_date(min(year), 1, 1),
          make_date(max(year), 12, 1),
          '1 month'::interval
        )::date dt
      from test
      group by id
    )
    select 
      a.id, 
      extract(year from a.dt) "year", 
      lpad(extract(month from a.dt)::text, 2, '0') "month"
    from all_months a
    left join test t 
    on a.id = t.id and make_date(t.year, t.month::int, 1) = a.dt
    where t.id is null
  1. For Postgres 9 and up to 9.3:
with all_months as (
  select 
    id,
    generate_series(
      to_date(min(year)::text || '-01', 'YYYY-MM'),
      to_date(max(year)::text || '-12', 'YYYY-MM'),
      '1 month'::interval
    )::date dt
  from test
  group by id
)
select 
  a.id, 
  extract(year from a.dt) "year", 
  lpad(extract(month from a.dt)::text, 2, '0') "month"
from all_months a
left join test t 
on a.id = t.id and to_date(t.year::text || '-' || t.month::text, 'YYYY-MM') = a.dt
where t.id is null

db<>fiddle here

Upvotes: 1

Related Questions