user43080
user43080

Reputation: 3

Forcing rows to appear despite data missing

I have a large data set (called input below) which contains a variety of information such as sales dates, transaction dates, payments, sales.

The user can produce a report by year, quarter or month to show the amount of payment in a particular month/quarter/year of a certain sales year. So you could have a payment 5 years after the initial sale, i.e. a payment in 2016 relating to the sales year of 2011.

The user can decide whether they want to have these payment periods by month/quarter/year at the beginning of the code through the use of macro variables (i.e. %let ReportBasis = Year) and ReportBasis can be called on through the rest of the code without manual adjustments.

The report is produced using:

proc sql;
create table report as
select sales_year, &ReportBasis, Sum(Sales) as Sales
from input
group by sales_year, &ReportBasis;
quit;

Now the issue I am having is that if there is no payment in a particular period for all sales years in question, then there is no row for that period. This produces a layout problem.

I require a solution to this which can be dynamic due to the nature of the macro variable (changing from year to month to quarter).

So I have this (example):

2011    Month 11       100
2011    Month 12       250
2011    Month 13       85
2011    Month 15       90
2011    Month 16       300

But I require this:

2011    Month 11       100
2011    Month 12       250
2011    Month 13       85
2011    Month 14       0
2011    Month 15       90
2011    Month 16       300

where there is no actual payment in month 14 in all of my data (even other years 2012, 2013 etc.), so it doesn't show up in my first table, but the second table still cleverly knows to include it.

Thanks in advance.

Upvotes: 0

Views: 64

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270713

Here is one solution that assumes that all years and reports are represented in input, although not all combinations:

create table report as
    select sy.sales_year, r.rb, Sum(i.Sales) as Sales
    from (select distinct sales_year from input) sy cross join
         (select distinct &ReportBasis as rb from input) r left join
         input i
         on i.sales_year = sy.sales_year and i.&ReportBasis = r.&ReportBasis
    group by sy.sales_year, r.rb;

Upvotes: 1

Related Questions