Leo
Leo

Reputation: 86

Trying to add missing quarters to Name and Product in R

I have a dataset similar to this sample except with several thousand lines with more different Names and Products:

Name <- c(rep("Tom",24),
      rep("Jake",8),
      rep("Jill",5))
Product <- c(rep("Orange",12),rep("Apples",12),
         rep("Orange",8),rep("Apples",5))
Quarter <- c("2013-03-31","2013-06-30","2013-09-30","2013-12-31",
         "2014-03-31","2014-06-30","2014-09-30","2014-12-31",
         "2015-03-31","2015-06-30","2015-09-30","2015-12-31",
         "2013-03-31","2013-06-30","2013-09-30","2013-12-31",
         "2014-03-31","2014-06-30","2014-09-30","2014-12-31",
         "2015-03-31","2015-06-30","2015-09-30","2015-12-31",
         "2013-12-31","2014-03-31","2014-06-30","2014-09-30",
         "2014-12-31","2015-03-31","2015-06-30","2015-09-30",
         "2014-09-30","2014-12-31","2015-03-31","2015-06-30",
         "2015-12-31")
sample <- cbind(Name,Product,Quarter)

I am trying to expand the quarters to fill in the missing quarters for each Name and Product from 2013-03-31 all the way to 2015-12-31.

I have tried using the expand.grid function to add the missing quarters, however, it just creates all combinations using the factors. For example using

expanded <- expand.grid(Name = unique(sample$Name),
                    Product = unique(sample$Product),
                    Quarter = unique(sample$Quarter)) 

Forces Jake to have products Apple when he should have only oranges and Jill to include Oranges when she should only have apples.

Preferably I would want the data set to looks like this:

Name <- c(rep("Tom",24),
      rep("Jake",12),
      rep("Jill",12))
Product <- c(rep("Orange",12),rep("Apples",12),
         rep("Orange",12),rep("Apples",12))
Quarter <- c(rep(c("2013-03-31","2013-06-30","2013-09-30","2013-12-31",
         "2014-03-31","2014-06-30","2014-09-30","2014-12-31",
         "2015-03-31","2015-06-30","2015-09-30","2015-12-31"),4))
preferred <- cbind(Name,Product,Quarter)

Any help is appreciated. Thanks!

Upvotes: 1

Views: 58

Answers (1)

Steven Beaupr&#233;
Steven Beaupr&#233;

Reputation: 21621

First we convert sample to a data.frame

df <- data.frame(sample)

Then we use expand() from the tidyr package:

tidyr::expand(df, nesting(Name, Product), Quarter)

Which gives:

#Source: local data frame [48 x 3]
#
#     Name Product    Quarter
#   (fctr)  (fctr)     (fctr)
#1    Jake  Orange 2013-03-31
#2    Jake  Orange 2013-06-30
#3    Jake  Orange 2013-09-30
#4    Jake  Orange 2013-12-31
#5    Jake  Orange 2014-03-31
#6    Jake  Orange 2014-06-30
#7    Jake  Orange 2014-09-30
#8    Jake  Orange 2014-12-31
#9    Jake  Orange 2015-03-31
#10   Jake  Orange 2015-06-30
#..    ...     ...        ...

Upvotes: 1

Related Questions