Reputation: 73
ORIGINAL DATA
Person month Year Quantity
A 02 2018 900
A 04 2018 600
A 09 2018 300
A 04 2019 40
B 06 2018 56
B 01 2019 10
B 09 2019 20
REQUIRED OUTPUT
Person month Year Quantity
A 01 2018 0
A 02 2018 900
A 03 2018 0
A 04 2018 600
A 05 2018 0
A 06 2018 0
A 07 2018 0
A 08 2018 0
A 09 2018 300
A 10 2018 0
A 11 2018 0
A 12 2018 0
A 01 2019 0
A 01 2019 0
A 02 2019 0
A 03 2019 0
A 04 2019 40
A 05 2019 0
A 06 2019 0
A 07 2019 0
A 08 2019 0
A 09 2019 0
A 10 2019 0
A 11 2019 0
A 12 2019 0
B 01 2018 0
B 02 2018 0
B 03 2018 0
B 04 2018 0
B 05 2018 0
B 06 2018 56
B 07 2018 0
B 08 2018 0
B 09 2018 0
B 10 2018 0
B 11 2018 0
B 12 2018 0
B 01 2019 10
B 02 2019 0
B 03 2019 0
B 04 2019 0
B 05 2019 0
B 06 2019 0
B 07 2019 0
B 08 2019 0
B 09 2019 20
B 10 2019 0
B 11 2019 0
B 12 2019 0
Hello All,
I have the sample data above in a R data frame and I would like to add 0 "quantities" for each month where a data is missing for a "person". I've got missing months and year as you can see which I will like insert to my data frame for for each person. i.e. add the months-year values of 0 demand from a person, for the missing months. (Just to give some background , the initial data I had , had date for each demand but i grouped that in months-year as I extracted the month and year values from date field like below....
PersonMonthDemand <-
DateData %>%
mutate(month = month(DateOfDemand))%>%
mutate(year= year(DateOfDemand))%>%
group_by(Person, month, year) %>%
summarise(Quantity = sum(Quantity))
which resulted in the ORIGINAL data as shown above)
Will appreciate if some experts here can help please. Thanks in advance. Kind Regards, Gayatri
Upvotes: 1
Views: 60
Reputation: 26
Create a left-hand side base table.
To do this, take the unique values of all customers:
cstmr<-data.frame(unique(orgn_data$Person))
change column names
names(cstmr)[1]<-'Person'
Find all combination of date and year in a dataframe using:
orgn_data_year_mnth<-unique(orgn_data[c("month", "Year")])
cross join cstmr & organ_data_year_mnth using
cc<-merge(cstmr, organ_data_year_mnth)
Final step would be left join original data on cc
:
final_df<-merge(cc,orgn_data,by=c('Person','month','Year'),all.x=TRUE)
And then replace all NAs with 0 using:
final_df[is.na(final_df)] <- 0
Upvotes: 1
Reputation: 886998
We can use complete
from tidyr
library(tidyr)
library(dplyr)
df1 %>%
complete(Person, month = 1:12, Year, fill = list(Quantity = 0)) %>%
arrange(Person, Year)
# A tibble: 48 x 4
# Person month Year Quantity
# <chr> <int> <int> <dbl>
# 1 A 1 2018 0
# 2 A 2 2018 900
# 3 A 3 2018 0
# 4 A 4 2018 600
# 5 A 5 2018 0
# 6 A 6 2018 0
# 7 A 7 2018 0
# 8 A 8 2018 0
# 9 A 9 2018 300
#10 A 10 2018 0
# … with 38 more rows
###data
df1 <- structure(list(Person = c("A", "A", "A", "A", "B", "B", "B"),
month = c(2L, 4L, 9L, 4L, 6L, 1L, 9L), Year = c(2018L, 2018L,
2018L, 2019L, 2018L, 2019L, 2019L), Quantity = c(900L, 600L,
300L, 40L, 56L, 10L, 20L)), class = "data.frame", row.names = c(NA,
-7L))
Upvotes: 1