Reputation: 1945
I have a data frame concerning the purchases of a shop owner. They don't happen on a daily basis. It has two columns: the first one describes the date, the second one the quantity bought in that date.
I would like to transform it into daily data, completing the original dataset, so I created a sequence:
a <- seq(as.Date("2013/11/19"), as.Date("2017/04/22"), "days")
The first date corresponds to the one of the first purchase, and the second one of the last one, of the original dataset.
The classes are both "Date".
How can I merge the two dataset by "date", even if, obviously, they have different rows length? I would like to have a dataframe with daily "Date" as first column, and "Quantity" on the second one, with zeros where purchases didn't happen.
Upvotes: 1
Views: 4829
Reputation: 2644
Using base R:
# create sample data frame with sales data
test <- data.frame(date = as.Date(c("2017/08/12", "2017/08/15", "2017/09/02")), quantity = c(3,2,1))
# create the date range
dates <- data.frame(date = seq(min(test$date), max(test$date), by = "day"))
# perform the left join
# (keeping all rows from "dates", and joining the sales dataset to them)
result <- merge(dates, test, by.y = "date", by.x = "date", all.x = TRUE)
In the merge function, by.y and by.x are the columns used to join the dataset, while all.x
tells you, that all rows from x
(in this case dates
) should be kept in the resulting data frame.
Upvotes: 3