Tricky merge with dates in r - expand rows from one df to match another

Question

So I have two dataframes, in a tidy format:

  df1 <- data.frame(date=as.Date(paste0('2018-12-',c(11,15,18,22,25,29))), balance=c(-500,-250,0,250,-300,500), account='salary')
  df2 <- data.frame(date=as.Date(paste0('2018-12-',c(16,22,27))), balance=c(1000, 700, 250), account='budget')

Now, this is balance of my budget account and my daily account. the sum of the "balance" column from both thee dataframes would give the amount of money I have on any given day.

However, as there is only a row in the dataframe if a transfer that changes the balance is made, it complicates the computation. The merge has to be done, so for every time there is a row in the one dataframe, there has to be row in the other dataframe, that correponds to the balance in that account on that day. So the result in the toy example would be this:

  df.result <- data.frame(date=as.Date(paste0('2018-12-',c(11,15,16,18,22,25,27,29))), balance.salary=c(-500,-250,-250,0,250,-300,-300,500), balance.budget=c(1000,1000,1000,1000,700,700,250,250))

Notice how even though I don't have information for the budget-account from the first date that the salay-account has a row, I'm using the information from the first time there is a row from the budget account.

here I have changed the column names for the balance-variable, so that one row can have the balance for both, but this is not the essential part of the solution, only that the result can be computed like this:

  df.result$balance.total <- df.result$balance.salary + df.result$balance.budget

I have tried using crossing() as per this answer, Copying row from one df into everyone row in another, but isn't useful in this case, as far as I can tell.

Thank you.

PS I prefer a non-tidyverse solution - I'm a big fan of data.table myself - but beggars can't be choosers :)

Ronak Shah · Accepted Answer

A mixture of base R and zoo solution. We first merge both the dataframe by date and fill the missing values using na.locf.

library(zoo)
na.locf(na.locf(merge(df1[-3], df2[-3], all = TRUE, by = "date")), fromLast = TRUE)


#        date balance.x balance.y
#1 2018-12-11      -500      1000
#2 2018-12-15      -250      1000
#3 2018-12-16      -250      1000
#4 2018-12-18         0      1000
#5 2018-12-22       250       700
#6 2018-12-25      -300       700
#7 2018-12-27      -300       250
#8 2018-12-29       500       250

Tricky merge with dates in r - expand rows from one df to match another

Answers (2)

Related Questions