Subtracting Two Dataframes - Retaining first Column containing names as characters

Question

I haven't found an exact answer for what I'm trying to do. I often have two dataframes to subtract, each containing a "name" column. Example:

df1 <- data.frame(name = c("name1","name2","name3","name4"),
                  month1 = c(5,6,7,8),
                  month2 = c(10,11,12,13),
                  month3 = c(15,16,17,18))

df2 <- data.frame(name = c("name1","name2","name3","name4"),
                  month1 = c(22,23,24,25),
                  month2 = c(31,34,35,39),
                  month3 = c(42,43,45,46))

What I would like is a df3 that is the result of df2 - df1 but retains the name column:

library(dplyr)

# Keep only the name column
df3 <- df1 %>%
  select("name")

# Subtract the numeric columns (everything except column 1)
temp <- df2[,-c(1)] - df1[,-c(1)]

df3 <- bind_cols(df3,temp)

print(df3)

   name month1 month2 month3
1 name1     17     21     27
2 name2     17     23     27
3 name3     17     23     28
4 name4     17     26     28

Now, it's only three short lines of code. However, is there no "one-liner" function that can subtract the dataframes while retaining the "name" column? It would essentially do the same as df2[,-c(1)] - df1[,-c(1)], but immediately re-add the "name" column, rather than splitting the dataframe. Is that possible?

Till · Accepted Answer

Your solution is already close to a one-liner. By writing it differently you can make it a one-liner like this:

bind_cols(name = df1$name, df2[,-1] - df1[,-1])

But I don't think that is an actual improvement, as you are losing some of the readability your original solution has.

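You say that you do this frequently, so it might be a good idea to write a function for this that you can reuse.

subtract_dfs <- 
  function(df1, df2, name = "name") {
    # Subtract every column except the name column, selected by name
    # rather than by position, so `name` need not be the first column
    value_cols <- setdiff(names(df1), name)
    bind_cols(df1[name], df2[value_cols] - df1[value_cols])
  }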

Now you can do:

subtract_dfs(df1, df2)

The name argument lets you specify which column holds the names. The function could be improved further. For example, it could be extended to give correct results even if not all values of name are present in both data frames.
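
As a rough sketch of that extension, the rows can be matched by name before subtracting, so that row order and missing names no longer matter. The helper name subtract_dfs_matched is made up for illustration, and this version uses base R's merge() instead of bind_cols so it runs without dplyr. It assumes each name occurs at most once per data frame and keeps only names and value columns present in both.

```r
# Hypothetical helper (not from the original answer): computes df2 - df1
# after aligning rows on the name column.
subtract_dfs_matched <- function(df1, df2, name = "name") {
  # Value columns shared by both data frames, excluding the name column
  value_cols <- setdiff(intersect(names(df1), names(df2)), name)

  # Inner join: keeps only names found in both data frames.
  # Shared value columns get the suffixes ".x" (df1) and ".y" (df2).
  joined <- merge(df1, df2, by = name, suffixes = c(".x", ".y"))

  # Subtract the df1 columns from the matching df2 columns
  diffs <- joined[paste0(value_cols, ".y")] - joined[paste0(value_cols, ".x")]
  names(diffs) <- value_cols

  cbind(joined[name], diffs)
}

# Example with different row order and partially overlapping names
a <- data.frame(name = c("name1", "name2", "name3"), month1 = c(5, 6, 7))
b <- data.frame(name = c("name3", "name1", "name4"), month1 = c(24, 22, 25))
subtract_dfs_matched(a, b)
```

Here only name1 and name3 appear in both inputs, so the result has two rows. Note that merge() sorts the result by the name column by default. If unmatched names should be kept with NA values instead of being dropped, passing all = TRUE to merge() would be one way to do it.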
