Reputation: 526
I haven't found an exact answer for what I'm trying to do. I often have two dataframes to subtract, each containing a "name" column. Example:
df1 <- data.frame(name = c("name1","name2","name3","name4"),
month1 = c(5,6,7,8),
month2 = c(10,11,12,13),
month3 = c(15,16,17,18))
df2 <- data.frame(name = c("name1","name2","name3","name4"),
month1 = c(22,23,24,25),
month2 = c(31,34,35,39),
month3 = c(42,43,45,46))
What I would very simply like to do is have a df3 that is the subtraction df2 - df1 but retains the name column:
library(dplyr)
df3 <- df1 %>%
select("name")
temp <- df2[,-c(1)] - df1[,-c(1)]
df3 <- bind_cols(df3,temp)
print(df3)
name month1 month2 month3
1 name1 17 21 27
2 name2 17 23 27
3 name3 17 23 28
4 name4 17 26 28
Now, it's only three short lines of code. However, is there no "one-liner" function that can subtract the dataframes while specifying that the "name" column should be retained? It would essentially do the same as df2[,-c(1)] - df1[,-c(1)], but immediately re-add the "name" column, rather than splitting the dataframe. Is that possible?
Upvotes: 0
Views: 47
Reputation: 9878
You can use dplyr and purrr:
library(dplyr)
library(purrr)
map2_dfc(df2[-1], df1[-1], ~ .x - .y) %>% cbind(df2[1], .)
name month1 month2 month3
1 name1 17 21 27
2 name2 17 23 27
3 name3 17 23 28
4 name4 17 26 28
You can also wrap that inside a custom function:
subtract_dfs<-function(df_1, df_2){
purrr::map2_dfc(df_1[-1], df_2[-1], ~ .x - .y) %>% cbind(df_1[1], .)
}
EDIT
There is no need for a mapping function here, as the data frames can be subtracted all at once:
cbind(df2[1], df2[-1] - df1[-1])
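Note that this one-liner subtracts by position, so it assumes both data frames list the same names in the same row order and have the same columns in the same order. A minimal sketch of a base R wrapper that checks those assumptions first (the function name subtract_aligned and its key argument are made up for illustration, they are not from any package):

```r
# Hypothetical helper: subtract df_b from df_a and keep the key column.
subtract_aligned <- function(df_a, df_b, key = "name") {
  # Positional subtraction is only meaningful if columns and rows line up.
  stopifnot(
    identical(names(df_a), names(df_b)),
    identical(as.character(df_a[[key]]), as.character(df_b[[key]]))
  )
  # Select the value columns by name rather than assuming the key is column 1.
  value_cols <- setdiff(names(df_a), key)
  cbind(df_a[key], df_a[value_cols] - df_b[value_cols])
}

df1 <- data.frame(name = c("name1", "name2", "name3", "name4"),
                  month1 = c(5, 6, 7, 8),
                  month2 = c(10, 11, 12, 13),
                  month3 = c(15, 16, 17, 18))
df2 <- data.frame(name = c("name1", "name2", "name3", "name4"),
                  month1 = c(22, 23, 24, 25),
                  month2 = c(31, 34, 35, 39),
                  month3 = c(42, 43, 45, 46))

res <- subtract_aligned(df2, df1)
print(res)
```

With the example data this should give the same table as above; if the names differ or are in a different order, it stops with an error instead of silently subtracting mismatched rows.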
Upvotes: 1
Reputation: 6663
Your solution is already close to a one liner. By writing it differently you can make it a one-liner like this:
bind_cols(name = df1$name, df2[,-1] - df1[,-1])
But I don't think that is an actual improvement, as you are losing some of the readability your original solution has.
You are saying that you do this frequently. It might be a good idea to write a function for this yourself that you can then re-use.
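subtract_dfs <-
function(df1, df2, name = "name") {
  # subtract every column except the name column (df2 - df1), then re-attach it
  value_cols <- setdiff(names(df1), name)
  bind_cols(df1[name], df2[value_cols] - df1[value_cols])
}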
Now you can do:
subtract_dfs(df1, df2)
This allows for the name variable to have custom values. The function could be further improved. For example, it could be extended to give correct results even if not all values for name are present in both data frames.
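One possible way to handle that last case is to match rows by name before subtracting. This is only a sketch using base R's merge(); the function name subtract_by_key, its key argument and the small example data frames are hypothetical. It assumes the values of name are unique within each data frame and that only columns present in both data frames should be subtracted:

```r
# Hypothetical helper: match rows on `key`, then subtract df_b from df_a.
subtract_by_key <- function(df_a, df_b, key = "name") {
  # Only columns that exist in both data frames (other than the key) are used.
  value_cols <- setdiff(intersect(names(df_a), names(df_b)), key)
  # merge() keeps only keys present in both inputs by default (inner join)
  # and appends the suffixes to the shared non-key columns.
  joined <- merge(df_a, df_b, by = key, suffixes = c(".a", ".b"))
  diffs <- joined[paste0(value_cols, ".a")] - joined[paste0(value_cols, ".b")]
  names(diffs) <- value_cols
  cbind(joined[key], diffs)
}

# Example with partially overlapping names, in different row order.
a <- data.frame(name = c("name3", "name1", "name2"), month1 = c(24, 22, 23))
b <- data.frame(name = c("name2", "name3", "name4"), month1 = c(6, 7, 8))

res_key <- subtract_by_key(a, b)
print(res_key)
```

Here only name2 and name3 appear in the result, because name1 and name4 each exist in just one of the inputs. Whether unmatched names should be dropped, as in this sketch, or kept with NA is a design choice that depends on the use case.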
Upvotes: 2