Kimberly Peter
Kimberly Peter

Reputation: 103

How to subtract a value using two different data frames of different lengths with matching variables

I have two data sets. I would like to match the datasets by variable A and then subtract the value from dataset2 from dataset 1.

df1 <- data.frame(A = c("1", "2","3"),
              B = c(10, 20, 30))
df2 <- data.frame(A = c("1", "1","1","2","2","2","3","3","3"),
              B = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
              C = c(100, 125, 150, 100, 150, 200, 100, 200, 300))

I would like df2 to have an extra column "D" which is the value for df2$C - df1$B matched by column A. ex 100-10 125-10 150-10 100-20 150-20 ...

df2 <- data.frame(A = c("1", "1","1","2", "2","2","3", "3", "3"),
              B = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
              C = c(100, 125, 150, 100, 150, 200, 100, 200, 300),
              D = c(90, 115, 140, 80, 130, 180, 70, 170, 270))

How should I go about creating df2$D?

Upvotes: 1

Views: 121

Answers (2)

akrun
akrun

Reputation: 887048

In base R with match

df2$D <- with(df2, C - df1$B[match(A, df1$A)])

Upvotes: 1

denis
denis

Reputation: 5673

with base R

df1 <- data.frame(A = c("1", "2","3"),
                  B = c(10, 20, 30))
df2 <- data.frame(A = c("1", "1","1","2","2","2","3","3","3"),
                  B = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
                  C = c(100, 125, 150, 100, 150, 200, 100, 200, 300))


df <- merge(df1,df2,by = "A")
df$D <- df$C-df$B.x
df$B <- df$B.y
df[,c("B.x","B.y")] <- NULL

> df
  A   C   D B
1 1 100  90 1
2 1 125 115 2
3 1 150 140 3
4 2 100  80 1
5 2 150 130 2
6 2 200 180 3
7 3 100  70 1
8 3 200 170 2
9 3 300 270 3

with data.table you can update the join directly:

library(data.table)
df1 <- setDT(df1)
df2 <- setDT(df2)
df2[df1,D := C-i.B,on = "A"]

> df2
   A B   C   D
1: 1 1 100  90
2: 1 2 125 115
3: 1 3 150 140
4: 2 1 100  80
5: 2 2 150 130
6: 2 3 200 180
7: 3 1 100  70
8: 3 2 200 170
9: 3 3 300 270

with dplyr:

library(dplyr)

df2 %>%
  merge(df1,by = "A") %>%
  mutate(D = C - B.y,
         B = B.x,
         B.x = NULL,
         B.y = NULL) 

Upvotes: 1

Related Questions