Reputation: 20
I have two data frames with the same data set from two different weeks. I want to remove the data in one week for the corresponding item in the same column for the other week. The tricky part is the order changes so the item in row 1 this week may have been row 10 last week. I have a unique field to identify which row, I just can't find anything that'll easily do that. There are 34 columns whose name change every week so hardcoding the columns is not viable either.
product1 = c(a,b,c)
product2 = c(a,c,b)
week11 = c(14,10,11)
week12 = c(12,10,12)
week21 = c(13,10,9)
week22 = c(9,10,10)
lw = data.frame(product1,week11,week12)
cw = data.frame(product2,week21,week22)
Again, I'm trying to subtract to find the variance. In excel, I'd do simple index/match but I'm trying to automate this. Foe some reason, I can't seem to figure it out in R. The output I'd want would be something like this
product Week1 Week2
a -1 -3
b -1 0
c -1 -2
Upvotes: 0
Views: 140
Reputation: 389275
We can use match
to reorder any one of the dataframe based on another and then subtract the two dataframes element by element.
cw <- cw[match(lw$product1, cw$product2), ]
cbind(cw[1], cw[-1] - lw[-1])
# product2 week21 week22
#1 a -1 -3
#3 b -1 0
#2 c -1 -2
In case, there are certain products which are present in one and not in another. We can do a full join, replace NA
's with 0 and then subtract
out <- merge(lw, cw, all = TRUE, by.x = 'product1', by.y = 'product2')
out[is.na(out)] <- 0
out[names(cw)[-1]] - out[names(lw)[-1]]
Upvotes: 1
Reputation: 73702
You could just subtract the columns after ordering.
cw[order(cw$product2), 2:3] - lw[order(lw$product1), 2:3]
# week21 week22
# 1 -1 -3
# 3 -1 0
# 2 -1 -2
To get the expected output, do
setNames(cbind(sort(lw$product1), cw[order(cw$product2), 2:3] - lw[order(lw$product1), 2:3]),
c("product", "week1", "week2"))
# product week1 week2
# 1 a -1 -3
# 3 b -1 0
# 2 c -1 -2
If the product names are not so nicely matching as in the example, you could use intersect
.
i <- sort(intersect(cw2$product2, lw$product1))
cw2[cw2$product2 %in% i, 2:3] - lw[lw$product1 %in% i, 2:3]
# week21 week22
# 1 -1 -3
# 3 -1 0
Data:
lw <- structure(list(product1 = structure(1:3, .Label = c("a", "b",
"c"), class = "factor"), week11 = c(14, 10, 11), week12 = c(12,
10, 12)), class = "data.frame", row.names = c(NA, -3L))
cw <- structure(list(product2 = structure(c(2L, 3L, 1L), .Label = c("a",
"b", "e"), class = "factor"), week21 = c(13, 10, 9), week22 = c(9,
10, 10)), class = "data.frame", row.names = c(NA, -3L))
cw2 <- structure(list(product2 = structure(c(2L, 3L, 1L), .Label = c("a",
"b", "e"), class = "factor"), week21 = c(13, 10, 9), week22 = c(9,
10, 10)), class = "data.frame", row.names = c(NA, -3L))
Upvotes: 0