Reputation: 566
I have two dataframes, each with the same columns. Some columns have the same values in the same order in both dataframes (X1
, X2
below). Other columns have the same values, but in a different order (Y1
). This is only a problem for some levels of first variables (here, the order of rows in Y1 differs for X1 == "a"
, but not X1 == "b"
). Example:
df1 <- data.frame("X1" = c("a", "a", "a", "b", "b", "b"),
"X2" = c("1", "2", "3", "1", "2", "3"),
"Y1" = c("d", "d", "f", "g", "h", "i"))
df2 <- data.frame("X1" = c("a", "a", "a", "b", "b", "b"),
"X2" = c("1", "2", "3", "1", "2", "3"),
"Y1" = c("f", "d", "d", "g", "h", "i"))
I would like to change the values of df2$X1
and df2$X2
such that the two dataframes are matched on values of Y1
.
I would like to change X1
and X2
rather than Y1
because there are many Y
variables. I would like to do this only for df$X1 == "a"
.
The output should looks like this:
df2 <- data.frame("X1" = c("a", "a", "a", "b", "b", "b"),
"X2" = c("3", "1", "2", "1", "2", "3"),
"Y1" = c("f", "d", "d", "g", "h", "i"))
Upvotes: 1
Views: 972
Reputation: 7327
What is a little tricky in your situation is that you have duplicates in the Y1
columns which correspond to different values in the X2
columns. So you will have to make these unique.
First, make sure that your Y1
columns are character vectors and not factors:
df1 <- data.frame("X1" = c("a", "a", "a", "b", "b", "b"),
"X2" = c("1", "2", "3", "1", "2", "3"),
"Y1" = c("d", "d", "f", "g", "h", "i"),
stringsAsFactors = F)
df2 <- data.frame("X1" = c("a", "a", "a", "b", "b", "b"),
"X2" = c("1", "2", "3", "1", "2", "3"),
"Y1" = c("f", "d", "d", "g", "h", "i"),
stringsAsFactors = F)
Give unique names to your Y1
duplicates:
df1$Y1uniq <- make.unique(df1$Y1)
df2$Y1uniq <- make.unique(df2$Y1)
Then you can use match()
using those uniques values (and remove that column once you don't need it anymore):
df1[match(df2$Y1uniq, df1$Y1uniq), ][ , 1:3]
Output:
X1 X2 Y1
3 a 3 f
1 a 1 d
2 a 2 d
4 b 1 g
5 b 2 h
6 b 3 i
Upvotes: 1