Reputation: 9
I have 2 files of 3 columns and hundreds of rows. I want to compare and list the common elements of first two columns of the two files. Then the list which i will get after comparing i have to add the third column of second file to that list. Third column will contain the values which were in the second file corresponding to numbers of remaining two columns which i have got as common to both the files. For example, consider two files of 6 rows and 3 columns First file -
1 2 3
2 3 4
4 6 7
3 8 9
11 10 5
19 6 14
second file -
1 4 1
2 1 4
4 6 10
3 7 2
11 10 3
19 6 5
As i said i have to compare the first two columns and then add the third column of second file to that list. Therefore, output must be:
4 6 10
11 10 3
19 6 5
I have the following code, however its showing an error object not found also i am not able to add the third column. Please help :) df2 = reading first file, df3 = reading second file. Code is in R language.
s1 = 1
for(i in 1:nrow(df2)){
for(j in 1:nrow(df3)){
if(df2[i,1] == df3[j,1]){
if(df2[i,2] == df3[j,2]){
common.rows1[s1,1] <- df2[i,1]
common.rows1[s1,2] <- df2[i,2]
s1 = s1 + 1
}
}
}
Upvotes: 0
Views: 479
Reputation: 92282
data.table
proposal
library(data.table)
setDT(df1)
setDT(df2)
setkey(df1, V1, V2)
setkey(df2, V1, V2)
df2[df1[, -3, with = F], nomatch = 0]
## V1 V2 V3
## 1: 4 6 10
## 2: 11 10 3
## 3: 19 6 5
Upvotes: 1
Reputation: 2747
It seems that this is the perfect use-case for merge, e.g.
merge(d1[c('V1','V2')],d2)
results in:
V1 V2 V3
1 11 10 3
2 19 6 5
3 4 6 10
In which 'V1' and 'V2' are the column names of interest.
Upvotes: 1
Reputation: 70266
You can use the %in%
operator twice to subset your second data.frame (I call it df2
):
df2[df2$V1 %in% df1$V1 & df2$V2 %in% df1$V2,]
# V1 V2 V3
#3 4 6 10
#5 11 10 3
#6 19 6 5
V1
and V2
in my example are the column names of df1
and df2
.
Upvotes: 1
Reputation: 206197
If your two tables are d1
and d2
,
d1<-data.frame(
V1 = c(1, 2, 4, 3, 11, 19),
V2 = c(2, 3, 6, 8, 10, 6),
V3 = c(3, 4, 7, 9, 5, 14)
)
d2<-data.frame(
V1 = c(1, 2, 4, 3, 11, 19),
V2 = c(4, 1, 6, 7, 10, 6),
V3 = c(1, 4, 10, 2, 3, 5)
)
then you can subset d2
(in order to keep the third column) with
d2[interaction(d2$V1, d2$V2) %in% interaction(d1$V1, d1$V2),]
The interaction()
treats the first two columns as a combined key.
Upvotes: 0