Chetan Arvind Patil
Chetan Arvind Patil

Reputation: 866

Compare Two Data Frame And Keep Columns That Match For Specific Row

I have two data frames that have a row which are unique keys and are present in both data set. I want to keep only those columns in both where specific row value matches.

For below data Data_1 has first row as 1,2,3,4, however Data_2 only has 1,2,3. I want compare or subset such that Data_1 only has row first as 1,2,3, as those keys are only present in both data frames.

Goal is not to merge, but to subset based on comparison of two different data frames.

Any suggestions on how to get this done?

Data_1

read.table(text = "A    B    C    D
       1    2    3    4
       1.1  1.2  1.3  1.4
       1.1  1.2  1.3  1.4
       1.1  1.2  1.3  1.4
       1.1  1.2  1.3  1.4
       1.1  1.2  1.3  1.4", header = T)

Data_2

read.table(text = "A1    B1    C1  
       1    2    3   
       1.0  2.2  4.3 
       1.0  2.2  4.3 
       1.0  2.2  4.3 
       1.0  2.2  4.3 
       1.0  2.2  4.3 ", header = T)

Output

Only Data_1 changes as it has one extra row that isn't in Data_2 based on row 2 values

read.table(text = "A    B    C
       1    2    3 
       1.1  1.2  1.3
       1.1  1.2  1.3
       1.1  1.2  1.3
       1.1  1.2  1.3
       1.1  1.2  1.3", header = T)

Upvotes: 1

Views: 724

Answers (1)

glaucon
glaucon

Reputation: 238

Your question is somewhat unclear. However, if you are simply trying to subset data_1 based on one row from data_2 the code below will work. In other words, the code below produces your desired "output" data frame.

data_1 <- read.table(text = "A    B    C    D
                             1    2    3    4
                             1.1  1.2  1.3  1.4
                             1.1  1.2  1.3  1.4
                             1.1  1.2  1.3  1.4
                             1.1  1.2  1.3  1.4
                             1.1  1.2  1.3  1.4", header = T)

data_2 <- read.table(text = "A1    B1    C1  
                             1    2    3   
                             1.0  2.2  4.3 
                             1.0  2.2  4.3 
                             1.0  2.2  4.3 
                             1.0  2.2  4.3 
                             1.0  2.2  4.3 ", header = T)

output <- data_1[data_1[1,] %in% data_2[1,]]

    A   B   C
1 1.0 2.0 3.0
2 1.1 1.2 1.3
3 1.1 1.2 1.3
4 1.1 1.2 1.3
5 1.1 1.2 1.3
6 1.1 1.2 1.3

Upvotes: 1

Related Questions