Edy Ashton

Reputation: 45

How to remove rows from one DataFrame based on rows from another DataFrame?

I have two data frames. I would like to match their row names and, where a row name has no match in the other data frame, remove that row, so that both data frames end up with the same number of rows.

Dataframe A:

5002T  0 1
B01167 1 0
C1329  1 0

dim(3, 3)

Dataframe B:

5002T  4
B41167 3 
C1329  8 
2C134  1 

dim(4, 2)

The output should look like this:

Dataframe A_New:

5002T  0 1
C1329  1 0

dim(2, 3)

Dataframe B_New:

5002T  4 
C1329  8 

dim(2, 2)

Here is the code which I have tried to match the rownames:

match <- rownames(A) %in% rownames(B)
sum(!match)

How can I remove the unmatched rows from both data frames so that they have the same number of rows?

Upvotes: 2

Views: 303

Answers (1)

jay.sf

Reputation: 72593

You were close.

dfa[rownames(dfa) %in% rownames(dfb), ]
#        V2 V3
# X5002T  0  1
# C1329   1  0

dfb[rownames(dfb) %in% rownames(dfa), , drop=FALSE]
#        V2
# X5002T  4
# C1329   8

However, it might be easier to use intersect().

mtch <- intersect(rownames(dfa), rownames(dfb))

dfa[mtch, ]
#        V2 V3
# X5002T  0  1
# C1329   1  0

dfb[mtch, , drop=FALSE]
#        V2
# X5002T  4
# C1329   8

Note: drop=FALSE is needed for dfb because it has only one column; without it, R would coerce the one-column result into a vector.
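
To make the effect of drop=FALSE concrete, here is a minimal sketch. It rebuilds dfb with data.frame() so the snippet runs on its own; the helper name keep is only for illustration.

```r
# One-column data frame with the same values as dfb in this answer
dfb <- data.frame(V2 = c(4L, 3L, 8L, 1L),
                  row.names = c("X5002T", "B41167", "C1329", "X2C134"))
keep <- c("X5002T", "C1329")  # illustrative name for the matched row names

# Default behaviour: the single column is simplified to a plain vector,
# and the row names are lost
v <- dfb[keep, ]
is.data.frame(v)  # FALSE

# drop = FALSE keeps the data frame structure and the row names
d <- dfb[keep, , drop = FALSE]
is.data.frame(d)  # TRUE
rownames(d)       # "X5002T" "C1329"
```

For dfa this is not an issue, since it has two columns and row subsetting returns a data frame either way.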


Data:

dfa <- structure(list(V2 = c(0L, 1L, 1L), V3 = c(1L, 0L, 0L)), row.names = c("X5002T", 
"B01167", "C1329"), class = "data.frame")

dfb <- structure(list(V2 = c(4L, 3L, 8L, 1L)), row.names = c("X5002T", 
"B41167", "C1329", "X2C134"), class = "data.frame")
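
One practical difference between the two approaches is worth keeping in mind: %in% filters each data frame in its own row order, while indexing with the intersect() result puts both in the same order. In the example data the shared rows already appear in the same order, so both give the same result. The sketch below uses hypothetical data frames x and y (not from the question) whose shared row names are ordered differently.

```r
# Hypothetical data: shared row names "a" and "c" appear in different orders
x <- data.frame(v = 1:3, row.names = c("a", "b", "c"))
y <- data.frame(w = 4:6, row.names = c("c", "a", "d"))

# %in% filters each data frame in place, so the row orders can differ
x_in <- x[rownames(x) %in% rownames(y), , drop = FALSE]
y_in <- y[rownames(y) %in% rownames(x), , drop = FALSE]
rownames(x_in)  # "a" "c"
rownames(y_in)  # "c" "a"

# Indexing by the intersect() result aligns both to the order found in x
common <- intersect(rownames(x), rownames(y))
x_al <- x[common, , drop = FALSE]
y_al <- y[common, , drop = FALSE]
identical(rownames(x_al), rownames(y_al))  # TRUE
```

If the rows need to line up one-to-one afterwards (for example before cbind()), the intersect() version is probably the safer choice.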

Upvotes: 2
