Reputation: 942

Keep rows with same column characters as other df in dplyr

I would like to create a data frame that keeps only the rows of df whose 'column' values also appear in df2:

df <- data.frame(column=c("Obs1", "Obs2", "Obs3", "Obs4", "Obs5"), S1 = c(1.4,1.9,1.3,2,1), S2= c(0.8,2,1,3,4), S3=c(2.4,2.1,0.5,2,3), S4=c(0.5,0.6,0.9,4,5), S5=c(1.4,1.3,1.6,3,3))

df2<- data.frame(column=c("Obs2", "Obs4"), X = c(1.4,1.9), Y= c(0.8,2))

Something like this:

library(dplyr)
df3 <- df %>% # keep the rows whose df$column values are found in df2$column

The final output should be:

  column  S1  S2  S3  S4  S5
1   Obs2 1.9 2.0 2.1 0.6 1.3
2   Obs4 2.0 3.0 2.0 4.0 3.0

Upvotes: 0

Answers (5)

akrun

Reputation: 887108

An option with base R

df[with(df, column %in% df2$column),]

Upvotes: 1

Agaz Wani

Reputation: 5684

Using match

df[!is.na(match(df$column, df2$column)), ]

or grepl

df[grepl(paste(df2$column, collapse = "|"), df$column), ]

  column  S1 S2  S3  S4  S5
2   Obs2 1.9  2 2.1 0.6 1.3
4   Obs4 2.0  3 2.0 4.0 3.0
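
One caveat with the grepl variant, sketched here with a hypothetical extra label Obs20 that is not in the original data: the pattern is an unanchored regular expression, so it would also match labels that merely contain Obs2. Anchoring the pattern, or using %in%, should keep the match exact.

```r
# Hypothetical labels: "Obs20" is added only to illustrate partial matching
labels <- c("Obs1", "Obs2", "Obs20", "Obs4")
wanted <- c("Obs2", "Obs4")

# Unanchored pattern "Obs2|Obs4" also matches "Obs20"
loose <- grepl(paste(wanted, collapse = "|"), labels)

# Anchored pattern "^(Obs2|Obs4)$" matches whole labels only
strict <- grepl(paste0("^(", paste(wanted, collapse = "|"), ")$"), labels)

# %in% compares whole strings, so it agrees with the anchored pattern
exact <- labels %in% wanted
```

With the data in the question every label is distinct enough that all three give the same rows, so this only matters for data where one label is a substring of another.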

Upvotes: 0

Duck

Reputation: 39595

With base R:

#Code
df3 <- df[df$column %in% df2$column,]

Output:

  column  S1 S2  S3  S4  S5
2   Obs2 1.9  2 2.1 0.6 1.3
4   Obs4 2.0  3 2.0 4.0 3.0

Or using subset():

#Code2
df3 <- subset(df,column %in% df2$column)

Output:

  column  S1 S2  S3  S4  S5
2   Obs2 1.9  2 2.1 0.6 1.3
4   Obs4 2.0  3 2.0 4.0 3.0

Upvotes: 1

Karthik S

Reputation: 11584

Using filter:

library(dplyr)
df %>% filter(column %in% df2$column)
  column  S1 S2  S3  S4  S5
1   Obs2 1.9  2 2.1 0.6 1.3
2   Obs4 2.0  3 2.0 4.0 3.0

Upvotes: 1

Gregor Thomas

Reputation: 145775

This is called a semi-join:

library(dplyr)
df %>% semi_join(df2, by = "column")
#   column  S1 S2  S3  S4  S5
# 1   Obs2 1.9  2 2.1 0.6 1.3
# 2   Obs4 2.0  3 2.0 4.0 3.0

In case it's useful, the opposite is called an anti-join (dplyr has an anti_join function for it), which keeps only the rows that don't match.
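
A minimal sketch of the anti-join, using a cut-down version of the question's data (only the S1 column is kept, for brevity):

```r
library(dplyr)

# Cut-down copies of the question's data frames
df <- data.frame(column = c("Obs1", "Obs2", "Obs3", "Obs4", "Obs5"),
                 S1 = c(1.4, 1.9, 1.3, 2, 1))
df2 <- data.frame(column = c("Obs2", "Obs4"), X = c(1.4, 1.9))

# Keep only the rows of df whose column value does NOT appear in df2
unmatched <- df %>% anti_join(df2, by = "column")
```

Here unmatched should hold the Obs1, Obs3 and Obs5 rows, with only the columns of df.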

Joining is more flexible than filtering because it also works when the rows have to match on several columns at once.
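
As a rough illustration of the multi-column case, here is a sketch with made-up data; the obs and keys data frames and the site column are hypothetical and not part of the question:

```r
library(dplyr)

# Hypothetical data: each row is identified by the pair (site, column)
obs <- data.frame(site   = c("A", "A", "B", "B"),
                  column = c("Obs1", "Obs2", "Obs1", "Obs2"),
                  S1     = c(1.4, 1.9, 1.3, 2.0))
keys <- data.frame(site   = c("A", "B"),
                   column = c("Obs2", "Obs1"))

# A row is kept only if its (site, column) pair appears in keys
kept <- obs %>% semi_join(keys, by = c("site", "column"))
```

Filtering each column separately with %in% would keep all four rows of obs here, because every site and every column value appears somewhere in keys; the join matches the pairs instead.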

Upvotes: 1
