Reputation: 942
I would like to keep only the rows of df whose 'column' values also appear in df2$column.
df <- data.frame(column=c("Obs1", "Obs2", "Obs3", "Obs4", "Obs5"), S1 = c(1.4,1.9,1.3,2,1), S2= c(0.8,2,1,3,4), S3=c(2.4,2.1,0.5,2,3), S4=c(0.5,0.6,0.9,4,5), S5=c(1.4,1.3,1.6,3,3))
df2<- data.frame(column=c("Obs2", "Obs4"), X = c(1.4,1.9), Y= c(0.8,2))
Something like this:
library(dplyr)
df3 <- df %>% # keep the rows whose df$column values are found in df2$column
To a final output:
column S1 S2 S3 S4 S5
1 Obs2 1.9 2.0 2.1 0.6 1.3
2 Obs4 2.0 3.0 2.0 4.0 3.0
Upvotes: 0
Views: 50
Reputation: 5684
Using match
df[!is.na(match(df$column, df2$column)), ]
or grepl
df[grepl(paste(df2$column, collapse = "|"), df$column), ]
column S1 S2 S3 S4 S5
2 Obs2 1.9 2 2.1 0.6 1.3
4 Obs4 2.0 3 2.0 4.0 3.0
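One caveat with the grepl approach: the pattern is a regular expression, so it matches substrings rather than whole values. The following is a minimal sketch of that difference, assuming a hypothetical extra label "Obs20" that does not appear in the question's data.

```r
# Hypothetical labels: "Obs20" is added only to illustrate the caveat.
labels <- c("Obs1", "Obs2", "Obs20", "Obs4")
wanted <- c("Obs2", "Obs4")

# Regex matching is partial, so "Obs20" is matched by the "Obs2" alternative.
partial <- labels[grepl(paste(wanted, collapse = "|"), labels)]

# Anchoring the pattern restricts it to whole-string matches.
anchored <- labels[grepl(paste0("^(", paste(wanted, collapse = "|"), ")$"), labels)]

# match() and %in% compare whole values, so they need no anchoring.
exact <- labels[labels %in% wanted]
```

With the question's data the three approaches agree, since no label is a substring of another.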
Upvotes: 0
Reputation: 39595
With base R:
#Code
df3 <- df[df$column %in% df2$column,]
Output:
column S1 S2 S3 S4 S5
2 Obs2 1.9 2 2.1 0.6 1.3
4 Obs4 2.0 3 2.0 4.0 3.0
Or using subset():
#Code2
df3 <- subset(df,column %in% df2$column)
Output:
column S1 S2 S3 S4 S5
2 Obs2 1.9 2 2.1 0.6 1.3
4 Obs4 2.0 3 2.0 4.0 3.0
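Base R subsetting keeps the original row names (2 and 4), whereas the output requested in the question is numbered 1 and 2. If that matters, one possible way to renumber is to reset the row names afterwards. This is a sketch using the question's data:

```r
df <- data.frame(column = c("Obs1", "Obs2", "Obs3", "Obs4", "Obs5"),
                 S1 = c(1.4, 1.9, 1.3, 2, 1), S2 = c(0.8, 2, 1, 3, 4),
                 S3 = c(2.4, 2.1, 0.5, 2, 3), S4 = c(0.5, 0.6, 0.9, 4, 5),
                 S5 = c(1.4, 1.3, 1.6, 3, 3))
df2 <- data.frame(column = c("Obs2", "Obs4"), X = c(1.4, 1.9), Y = c(0.8, 2))

# Keep the rows of df whose column value appears in df2$column.
df3 <- df[df$column %in% df2$column, ]

# Drop the inherited row names so the rows are numbered from 1 again.
rownames(df3) <- NULL
```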
Upvotes: 1
Reputation: 11584
Using filter:
library(dplyr)
df %>% filter(column %in% df2$column)
column S1 S2 S3 S4 S5
1 Obs2 1.9 2 2.1 0.6 1.3
2 Obs4 2.0 3 2.0 4.0 3.0
Upvotes: 1
Reputation: 145775
This is called a semi-join:
df %>% semi_join(df2, by = "column")
# column S1 S2 S3 S4 S5
# 1 Obs2 1.9 2 2.1 0.6 1.3
# 2 Obs4 2.0 3 2.0 4.0 3.0
In case it's useful, the opposite is called an anti-join (with a corresponding anti_join function), which keeps only the rows that don't match.
Joining is more flexible than filtering with %in% because it can match on several columns at once.
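As a rough illustration of matching on more than one column, here is a sketch with made-up data. The `group` column and the `obs` and `keys` data frames are hypothetical and not part of the question.

```r
library(dplyr)

# Hypothetical data: the `group` column is not in the original question.
obs <- data.frame(
  column = c("Obs1", "Obs2", "Obs2", "Obs4"),
  group  = c("a", "a", "b", "b"),
  S1     = c(1.4, 1.9, 1.3, 2.0)
)
keys <- data.frame(column = c("Obs2", "Obs4"), group = c("a", "b"))

# Keep the rows of obs whose (column, group) pair appears in keys.
kept <- obs %>% semi_join(keys, by = c("column", "group"))

# Keep the rows of obs whose (column, group) pair does not appear in keys.
dropped <- obs %>% anti_join(keys, by = c("column", "group"))
```

Filtering on `column %in% keys$column` alone would also keep the ("Obs2", "b") row, because it ignores `group`; the semi-join drops it.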
Upvotes: 1