lqdo2000

Reputation: 391

R grep search patterns in multiple columns

I have a data frame like the following:

Col1    Col2    Col3
A       B       C
D       E       F
G       H       I

I am trying to keep the rows that match 'B' in 'Col2' OR 'F' in 'Col3', in order to get:

Col1    Col2    Col3
A       B       C
D       E       F

I tried:

data[(grep("B",data$Col2) || grep("F",data$Col3)), ]

but it returns the entire data frame.

NOTE: each of the two grep calls works when used on its own.

Upvotes: 2

Views: 18247

Answers (3)

akrun

Reputation: 887911

Or use a single grepl after pasting the columns together:

df1[with(df1, grepl("B|F", paste(Col2, Col3))),]
#  Col1 Col2 Col3
#1    A    B    C
#2    D    E    F
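
One caveat worth checking before relying on the pasted version: the pattern "B|F" is applied to the combined string, so it no longer distinguishes which column a letter came from. The sketch below uses a hypothetical data frame, df2 (not from the question), in which 'F' appears in Col2 of the third row, to show where the pasted pattern and the per-column test disagree.

```r
# Hypothetical data: same shape as df1, but row 3 has "F" in Col2
df2 <- data.frame(
  Col1 = c("A", "D", "G"),
  Col2 = c("B", "E", "F"),
  Col3 = c("C", "F", "I"),
  stringsAsFactors = FALSE
)

# Pasted approach: "B" or "F" anywhere in "Col2 Col3"
pasted <- with(df2, grepl("B|F", paste(Col2, Col3)))

# Per-column approach: "B" only in Col2, "F" only in Col3
per_col <- with(df2, grepl("B", Col2) | grepl("F", Col3))

pasted   # TRUE TRUE TRUE
per_col  # TRUE TRUE FALSE
```

On the data in the question the two give the same rows, so the difference only matters when a search term can occur in the other column.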

Upvotes: 6

Jaguar

Reputation: 204

The data.table package makes this type of operation trivial due to its compact and readable syntax. Here is how you would perform the above using data.table:

> df1 <- structure(list(Col1 = c("A", "D", "G"), Col2 = c("B", "E", "H"
+ ), Col3 = c("C", "F", "I")), .Names = c("Col1", "Col2", "Col3"
+ ), row.names = c(NA, -3L), class = "data.frame")

> library(data.table)
> DT <- data.table(df1)
> DT
   Col1 Col2 Col3
1:    A    B    C
2:    D    E    F
3:    G    H    I

> DT[Col2 == 'B' | Col3 == 'F']
   Col1 Col2 Col3
1:    A    B    C
2:    D    E    F
> 

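Inside DT[...], column names are evaluated within the data.table itself (much as with() does for a data frame), so Col2 and Col3 can be referenced without the DT$ prefix. Note that == tests for exact equality rather than a pattern match. Lookups can be much faster if you set keys on the data, but that is a separate topic.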
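
Since the question asks about pattern matching rather than exact equality, here is a minimal sketch of the same filter using regular expressions inside data.table. It assumes the data.table package is installed; the names res_grepl and res_like are only illustrative. grepl is base R, and %like% is data.table's pattern-matching operator.

```r
library(data.table)

DT <- data.table(
  Col1 = c("A", "D", "G"),
  Col2 = c("B", "E", "H"),
  Col3 = c("C", "F", "I")
)

# Base R grepl works directly on column names inside DT[...]
res_grepl <- DT[grepl("B", Col2) | grepl("F", Col3)]

# Equivalent filter using data.table's %like% operator
res_like <- DT[Col2 %like% "B" | Col3 %like% "F"]

res_grepl$Col1  # "A" "D"
res_like$Col1   # "A" "D"
```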

Upvotes: 0

Sathish

Reputation: 12723

with(df1, df1[ Col2 == 'B' | Col3 == 'F',])
#   Col1 Col2 Col3
# 1    A    B    C
# 2    D    E    F

Using grepl, which matches a pattern rather than an exact value:

with(df1, df1[ grepl( 'B', Col2) | grepl( 'F', Col3), ])
#   Col1 Col2 Col3
# 1    A    B    C
# 2    D    E    F
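
As a sketch of why the grepl version works while the attempt in the question returned every row: grep returns the integer positions of the matches, and || is the scalar "or", so the two results collapse to a single TRUE that is then recycled across all rows. grepl returns one TRUE/FALSE per row, and | combines those element by element. The variable names below (idx_b, idx_f, whole, keep, result) are only illustrative.

```r
df1 <- data.frame(
  Col1 = c("A", "D", "G"),
  Col2 = c("B", "E", "H"),
  Col3 = c("C", "F", "I"),
  stringsAsFactors = FALSE
)

# grep returns positions, not a logical vector
idx_b <- grep("B", df1$Col2)   # 1
idx_f <- grep("F", df1$Col3)   # 2

# || coerces the non-zero numbers to TRUE and returns a single value
idx_b || idx_f                 # TRUE

# A lone TRUE is recycled, so every row is selected
whole <- df1[idx_b || idx_f, ]
nrow(whole)                    # 3

# grepl gives one logical per row; | is element-wise
keep <- grepl("B", df1$Col2) | grepl("F", df1$Col3)
result <- df1[keep, ]
nrow(result)                   # 2
```

Note that from R 4.3.0 onward, || raises an error when either operand has length greater than one, so the original expression would fail outright whenever a pattern matches more than one row.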

Data:

df1 <- structure(list(Col1 = c("A", "D", "G"), Col2 = c("B", "E", "H"
), Col3 = c("C", "F", "I")), .Names = c("Col1", "Col2", "Col3"
), row.names = c(NA, -3L), class = "data.frame")

Upvotes: 4
