Reputation: 161
I need to extract whole sentences that begin with a specific word in R. Below is the code I am trying to use, but it does not give the desired result. I am new to regular expressions in R. I want to extract the sentences that begin with the word 'database'.
sent <- c("database connection","connection database fail", "fail connection database","database connection is good")
m <- gregexpr('database.*', sent)
regmatches(sent, m)
The above code returns the word 'database' and everything after it, even for sentences that do not start with it. But my desired output is:
"database connection", "database connection is good"
Thanks for your help!
Upvotes: 0
Views: 1163
Reputation: 6542
With stringr:
sent <- c("database connection","connection database fail", "fail connection database","database connection is good")
stringr::str_subset(sent, "^database.*")
#> [1] "database connection" "database connection is good"
With base R:
sent <- c("database connection","connection database fail", "fail connection database","database connection is good")
grep("^database.*", sent, value = TRUE)
#> [1] "database connection" "database connection is good"
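One caveat: `^database` also matches strings that merely start with those letters, such as "databases are slow". Below is a minimal sketch, assuming a whole-word match is wanted. The `sent2` vector is a made-up example and not part of the question.

```r
# Hypothetical input with a near-miss ("databases") added for illustration
sent2 <- c("database connection", "databases are slow", "connection database fail")

# \\b is a word boundary, so the match must end right after the final "e"
whole_word <- grep("^database\\b", sent2, value = TRUE)
whole_word
```

If prefixes like "databases" should count as well, the plain `^database` pattern is enough.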
Upvotes: 3
Reputation: 7308
You're not anchoring the regex to the start of the string. If you use the start anchor (^), you'll get the desired result. Here is what your code should look like:
sent <- c("database connection","connection database fail", "fail connection database","database connection is good")
m <- gregexpr('^database.*', sent)
regmatches(sent, m)
If you want to remove the character(0) elements from the result, you can replace the last line with:
r <- regmatches(sent, m)
# lengths() gives the length of each list element; keep the non-empty ones
r <- r[lengths(r) > 0]
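If a plain character vector is wanted, as in the question's desired output, one option is to flatten the list with `unlist()`. This is a sketch of that variant rather than part of the original answer; `result` is just an illustrative name.

```r
sent <- c("database connection", "connection database fail",
          "fail connection database", "database connection is good")
m <- gregexpr("^database.*", sent)

# regmatches() returns a list with one element per input string;
# unlist() flattens it, and the empty character(0) entries contribute nothing
result <- unlist(regmatches(sent, m))
result
```

Because `^` anchors at the start of each string, every input yields at most one match, so the flattened vector has one entry per matching sentence.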
Upvotes: 1