Reputation: 4229
How would I check whether a particular word is contained in the text of a field in my dataset using R?
In SQL, we can use the LIKE comparison operator. For example,
SELECT * FROM schools WHERE name LIKE '%Public School%'
If I had to do the same thing in R, how would I do that?
Upvotes: 4
Views: 12379
Reputation: 890
I think the following might answer the question in a simple way.
It is a merge of the %in% and %like% functions:
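# TRUE where an element of namevec1 is a prefix of (or equal to) an element of namevec2
`%inlike%` <- function(namevec1, namevec2) {
  # splitting on " " also drops trailing whitespace from single-word elements
  temp1 <- strsplit(namevec1, " ")
  temp2 <- strsplit(namevec2, " ")
  # charmatch() returns NA when there is no partial match at all
  !is.na(charmatch(temp1, temp2))
}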
namevec1<-c("ffd","ff","hello_world")
namevec2<-c("ffde","ff ","hello_wor")
namevec1 %inlike% namevec2
[1] TRUE TRUE FALSE
namevec2 %inlike% namevec1
[1] FALSE TRUE TRUE
(please note the white space difference)
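If a plain "contains" test is closer to what SQL's LIKE '%...%' does, a small infix operator can be sketched on top of grepl(). The operator name %contains% is made up for this illustration; it is not part of base R or any package. Unlike the charmatch() version above, this one looks for the text anywhere in each element, not only at the start.

```r
# Hypothetical operator: TRUE where `pattern` occurs anywhere inside an element of x.
# fixed = TRUE treats the pattern as literal text rather than a regular expression.
`%contains%` <- function(x, pattern) grepl(pattern, x, fixed = TRUE)

result <- c("ffde", "ff ", "hello_wor") %contains% "ff"
print(result)
```

The call returns one logical value per element of the left-hand vector, so it can be used directly as a row index.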
Upvotes: 0
Reputation: 109874
The qdap package has a convenience wrapper for agrep that allows you to search all fields in a data frame or specific fields:
schools <- data.frame(
  rank = 1:20,
  schools = rep(c("X Public School", "Y Private School"), 10)
)
library(qdap)
Search(schools, "Public School", "schools")
## rank schools
## 1 1 X Public School
## 3 3 X Public School
## 5 5 X Public School
## 7 7 X Public School
## 9 9 X Public School
## 11 11 X Public School
## 13 13 X Public School
## 15 15 X Public School
## 17 17 X Public School
## 19 19 X Public School
Upvotes: 0
Reputation: 307
In base R one can use %in% to subset rows whose value exactly matches one of a set of values, e.g. dataframe[dataframe$variable %in% dataframe2$variable2, ]
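A minimal sketch of how %in% behaves next to a substring test, assuming a small schools data frame like the one used elsewhere in this thread (the four-row version here is only for illustration):

```r
schools <- data.frame(
  rank = 1:4,
  name = rep(c("X Public School", "Y Private School"), 2),
  stringsAsFactors = FALSE
)

# %in% compares whole values: no name is exactly "Public School", so no rows come back
exact <- schools[schools$name %in% "Public School", ]

# grepl() looks for the text anywhere inside each value
partial <- schools[grepl("Public School", schools$name, fixed = TRUE), ]

print(nrow(exact))
print(partial)
```

So %in% is the counterpart of SQL's IN, while grepl() is the closer match for LIKE '%...%'.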
Upvotes: 0
Reputation: 269664
Given
schools <- data.frame(rank = 1:20,
                      name = rep(c("X Public School", "Y Private School"), 10))
try this:
subset(schools, grepl("Public School", name))
or this:
schools[ grep("Public School", schools$name), ]
or this:
library(sqldf)
sqldf("SELECT * FROM schools WHERE name LIKE '%Public School%'")
or this:
library(data.table)
data.table(schools)[ grep("Public School", name) ]
or this:
library(dplyr)
schools %>% filter(grepl("Public School", name))
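One difference from LIKE to keep in mind: grepl() treats its pattern as a regular expression and is case-sensitive by default. A minimal sketch, using a made-up five-row variant of the data (the extra rows are assumptions added only to make the difference visible):

```r
schools <- data.frame(
  rank = 1:5,
  name = c("X Public School", "Y Private School", "x public school",
           "P.S. 12", "PAST 12"),
  stringsAsFactors = FALSE
)

# Default: "." is a regex wildcard, so "P.S." also matches "PAST"
as_regex <- subset(schools, grepl("P.S.", name))

# fixed = TRUE matches the text literally, with no need to escape metacharacters
literal <- subset(schools, grepl("P.S.", name, fixed = TRUE))

# ignore.case = TRUE gives a case-insensitive regex match
any_case <- subset(schools, grepl("public school", name, ignore.case = TRUE))

print(as_regex$rank)
print(literal$rank)
print(any_case$rank)
```

The same fixed and ignore.case arguments are accepted by grep(), so they carry over to the other variants above.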
Upvotes: 10