Reputation: 940
I just built a function that takes two strings (a text and a comma-separated set of keywords) and counts how many of the keywords appear in the text, if any. I have been trying to apply it to a data frame, with no success.
The function itself works on a single pair of strings:
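something <- function(text, keywords) {
  # Split the comma-separated keywords into a character vector
  kw <- unlist(strsplit(keywords, ","))
  count <- 0
  # Loop over every keyword (seq_along), not just the last index
  for (i in seq_along(kw)) {
    # Count the keyword when it is found in the text
    if (grepl(kw[i], text)) {
      count <- count + 1
    }
  }
  return(count)
}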
For example, if I input:
> something("this planetarium is the shit","planetarium,amazing")
[1] 1
But what if my data were in a data frame, df:
keyword text_clean
1 planetarium Man this planetarium is the shit
2 musee,africain rt lyonmangels reste encore places franceangels tour lyon organisons investissons pme
My expected output is:
df.1
1 1
2 0
Any insight? I was trying this code:
substng<-function(text, keywords){
vector = laply(text,function(text,keywords){
kw = unlist(strsplit(keywords, ","))
c=0
for(i in length(kw)){
if(grepl(kw[i],text)==0){
c=c+1
} else {c}
}
return(c)
})
vector.df= as.data.frame(vector)
}
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "keyword text_clean
planetarium 'Man this planetarium is the shit'
musee,africain 'rt lyonmangels reste encore places franceangels tour lyon organisons investissons pme'")
df$count = substng(df$text_clean,df$keyword)
Upvotes: 0
Views: 130
Reputation: 66
I think stri_count in the stringi package can accomplish this.
Use a pattern such as "planetarium|amazing" as the regex. The pipe (|) means "or".
https://cran.r-project.org/web/packages/stringi/stringi.pdf
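As a rough sketch of that suggestion applied to the data frame from the question: the comma-separated keywords are turned into an "or" regex and passed to stri_count_regex, which is vectorized over both the text and the pattern. The names patterns, count_keywords and the distinct column are made up for illustration, and the sketch assumes the keywords contain no regex metacharacters. Note that counting regex matches counts every occurrence, so a keyword that appears twice in a text is counted twice; the base R variant at the end counts distinct keywords instead, which may be closer to what the question asks for.

```r
library(stringi)

df <- data.frame(
  keyword = c("planetarium", "musee,africain"),
  text_clean = c(
    "Man this planetarium is the shit",
    "rt lyonmangels reste encore places franceangels tour lyon organisons investissons pme"
  ),
  stringsAsFactors = FALSE
)

# Turn "a,b" into the regex "a|b"
# (assumption: keywords contain no regex metacharacters)
patterns <- gsub(",", "|", df$keyword, fixed = TRUE)

# Vectorized over both arguments: row i of the text is matched against pattern i.
# This counts every occurrence, so repeated keywords are counted each time.
df$count <- stri_count_regex(df$text_clean, patterns)

# Base R alternative (hypothetical helper): number of distinct keywords found
count_keywords <- function(text, keywords) {
  kw <- strsplit(keywords, ",", fixed = TRUE)[[1]]
  sum(vapply(kw, grepl, logical(1), x = text, fixed = TRUE))
}
df$distinct <- unname(mapply(count_keywords, df$text_clean, df$keyword))
```

For the two example rows, both columns should agree, since no keyword is repeated within a text.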
Upvotes: 0