Reputation: 1179
I have a text, e.g.:
text<- "i am happy today :):)"
I want to extract :) from the text vector and report its frequency.
Upvotes: 3
Views: 1301
Reputation: 47551
I assume you only want the count, or do you also want to remove :)
from the string?
For the count you can do:
length(gregexpr(":)",text)[[1]])
which gives 2. A more generalized solution for a vector of strings is:
sapply(gregexpr(":)",text),length)
Josh O'Brien pointed out that this also returns 1 if there is no :) in the string, since gregexpr returns -1 in that case. To fix this you can use:
sapply(gregexpr(":)",text),function(x)sum(x>0))
This is slightly less pretty, though.
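Another way to avoid the -1 special case, sketched here as an assumption rather than part of the original answer, is to extract the matches with regmatches() and count them with lengths() (available in R 3.2.0 and later). The names smiley_text, smiley_matches and smiley_counts are made up for illustration, and fixed = TRUE is added so that ":)" is treated as a literal string instead of a regular expression:

```r
# Hypothetical example vector: one string with two smileys, one with none.
smiley_text <- c("i am happy today :):)", "no smiley here")

# fixed = TRUE matches ":)" literally, so the parenthesis needs no escaping.
# regmatches() returns character(0) for elements without a match,
# so lengths() reports 0 for them instead of 1.
smiley_matches <- regmatches(smiley_text,
                             gregexpr(":)", smiley_text, fixed = TRUE))
smiley_counts <- lengths(smiley_matches)

print(smiley_counts)
```

This also covers the "extract" part of the question, since smiley_matches holds the matched strings themselves.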
Upvotes: 3
Reputation: 19454
This does the trick but might not be the most direct way:
mytext<- "i am happy today :):)"
# The following line inserts semicolons to split on
myTextSub<-gsub(":)", ";:);", mytext)
# Then split and unlist
myTextSplit <- unlist(strsplit(myTextSub, ";"))
# Then see how many times the smiley turns up
length(grep(":)", myTextSplit))
EDIT
To handle vectors of text with length > 1, don't unlist:
mytext<- rep("i am happy today :):)",2)
myTextSub <- gsub(":\\)", ";:);", mytext)
myTextSplit <- strsplit(myTextSub, ";")
# Escape the parenthesis here too, so both patterns are written the same way
sapply(myTextSplit, function(x) {
  length(grep(":\\)", x))
})
But I like the other answers better.
Upvotes: 1
Reputation: 162321
Here's one idea, which would be easy to generalize:
text<- c("i was happy yesterday :):)",
"i am happy today :)",
"will i be happy tomorrow?")
(nchar(text) - nchar(gsub(":)", "", text))) / 2
# [1] 2 1 0
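One possible generalization, as a sketch only: wrap the idea in a function and divide by the length of the pattern instead of the hard-coded 2. The names count_literal and example_text are hypothetical, and fixed = TRUE is an added assumption so that the pattern is matched as a literal string:

```r
# Count non-overlapping occurrences of a literal string in each element of x.
# count_literal is a hypothetical helper name, not from the original answer.
count_literal <- function(x, pattern) {
  # Remove every occurrence of the pattern, matched literally
  removed <- gsub(pattern, "", x, fixed = TRUE)
  # The drop in character count, divided by the pattern length,
  # is the number of occurrences that were removed
  (nchar(x) - nchar(removed)) / nchar(pattern)
}

example_text <- c("i was happy yesterday :):)",
                  "i am happy today :)",
                  "will i be happy tomorrow?")

print(count_literal(example_text, ":)"))
print(count_literal(example_text, "happy"))
```

Because the pattern is matched literally, the same function should work for other emoticons or words without any escaping.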
Upvotes: 5