Reputation: 31
This is my first attempt at Twitter analytics.
#Search data from Twitter
library("twitteR")
SearchData = searchTwitter("Bruno Mars", n=1000,lang = 'en')
SearchData
#Scraping Data
userTimeline("BrunoMars", n=100, maxID =NULL, excludeReplies = FALSE, includeRts = FALSE)
class(SearchData)
head(SearchData)
#Cleaning Data
library(NLP)
library(tm)
TweetList <- sapply(SearchData, function(x) x$getText())
TweetList <- (TweetList[!is.na(TweetList)])
TweetCorpus <- Corpus(VectorSource(TweetList))
TweetCorpus <- iconv(TweetCorpus, to ="utf-8")
#Remove punctuation and numbers, then change data to lower case
TweetCorpus <- tm_map(TweetCorpus,removePunctuation)
TweetCorpus <- tm_map(TweetCorpus, removeNumbers)
TweetCorpus <- tm_map(TweetCorpus, tolower)
I get the following error on my last 3 lines:
Error in UseMethod("tm_map", x) : no applicable method for 'tm_map' applied to an object of class "character"
I tried to fix the problem myself by wrapping removePunctuation, removeNumbers and tolower in content_transformer, but I still get the same error. I have been trying to fix this for a few days without success, so any suggestions or advice would be appreciated.
Thanks so much, Ros
Upvotes: 0
Views: 1818
Reputation: 621
The latest version of tm made it so you can't use functions with tm_map that operate on simple character values any more. So the problem is your tolower step, since that isn't a "canonical" transformation (see getTransformations()). Just replace it with
TweetCorpus <- tm_map(TweetCorpus, content_transformer(tolower))
The content_transformer function wrapper will convert everything to the correct data type within the corpus. You can use content_transformer with any function that is intended to manipulate character vectors so that it will work in a tm_map pipeline.
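As a rough illustration of wrapping an arbitrary character-vector function, here is a minimal sketch. The removeURLs helper and the sample tweets are made up for the example, and VCorpus is used instead of Corpus only so that the result has the same structure across tm versions; none of these come from the question.

```r
library(tm)

# Hypothetical helper (not part of tm): drops anything that looks like a URL.
removeURLs <- function(x) gsub("http[^[:space:]]*", "", x)

# Made-up sample data standing in for real tweets.
tweets <- c("Check THIS out http://example.com", "Uptown Funk!")
corpus <- VCorpus(VectorSource(tweets))

# Plain character functions are wrapped in content_transformer()
# before being handed to tm_map().
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, content_transformer(removeURLs))

# Pull the cleaned text back out as an ordinary character vector.
cleaned <- unname(vapply(corpus, as.character, character(1)))
print(cleaned)
```

The same pattern should work for any function that takes a character vector and returns one of the same length.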
Upvotes: 1
Reputation: 78600
tm_map has to be applied to a Corpus object, not a character vector. But iconv turns your TweetCorpus object from a Corpus back into a character vector.
To fix this, switch the order of your pre-processing, so that you use iconv before you turn the tweets into a Corpus object:
TweetList <- c("hello", "world", "Hooray", "yep")
TweetList <- iconv(TweetList, to ="utf-8")
TweetCorpus <- Corpus(VectorSource(TweetList))
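Putting the reordering together with the rest of the cleaning steps from the question, a sketch of the whole pipeline might look like the following. The two sample strings are placeholders for the text extracted from searchTwitter(), and VCorpus is my own choice here (an assumption, used so the output has the same structure across tm versions); Corpus should slot in the same way.

```r
library(tm)

# Placeholder data standing in for the tweet text from searchTwitter().
TweetList <- c("Bruno Mars ROCKS!!! 24K Magic", "Uptown Funk, 2014")

# 1. Fix the encoding while the data is still a plain character vector.
TweetList <- iconv(TweetList, to = "UTF-8")

# 2. Only now build the corpus, so tm_map() receives a Corpus object.
TweetCorpus <- VCorpus(VectorSource(TweetList))

# 3. Apply the transformations; tolower is wrapped in content_transformer().
TweetCorpus <- tm_map(TweetCorpus, removePunctuation)
TweetCorpus <- tm_map(TweetCorpus, removeNumbers)
TweetCorpus <- tm_map(TweetCorpus, content_transformer(tolower))

# Extract the cleaned text to check the result.
cleaned <- unname(vapply(TweetCorpus, as.character, character(1)))
print(cleaned)
```

Because iconv runs before the corpus is created, TweetCorpus stays a Corpus object throughout and every tm_map call has a method to dispatch to.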
Upvotes: 0