Siroros Roongdonsai

Reputation: 31

tm_map gives an error in R

This is my first time doing Twitter analytics.

    # Search data from Twitter
    library("twitteR")
    SearchData <- searchTwitter("Bruno Mars", n = 1000, lang = "en")
    SearchData

    # Scrape the user timeline
    userTimeline("BrunoMars", n = 100, maxID = NULL, excludeReplies = FALSE, includeRts = FALSE)

    class(SearchData)
    head(SearchData)

    # Clean the data
    library(NLP)
    library(tm)

    TweetList <- sapply(SearchData, function(x) x$getText())

    TweetList <- TweetList[!is.na(TweetList)]
    TweetCorpus <- Corpus(VectorSource(TweetList))
    TweetCorpus <- iconv(TweetCorpus, to = "utf-8")

    # Remove punctuation and numbers, then change data to lower case
    TweetCorpus <- tm_map(TweetCorpus, removePunctuation)
    TweetCorpus <- tm_map(TweetCorpus, removeNumbers)
    TweetCorpus <- tm_map(TweetCorpus, tolower)

Each of my last three lines fails with this error:

    Error in UseMethod("tm_map", x) : no applicable method for 'tm_map' applied to an object of class "character"

I tried to fix the problem myself by wrapping removePunctuation, removeNumbers and tolower in content_transformer, but I still get the same error. I have been working on this issue for a few days without solving it, so I would appreciate any suggestions or advice.

Thanks so much, Ros

Upvotes: 0

Views: 1818

Answers (2)

Lorenzo Benassi

Reputation: 621

Since version 0.6, tm no longer lets you pass functions that operate on plain character values directly to tm_map. So the problem is your tolower step, since that isn't a "canonical" transformation (see getTransformations()). Just replace it with

    TweetCorpus <- tm_map(TweetCorpus, content_transformer(tolower))

The content_transformer function wrapper will convert everything to the correct data type within the corpus. You can use content_transformer with any function that is intended to manipulate character vectors so that it will work in a tm_map pipeline.
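
As a rough illustration of wrapping an arbitrary character function, here is a minimal sketch. The helper removeURLs and the sample text are made up for this example and are not part of tm or of the question; VCorpus is used only to keep the example self-contained.

```r
library(tm)

# Hypothetical helper (not part of tm): strips anything that looks like a URL.
removeURLs <- function(x) gsub("http\\S+", "", x)

# Made-up sample document standing in for a tweet.
corpus <- VCorpus(VectorSource("Check THIS out http://example.com"))

# Plain character functions are wrapped in content_transformer()
# before being handed to tm_map().
corpus <- tm_map(corpus, content_transformer(removeURLs))
corpus <- tm_map(corpus, content_transformer(tolower))

# Pull the cleaned text back out of the first document.
result <- content(corpus[[1]])
```

After these steps, result should hold the lower-cased text with the URL removed, while corpus is still a corpus object that further tm_map calls can accept.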

Upvotes: 1

David Robinson

Reputation: 78600

tm_map has to be applied to a Corpus object, not a character vector. But iconv turns your TweetCorpus object from a Corpus back into a character vector.

To fix this, switch the order of your pre-processing, so that you use iconv before you turn the tweets into a Corpus object:

    TweetList <- c("hello", "world", "Hooray", "yep")
    TweetList <- iconv(TweetList, to = "utf-8")
    TweetCorpus <- Corpus(VectorSource(TweetList))
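
Putting the reordering together with the cleaning steps from the question, the whole pipeline might look like the sketch below. The sample tweets are invented stand-ins for the output of sapply(SearchData, function(x) x$getText()), and wrapping tolower in content_transformer is an extra precaution taken from the other answer, not something this fix strictly requires.

```r
library(tm)

# Made-up stand-in tweets; in the question these come from twitteR.
TweetList <- c("Hello World 123!", "Bruno Mars: 24K Magic", NA)

# Work on the character vector first: drop NAs, then convert the encoding.
TweetList <- TweetList[!is.na(TweetList)]
TweetList <- iconv(TweetList, to = "UTF-8")
stopifnot(is.character(TweetList))  # iconv returns a character vector

# Only now build the corpus, so tm_map receives a Corpus object.
TweetCorpus <- Corpus(VectorSource(TweetList))

TweetCorpus <- tm_map(TweetCorpus, removePunctuation)
TweetCorpus <- tm_map(TweetCorpus, removeNumbers)
TweetCorpus <- tm_map(TweetCorpus, content_transformer(tolower))
```

Because iconv runs before Corpus(), TweetCorpus stays a corpus through every tm_map call, so the "no applicable method" error should not come up.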

Upvotes: 0
