Reputation: 715
I am new to R as well as to the tm package. My task is to perform text document classification using decision trees. I am following someone else's project, which has the full code on page 14. There are two types of documents, which I loaded using DirSource without any problems. My next step was to merge these corpora into a single collection:
# Merge corpora into one collection
docs <- c(wheat.train, crude.train, wheat.test, crude.test)
Then I would like to do some pre-processing:
#pre-processing
docs.p <- docs
docs.p <- tm_map(docs.p, stripWhitespace)
But I got this error:
Error in UseMethod("tm_map", x) :
no applicable method for 'tm_map' applied to an object of class "list"
I understand that the author was using an earlier version of tm, and that tm_map currently takes a single corpus as its argument, not a collection of corpora. My question is: how can I create a collection of corpora on which pre-processing can be performed?
Upvotes: 0
Views: 1786
Reputation: 534
It worked for me using list instead of c, and then lapply.
library(tm)
ex1 <- "bla bla blah "
ex2 <- "dunno what else to say "
wheat <- Corpus(VectorSource(ex1))
crude <- Corpus(VectorSource(ex2))
# Keep the corpora in a plain list, then apply tm_map to each one
docs <- list(wheat, crude)
docs.p <- lapply(docs, tm_map, stripWhitespace)
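If a single merged corpus is needed rather than a list, one possible sketch is to build the corpora with VCorpus explicitly, since tm provides a c() method for VCorpus objects. This assumes a reasonably recent version of tm. The example strings and the names merged and merged.p are made up for illustration and do not come from the question.

```r
library(tm)

# Illustrative inputs with extra whitespace (not the asker's data)
ex1 <- "bla   bla  blah "
ex2 <- "dunno  what else   to say "

# Build VCorpus objects explicitly; c() has a method for this class
wheat <- VCorpus(VectorSource(ex1))
crude <- VCorpus(VectorSource(ex2))

# Merge into one corpus; the result should itself be a VCorpus, not a list
merged <- c(wheat, crude)
class(merged)

# tm_map now dispatches on the corpus and processes every document
merged.p <- tm_map(merged, stripWhitespace)
as.character(merged.p[[1]])
```

With corpora loaded from directories the same idea should apply: pass each DirSource to VCorpus, then combine the results with c(). Checking class(docs) before calling tm_map is a quick way to see whether the merge produced a corpus or fell back to a plain list.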
Upvotes: 1