Reputation: 83
I am creating a document-term matrix with the code below. Building the matrix works fine, but when I call removeSparseTerms or findFreqTerms on it, I get an error.
text <- c("Since I love to travel, this is what I rely on every time.",
          "I got this card for the no international transaction fee",
          "I got this card mainly for the flight perks",
          "Very good card, easy application process",
          "The customer service is outstanding!")
library(tm)
corpus <- Corpus(VectorSource(text))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, stripWhitespace)
dtm <- as.matrix(DocumentTermMatrix(corpus))
Here is the result:
Docs application card customer easy every ... etc.
   1           0    0        0    1     0
   2           0    1        0    0     1
   3           0    1        0    0     0
   4           1    1        0    0     0
   5           0    0        1    0     0
The error occurs when I call either removeSparseTerms or findFreqTerms:
sparse <- removeSparseTerms(dtm, 0.80)
freq <- findFreqTerms(dtm, 2)
Result:
Error: inherits(x, c("DocumentTermMatrix", "TermDocumentMatrix")) is not TRUE
Upvotes: 0
Views: 2440
Reputation: 23738
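removeSparseTerms and findFreqTerms expect a DocumentTermMatrix or TermDocumentMatrix object, not a plain matrix. Wrapping the call in as.matrix() drops that class, which is why the inherits() check in the error message fails.
Create the DocumentTermMatrix without converting it to a matrix and you won't get the error.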
dtm <- DocumentTermMatrix(corpus)
sparse <- removeSparseTerms(dtm, 0.80)
freq <- findFreqTerms(dtm, 2)
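If a plain matrix is still needed for inspection, one option is to keep the DocumentTermMatrix object for the tm functions and convert only at the end. The following is a minimal sketch that reuses the data and preprocessing from the question; the variable name m is just an illustrative choice, and it assumes the tm package is installed.

```r
library(tm)

text <- c("Since I love to travel, this is what I rely on every time.",
          "I got this card for the no international transaction fee",
          "I got this card mainly for the flight perks",
          "Very good card, easy application process",
          "The customer service is outstanding!")

# Same preprocessing as in the question
corpus <- Corpus(VectorSource(text))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, stripWhitespace)

# Keep the DocumentTermMatrix object; do not wrap it in as.matrix() yet
dtm <- DocumentTermMatrix(corpus)
inherits(dtm, "DocumentTermMatrix")  # this is the check that failed before

# Both functions accept the DocumentTermMatrix
sparse <- removeSparseTerms(dtm, 0.80)
freq <- findFreqTerms(dtm, 2)

# Convert to a plain matrix only once the tm-specific steps are done
# (m is a hypothetical name used for illustration)
m <- as.matrix(sparse)
inherits(m, "DocumentTermMatrix")    # the class is gone after conversion
```

Converting last means the sparse representation is kept for as long as possible, which matters more on larger corpora than on this five-document example.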
Upvotes: 5