djacobs1216

Reputation: 83

R Error: inherits(x, c("DocumentTermMatrix", "TermDocumentMatrix")) is not TRUE

I am creating a document-term matrix with the following code. Creating the matrix works fine, but when I try to remove sparse terms or find frequent terms, I get an error.

text<- c("Since I love to travel, this is what I rely on every time.", 
         "I got this card for the no international transaction fee", 
         "I got this card mainly for the flight perks",
         "Very good card, easy application process",
         "The customer service is outstanding!") 

library(tm)
corpus<- Corpus(VectorSource(text))
corpus<- tm_map(corpus, content_transformer(tolower))
corpus<- tm_map(corpus, removePunctuation)
corpus<- tm_map(corpus, removeWords, stopwords("english"))
corpus<- tm_map(corpus, stripWhitespace)

dtm<- as.matrix(DocumentTermMatrix(corpus))

Here is the result:

Docs    application card    customer    easy    every ... etc.
1       0           0       0           1       0
2       0           1       0           0       1
3       0           1       0           0       0
4       1           1       0           0       0
5       0           0       1           0       0

Here is where I get the error, using either removeSparseTerms or findFreqTerms:

sparse<- removeSparseTerms(dtm, 0.80)
freq<- findFreqTerms(dtm, 2)

Result:

Error: inherits(x, c("DocumentTermMatrix", "TermDocumentMatrix")) is not TRUE

Upvotes: 0

Views: 2440

Answers (1)

CodeMonkey

Reputation: 23738

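removeSparseTerms and findFreqTerms expect a DocumentTermMatrix or TermDocumentMatrix object, not a plain matrix. Calling as.matrix() drops the DocumentTermMatrix class, so the inherits() check inside those functions fails.

Create the DocumentTermMatrix without converting it to a matrix and you won't get the error.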

dtm <- DocumentTermMatrix(corpus)
sparse <- removeSparseTerms(dtm, 0.80)
freq <- findFreqTerms(dtm, 2)
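
If you still need a plain matrix (for printing or further computation), one option is to keep the DocumentTermMatrix for the tm functions and convert only at the end. The sketch below is a minimal illustration, assuming the tm package is installed; it skips the tm_map cleaning steps from the question for brevity, and the variable names m and sparse_matrix are only illustrative.

```r
library(tm)

text <- c("Since I love to travel, this is what I rely on every time.",
          "I got this card for the no international transaction fee",
          "I got this card mainly for the flight perks",
          "Very good card, easy application process",
          "The customer service is outstanding!")

corpus <- Corpus(VectorSource(text))

# Keep the DocumentTermMatrix object itself; do not wrap it in as.matrix() yet
dtm <- DocumentTermMatrix(corpus)

# This is the same check that produced the error in the question
inherits(dtm, c("DocumentTermMatrix", "TermDocumentMatrix"))  # TRUE

# as.matrix() returns a base matrix, which no longer carries the tm class
m <- as.matrix(dtm)
inherits(m, c("DocumentTermMatrix", "TermDocumentMatrix"))    # FALSE

# Run the tm functions on the DocumentTermMatrix ...
sparse <- removeSparseTerms(dtm, 0.80)
freq   <- findFreqTerms(dtm, 2)

# ... and convert to a plain matrix only once the tm steps are done
sparse_matrix <- as.matrix(sparse)
```

Running class(dtm) before calling a tm function is a quick way to confirm which kind of object you are passing in.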

Upvotes: 5
