Reputation: 5165
Can I remove sparse terms while creating a tm::TermDocumentMatrix object?
I tried:
TermDocumentMatrix(file.corp, control = list(removeSparseTerms=0.998))
but it does not work.
Upvotes: 1
Views: 757
Reputation: 42293
No, you cannot remove sparse terms like that with the TermDocumentMatrix function. If you check the help for that function with ?TermDocumentMatrix, you'll see that the options for control are listed in the help for termFreq, and when you look at the help for that function with ?termFreq, you'll see that removeSparseTerms is not listed there. There is, however, a bounds option that can do a related job.
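As a rough sketch of what the bounds route could look like: in recent versions of tm, ?TermDocumentMatrix documents a bounds option with a global entry, which drops terms that appear in fewer documents than the lower bound or in more than the upper bound. The toy corpus toy.corp and the threshold of 2 documents below are made up for illustration; substitute your own corpus and cutoff, and check the help page for your installed version.

```r
library(tm)

# Hypothetical toy corpus standing in for file.corp
docs <- c("apple banana cherry", "apple banana", "apple durian", "apple")
toy.corp <- VCorpus(VectorSource(docs))

# Keep only terms that occur in at least 2 documents (no upper limit),
# filtering while the matrix is being built
tdm <- TermDocumentMatrix(
  toy.corp,
  control = list(bounds = list(global = c(2, Inf)))
)

# Terms that survived the document-frequency filter
Terms(tdm)
```

Note that bounds works with absolute document counts, whereas removeSparseTerms takes a sparsity proportion, so the two thresholds need converting if you want matching results.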
If you just want a one-liner that combines TermDocumentMatrix and removeSparseTerms, simply flip your line inside-out and it will work fine:
removeSparseTerms(TermDocumentMatrix(file.corp), 0.998)
I recommend you have a careful look at the documentation for the tm package; it's one of the better examples of a well-documented contributed package. It might save you time waiting for someone to answer your questions here!
Upvotes: 1