Shivani Rao

Reputation: 41

R text mining package updating the corpus by modifying or deleting existing documents

I would like to modify an existing document in a corpus by doing something simple like this:

myCorpus[[10]] = "hey I am the new content of this document"

Is this valid?

Upvotes: 1

Views: 1863

Answers (1)

agstudy

Reputation: 121568

It is not clear what you want to do with your corpus: append to it, or modify the 10th element?

Syntactically your code is valid, but semantically it is wrong.

Conceptually, a corpus is some metadata plus a list of TextDocument objects. So you can access this list like any R list, with '[[' or with '$'.

So if you do the following (it is better to use <- than =, even though they are equivalent here):

myCorpus[[10]] <- "hey I am the new content of this document" 

This will create or change the 10th element, but the new element has class character, not TextDocument. So you can't use the TextDocument methods on it.

So, to update the content of the 10th text document:

Content(myCorpus[[10]]) <- "hey I am the new content of this document" 
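To see why the two assignments differ, here is a minimal base R sketch. It does not use tm at all: `toy_doc`, `new_toy_doc` and `doc_content<-` are hypothetical names made up for illustration, standing in for a TextDocument and for the `Content<-` replacement function above.

```r
# Toy stand-in for a text document. This is NOT a tm class,
# only an S3 object holding some text.
new_toy_doc <- function(text) {
  structure(list(content = text), class = "toy_doc")
}

# Hypothetical replacement function, playing the role of Content<- above.
`doc_content<-` <- function(x, value) {
  x$content <- value
  x
}

docs <- list(new_toy_doc("first"), new_toy_doc("second"))

# Plain assignment swaps the whole element for a bare character vector.
overwritten <- docs
overwritten[[2]] <- "new text"
class(overwritten[[2]])   # "character"

# The replacement function changes only the text and keeps the class.
updated <- docs
doc_content(updated[[2]]) <- "new text"
class(updated[[2]])       # "toy_doc"
updated[[2]]$content      # "new text"
```

The same reasoning applies to the corpus: going through the replacement function leaves the document object in place and only swaps its text.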

To add new elements, use:

tmUpdate(ovid, DirSource(txt))

The source is checked for new files that do not already exist in the document collection; any new files are parsed and added to the existing document collection.

Upvotes: 3
