Reputation: 23
I would like to use Benoit's R package quanteda to analyze articles exported from LexisNexis. The export is in the standard HTML format. I use the tm package plus the tm.plugin.lexisnexis plugin to read the LexisNexis output. Unfortunately, an error occurs when converting the tm corpus to a quanteda corpus. Is that function broken, or am I doing something wrong beforehand?
library("tm")
library("tm.plugin.lexisnexis")
library("quanteda")
ln <- LexisNexisSource("lexisnexisOutput.html")
cr <- Corpus(ln)
crp <- corpus(cr)
Error in UseMethod("corpus") :
no applicable method for 'corpus' applied to an object of class "list"
In addition: Warning message:
In corpus(texts, docvars = metad, source = paste("Converted from tm VCorpus '", :
Arguments docvarssource not used.
Upvotes: 2
Views: 648
Reputation: 14902
This was a limitation of corpus.VCorpus() when a document's text was a character vector with several elements rather than a single character string. Fixed in quanteda 0.9.1-6. See Issue #80 on GitHub.
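If upgrading is not an option, one possible workaround is to collapse each document's text into a single string before handing it to quanteda. The following is only a minimal sketch in base R: `doc_contents` is a hypothetical stand-in for the per-document text you would extract from the tm corpus, and the paragraph separator is an arbitrary choice.

```r
# Hypothetical stand-in for the text of each document in a tm corpus.
# Some documents hold several character elements (e.g. one per paragraph),
# which is the case the old conversion could not handle.
doc_contents <- list(
  doc1 = c("First paragraph.", "Second paragraph."),
  doc2 = "A single string."
)

# Collapse every document to exactly one character string,
# keeping the document names.
texts <- vapply(doc_contents, paste, character(1), collapse = "\n\n")

# texts is now a named character vector with one element per document.
# Assumption: a vector like this can then be passed to quanteda's corpus().
print(names(texts))
print(length(texts))
```

Document-level metadata is not carried over by this sketch and would have to be attached separately.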
Upvotes: 1