I have made some headway in getting koRpus to analyze my data, but there are lingering problems.
The 'tokenize' command seems to work--kind of. I run the following line of code:
word <- tokenize("/Users/gdballingrud/Desktop/WPSCASES 1/", lang="en")
And it produces a 'Large kRp.text' object. However, the size of the object (5.6 MB) is far smaller than the size of the folder I reference in the code (260 MB). Further, when I use the 'readability' command to generate readability scores, like so:
all <- readability(word)
It returns a single set of readability scores for the whole kRp.text object (one score per readability measure), not one set per file.
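For reference, I think the object can be inspected like this to see how much text was actually read in (describe() and taggedText() are my guesses at the right accessor functions; I may be misreading their output):

describe(word)            # descriptive statistics stored in the tokenized object
nrow(taggedText(word))    # taggedText() should return the token data frame, so this is the total token count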
I need readability scores for each Word file in my folder, and I need to use koRpus (other packages like quanteda don't generate some of the readability measures I need, such as LIX and Kuntzsch's Text-Redundanz-Index).
Is anyone experienced enough with koRpus to point out what I have done wrong? The recurring problems are: 1) getting the tokenize command to treat each file in my folder as a separate text, and 2) getting readability scores for each file individually.
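For what it's worth, below is my best guess at a per-file loop, using list.files() and lapply() to tokenize and score each file separately. I am assuming the files are plain text, since as far as I know koRpus does not read .docx directly:

library(koRpus)
library(koRpus.lang.en)  # I believe English support lives in this separate package in newer koRpus versions

# My guess: score every file separately instead of pointing tokenize at the whole folder
files <- list.files("/Users/gdballingrud/Desktop/WPSCASES 1/", full.names = TRUE)
scores <- lapply(files, function(f) {
  tagged <- tokenize(f, lang = "en")  # one kRp.text object per file
  readability(tagged)                 # one set of readability measures per file
})
names(scores) <- basename(files)      # label each result by its file name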
Thanks, Gordon