I have a data frame containing 10,000 text observations, and I would like to apply a dictionary of values to it. The dictionary contains 10 different categories.
I have run the following code:
my_dict <- dictionary(list(
  category1 = Values1$Security,
  category2 = Values1$Conformity,
  category3 = Values1$Tradition,
  category4 = Values1$Benevolence,
  category5 = Values1$Universalism,
  category6 = Values1$`Self-Direction`,
  category7 = Values1$Stimulation,
  category8 = Values1$Hedonism,
  category9 = Values1$Achievement,
  category10 = Values1$Power
))
corp <- corpus(MessageDA1, text_field = 'Text')
toks <- quanteda::tokens(corp)
dfmt <- dfm(toks)
dfmt_dict <- dfm_lookup(dfmt, dictionary=my_dict)
And then I get the following error message:
Error in `set_dfm_featnames<-`(`*tmp*`, value = col_new) :
ncol(x) == length(value) is not TRUE
How do I fix this?
Here is the same approach applied to a much smaller sample. This version works for me, but the code above fails on the larger data frame I am actually using:
library(quanteda)
testtext <- c("This is sentence 1.", "This is sentence 2.", "This is sentence 3.")
testmy_tokens <- tokens(testtext)
testmy_dict <- dictionary(list(
  category1 = c("This", "sentence"),
  category2 = c("is", "sentence"),
  category3 = c("sentence", "1"),
  category4 = c("This", "sentence"),
  category5 = c("is", "sentence"),
  category6 = c("sentence", "2"),
  category7 = c("This", "sentence"),
  category8 = c("is", "sentence"),
  category9 = c("sentence", "3"),
  category10 = c("This", "sentence")
))
testmy_dfm <- dfm(testmy_tokens)
testmy_dfm <- dfm_lookup(testmy_dfm, dictionary = testmy_dict)
testmy_dfm
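The toy example builds its dictionary from clean character vectors, while the failing code builds it from columns of Values1. One difference that may be worth ruling out (an assumption, since the error message does not confirm it) is that those columns contain NA values, empty strings, stray whitespace, duplicates, or factor levels, for example because shorter word lists were padded to the same length on import. The sketch below uses only base R. Values1_demo is a made-up stand-in for Values1, and clean_entries() is a hypothetical helper, not a quanteda function.

```r
# Hypothetical stand-in for Values1: two word lists of different real
# length, padded with "" / NA / " " so the columns line up.
Values1_demo <- data.frame(
  Security   = c("safe", "secure", "", NA),
  Conformity = c("obey", "rule*", "norm", " "),
  stringsAsFactors = FALSE
)

# Hypothetical helper: turn one column into a clean character vector.
clean_entries <- function(x) {
  x <- trimws(as.character(x))      # coerce factors, strip stray spaces
  x <- x[!is.na(x) & nzchar(x)]     # drop NA and empty entries
  unique(x)                         # drop duplicated patterns
}

# One cleaned character vector per category, keeping the column names.
dict_list <- lapply(Values1_demo, clean_entries)
print(dict_list)

# With quanteda loaded, the cleaned list could then be passed on, e.g.:
# my_dict <- quanteda::dictionary(dict_list)
```

If the cleaned list differs from the raw columns, rebuilding my_dict from it and rerunning dfm_lookup() would show whether the padding was the cause. If nothing changes, the problem probably lies elsewhere, for instance in the installed quanteda version.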
Upvotes: 0
Views: 76