Anne Sofie Nielsen

Reputation: 11

Quanteda dfm_lookup not working due to column issues

I have a data frame containing 10,000 text observations, and I would like to apply a dictionary of values to it. The dictionary contains 10 categories.

I have run the following code:

my_dict <- dictionary(list(
  category1 = Values1$Security,
  category2 = Values1$Conformity,
  category3 = Values1$Tradition,
  category4 = Values1$Benevolence,
  category5 = Values1$Universalism,
  category6 = Values1$`Self-Direction`,
  category7 = Values1$Stimulation,
  category8 = Values1$Hedonism,
  category9 = Values1$Achievement,
  category10 = Values1$Power
))


corp <- corpus(MessageDA1, text_field = 'Text')

toks <- quanteda::tokens(corp)

dfmt <- dfm(toks)

dfmt_dict <- dfm_lookup(dfmt, dictionary=my_dict)

And then I get the following error message:

Error in `set_dfm_featnames<-`(`*tmp*`, value = col_new) : 
ncol(x) == length(value) is not TRUE

How do I fix this?

Here is the same code applied to a much smaller sample. This works for me, but it fails on the larger data frame:

library(quanteda)

testtext <- c("This is sentence 1.", "This is sentence 2.",
              "This is sentence 3.")

testmy_tokens <- tokens(testtext)

testmy_dict <- dictionary(list(category1 = c("This", "sentence"),
                           category2 = c("is", "sentence"),
                           category3 = c("sentence", "1"),
                           category4 = c("This", "sentence"),
                           category5 = c("is", "sentence"),
                           category6 = c("sentence", "2"),
                           category7 = c("This", "sentence"),
                           category8 = c("is", "sentence"),
                           category9 = c("sentence", "3"),
                           category10 = c("This", "sentence")))

testmy_dfm <- dfm(testmy_tokens)

testmy_dfm <- dfm_lookup(testmy_dfm, dictionary = testmy_dict)

testmy_dfm
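
One difference between the two cases that may be worth ruling out (an assumption, not a confirmed cause of the error): in the small example every category is a clean character vector, whereas columns taken from a data frame such as Values1 can contain NA or empty strings when the categories have different numbers of words. Below is a minimal base-R sketch for checking and stripping such entries before building the dictionary. The data frame `values_demo` and the helper `clean_terms` are hypothetical names used only for illustration.

```r
# Hypothetical stand-in for Values1: columns of unequal length,
# padded with NA or "" as often happens after importing a spreadsheet.
values_demo <- data.frame(
  Security   = c("safe", "secure", NA),
  Conformity = c("obey", "", "comply"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: trim whitespace, then keep only the unique,
# non-missing, non-empty entries of one column.
clean_terms <- function(x) {
  x <- trimws(as.character(x))
  unique(x[!is.na(x) & nzchar(x)])
}

# Count problematic entries per column before cleaning.
n_bad <- sapply(values_demo, function(x) sum(is.na(x) | !nzchar(trimws(x))))
print(n_bad)

# Apply the helper to every column; the result is a named list of
# character vectors.
cleaned <- lapply(values_demo, clean_terms)
str(cleaned)

# The cleaned list could then be passed on, for example:
# my_dict <- quanteda::dictionary(cleaned)
```

If `n_bad` is zero for every column of the real data, this is not the source of the problem and the cause lies elsewhere.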

Upvotes: 0

Views: 76

Answers (0)
