How to add a character vector as metadata/docvars to a dfm for stm prevalence

Question

I want to add the character vector EU_CFSP_INT_all <- c(...) as metadata to my dfm so that, when I later fit an stm, I can set the prevalence to EU_CFSP_INT_all. The character vector contains 62 expressions, and my corpus/dfm consists of 201 documents. It might sound trivial, but how do I include EU_CFSP_INT_all as a column in the dfm such that all 62 expressions appear in every one of the 201 rows?

The closest I have gotten was by using the following code:

EU_CFSP_INT_all_EV <- rep_len(EU_CFSP_INT_all, length.out = 201)

dfmat_PRs_trim_c$EUint <- EU_CFSP_INT_all_EV

However, this simply recycled the 62 expressions one at a time until 201 values were reached. As a result, each document in the dfm was matched with only one expression instead of all 62.
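To make the recycling behaviour concrete, here is a minimal base R sketch. The vector `expressions` and the count `n_docs` are hypothetical stand-ins for EU_CFSP_INT_all (62 expressions) and the 201 documents. The last two lines show one possible way to get every expression into each row, assuming a single delimited string per document is acceptable; note that the resulting value is identical in every row.

```r
# Hypothetical stand-ins: 3 expressions instead of 62, 7 documents instead of 201
expressions <- c("alpha", "beta", "gamma")
n_docs <- 7

# rep_len() recycles the elements one by one until n_docs values exist,
# so each document ends up with exactly one expression
recycled <- rep_len(expressions, length.out = n_docs)
print(recycled)
# [1] "alpha" "beta"  "gamma" "alpha" "beta"  "gamma" "alpha"

# Assumption: one string holding all expressions is acceptable as the value
collapsed <- paste(expressions, collapse = "; ")
per_doc <- rep(collapsed, n_docs)  # one identical value per document
print(per_doc[1])
# [1] "alpha; beta; gamma"
```

A vector such as `per_doc` has one element per document, which is the shape a docvar column needs.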

Converting the vector to a tokens object also got me closer to the goal: the resulting tokens object consists of 201 documents, each of length 62:

EU_CFSP_INT_all_vector <- rep(list(EU_CFSP_INT_all), 201)

EU_CFSP_vector_toks <- tokens(EU_CFSP_INT_all_vector)

summary(EU_CFSP_vector_toks)
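The shape of the list passed to tokens() above can be checked in base R alone. This is a small sketch with hypothetical stand-ins (`expressions` for EU_CFSP_INT_all, `n_docs` for the 201 documents), not the original data:

```r
# Hypothetical stand-ins: 3 expressions instead of 62, 4 documents instead of 201
expressions <- c("alpha", "beta", "gamma")
n_docs <- 4

# rep(list(x), n) repeats the whole vector as one list element per document
per_doc_list <- rep(list(expressions), n_docs)

length(per_doc_list)   # number of list elements, one per document
lengths(per_doc_list)  # number of expressions held by each element
```

Unlike rep_len() on the bare vector, every list element here carries the full set of expressions.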

But when I then created another dfm from it to merge with the original, the values got scrambled. I feel like there must be a very easy way to do this that I am unaware of. Thanks a lot if anyone can help me out!
