user21282980

Reputation: 11

Topic Modeling: How to use the stm "findThoughts" function when the topic model was built from a quanteda dfm object?

I am new to topic modeling, so I will do my best to describe my question. Basically, I want to visually inspect the documents that make up the topics in my topic model. I used the following code from the stm package to build the model:

stm.35 <- stm(
  dfm, 
  K = 35,
  prevalence = NULL,
  content = NULL,
  data = NULL,
  init.type = c("LDA"),
  seed = 123, 
  emtol = 1e-05,
  verbose = TRUE,
  reportevery = 5,
  LDAbeta = TRUE,
  interactions = TRUE,
  ngroups = 1,
  model = NULL,
  gamma.prior = c("Pooled"),
  sigma.prior = 0,
  kappa.prior = c("L1"),
  control = list(alpha = 0.01)
)

The documents are social media comments on Reddit. When I try to use the "findThoughts" function, I get this:

 Topic 1:

I have read that the reason I can't view the documents for each topic is that I used a dfm from quanteda to build the stm model. That being said, is there a workaround for this? Or should I try to build my model with an stm object instead of a quanteda dfm? I would rather not do this, since I am having greater success building the dfm with quanteda.

Thank you

Upvotes: 1

Views: 394

Answers (1)

andupaz

Reputation: 11

I'm encountering the same issue. I'm using

thoughts1 <- findThoughts(stmM_15_k42,texts = corp_media, n = 2, topics = 6)

But I get the error message:

Error in findThoughts(stmM_15_k42, texts = corp_media, n = 2, topics = 6) :
  Number of provided texts and number of documents modeled do not match

I imagine the reason for this is that I trimmed and subset the quanteda corpus, and even removed duplicates, so the dfm used to fit the model no longer contains the same documents as the corpus.
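If that is the cause, one possible workaround is to select the original texts by document name, so that they line up one-to-one with the rows of the dfm that was actually modeled. Below is a minimal sketch of that idea, not a tested fix for either model in this thread. The toy corpus and the object names (corp, dfm_all, dfm_trimmed, dfm_used, texts_used) are hypothetical. It also assumes the empty documents are dropped explicitly before fitting, so that the modeled documents and the dfm rows are the same set.

```r
library(quanteda)

# Toy corpus standing in for the real one (hypothetical data).
corp <- corpus(c(
  d1 = "cats chase mice",
  d2 = "dogs chase cats",
  d3 = "zebra",
  d4 = "mice fear cats and dogs"
))

# Trimming rare terms can leave some documents with no features at all.
dfm_all <- dfm(tokens(corp))
dfm_trimmed <- dfm_trim(dfm_all, min_termfreq = 2)

# Drop the now-empty documents explicitly, before fitting the model.
dfm_used <- dfm_subset(dfm_trimmed, ntoken(dfm_trimmed) > 0)

# Select the original texts by document name, in the dfm's row order.
texts_used <- as.character(corp)[docnames(dfm_used)]

# The two counts must agree before calling findThoughts().
stopifnot(length(texts_used) == ndoc(dfm_used))
```

With a model fitted on dfm_used, the aligned vector would then be passed as the texts argument, for example findThoughts(model, texts = texts_used, n = 2, topics = 6), where model is whatever stm() returned for that dfm. Note that texts has to be a character vector of the raw documents, not the dfm itself.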

Upvotes: 1
