Reputation: 11
I have been trying different ways of determining the number of topics for LDA in R. I am using the ldatuning package
with Gibbs sampling, but I don't understand what the different metrics mean, for example:
metrics = c("Griffiths2004", "CaoJuan2009", "Arun2010", "Deveaud2014"),
method = "Gibbs",
Can anyone help me understand what these metrics measure and how to read them? Any guidance would be great. Thanks in advance.
Upvotes: 1
Views: 1895
Reputation: 452
Have you taken a look at FindTopicsNumber_plot()? Try:
FindTopicsNumber_plot(your_result_from_tuning)
Each metric has its own pros and cons and may be more or less applicable depending on the specifics of your data. The basic overview, though, is that you want the number of topics that minimizes Arun2010 and CaoJuan2009 and maximizes Griffiths2004 and Deveaud2014.
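If you want the same rule as numbers rather than reading it off the plot, here is a minimal sketch. It assumes your tuning result is a data frame with a `topics` column and one numeric column per metric, which is how I understand the output of FindTopicsNumber(). The helper name `best_topics_per_metric` is my own and is not part of ldatuning.

```r
# Hypothetical helper (not part of ldatuning): for each metric, return the
# number of topics at which that metric is best.
# Assumption: `result` has a `topics` column plus one numeric column per metric.
best_topics_per_metric <- function(result,
                                   minimize = c("CaoJuan2009", "Arun2010"),
                                   maximize = c("Griffiths2004", "Deveaud2014")) {
  # Only look at metrics that were actually computed.
  minimize <- intersect(minimize, names(result))
  maximize <- intersect(maximize, names(result))

  # Lower is better for these metrics.
  picks_min <- vapply(
    minimize,
    function(m) as.numeric(result$topics[which.min(result[[m]])]),
    numeric(1)
  )
  # Higher is better for these metrics.
  picks_max <- vapply(
    maximize,
    function(m) as.numeric(result$topics[which.max(result[[m]])]),
    numeric(1)
  )

  # Named numeric vector: metric name -> candidate number of topics.
  c(picks_min, picks_max)
}
```

You would call it as best_topics_per_metric(your_result_from_tuning). The metrics will often disagree, so treat the output as a range of candidates to inspect, not as a single answer.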
Take a look at this: http://rpubs.com/nikita-moor/107657
Upvotes: 0