Sylababa

Reputation: 65

LDAtuning Package

I am trying to find the optimal number of topics for an LDA model of my data. For this purpose I want to use the package "ldatuning". After fitting the LDA model with the "Gibbs" method, I tried to use the function:

Griffiths2004(models, control)

The arguments should be:

models: an object of class "LDA"
control: a named list of the control parameters for estimation, or an object of class "LDAcontrol"

I called it like this:

Griffiths2004(lda_5, lda_5@control)

R then just reports: Error in as.list.default(X) : no method for coercing this S4 class to a vector

Can someone explain to me what I need to change so that I can use this function? I don't get it.

Kind regards, Tom

Upvotes: 0

Views: 573

Answers (1)

yuki

Reputation: 775

The problem probably lies in how you pass the control parameter list to the Griffiths2004 function.

Inside the Griffiths2004 function, the control parameters are accessed like list elements, using control$param. However, lda_5@control returns an S4 object, whose parameters are accessed with control@param. (S4 is one of R's formal class systems; the only difference that matters here is that the contents of an S4 object are accessed with @ instead of $.)

You can see that lda@control is an S4 object by printing it:

> lda@control

An object of class "LDA_Gibbscontrol"
Slot "delta":
[1] 0.1

Slot "iter":
[1] 2100

Slot "thin":
[1] 2000

Slot "burnin":
[1] 100
...

It possesses so-called slots instead of simple list names.
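
To make the difference between slots and list names concrete, here is a minimal sketch in base R. The class DemoControl is a hypothetical stand-in invented for illustration; it is not the real control class from topicmodels.

```r
library(methods)

# Hypothetical S4 class, used only to illustrate slot access
setClass("DemoControl", representation(burnin = "numeric", iter = "numeric"))

s4_ctrl   <- new("DemoControl", burnin = 100, iter = 2000)
list_ctrl <- list(burnin = 100, iter = 2000)

s4_ctrl@burnin      # slots are read with @
list_ctrl$burnin    # list elements are read with $

isS4(s4_ctrl)       # TRUE
isS4(list_ctrl)     # FALSE

slotNames(s4_ctrl)  # the slot names of the S4 object
names(list_ctrl)    # the element names of the list

# Coercing the S4 object with as.list() is expected to fail;
# tryCatch() captures the message instead of stopping the script
res <- tryCatch(as.list(s4_ctrl), error = function(e) conditionMessage(e))
print(res)
```

Code that expects a plain list, for example anything that uses $ or as.list() on its argument, will therefore tend to break when it is handed an S4 object instead.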

You can avoid the issue you encountered by passing the control parameters to your LDA model as a list.

Here is an example:


library("Rmpfr")
library("topicmodels")
library("ldatuning")

# load the example corpus and take a small subset of documents
data("AssociatedPress", package = "topicmodels")
dtm <- AssociatedPress[1:20, ]

# pass controls as a plain named list (otherwise it does not work!)
controls <- list(burnin = 100, iter = 2000, keep = 500, alpha = 0.1)

# fit the model with Gibbs sampling and pass the controls list
lda <- LDA(dtm, k = 2, method = "Gibbs",
           control = controls)

# Griffiths2004 score (the model must be wrapped in a list):
Griffiths2004(list(lda), controls)
# Output:
# [1] -30355.28

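It is also important that you pass the LDA model inside a list, since the Griffiths2004 function iterates over a list of models. The as.list.default(X) in your error message suggests that this may in fact be the step that failed: iterating over a bare S4 model object requires coercing it to a list first, which R cannot do.

Also make sure you have installed all additional dependencies for the Rmpfr package, which the ldatuning package relies on.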
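
If the model has already been fitted and only the S4 control object is at hand, one possible workaround is to copy its slots into a named list. The sketch below uses a hypothetical helper, s4_to_list, and a hypothetical demo class; neither is part of ldatuning or topicmodels, and I have not verified that the resulting list is sufficient for Griffiths2004.

```r
library(methods)

# Hypothetical helper: copy every slot of an S4 object into a named list.
# Slots that are themselves S4 objects are copied unchanged.
s4_to_list <- function(obj) {
  slots <- slotNames(obj)
  setNames(lapply(slots, function(s) slot(obj, s)), slots)
}

# Hypothetical demo class standing in for a fitted model's control object
setClass("DemoGibbsControl", representation(burnin = "numeric", keep = "numeric"))
ctrl <- new("DemoGibbsControl", burnin = 100, keep = 500)

ctrl_list <- s4_to_list(ctrl)
ctrl_list$burnin   # now accessible with $
ctrl_list$keep
```

With a real model this would presumably be called as s4_to_list(lda_5@control), but building the list up front, as in the example above, is the simpler route.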

Upvotes: 1
