Reputation: 75
I am working on a project that requires applying the LDA topic model. Because each document in my case is short, I have to use Labelled LDA. I do not have much background in this area; all I need is to apply LLDA to my data.
After searching the web, I found an LLDA implementation in the Stanford Topic Modeling Toolbox (TMT). What I understand from the section "Training a Labeled LDA model" is that I should label each input document before training. Am I misunderstanding something?
If my understanding is correct, this will involve too much work labeling documents. Instead, can I provide a separate list of topics and train on the documents without labels?
Upvotes: 4
Views: 3888
Reputation: 11941
Your understanding is correct: you need to label each input document before training.
Labelled LDA is a supervised method, meaning that you need a labelled dataset.
If you "have to use Labelled LDA", you cannot get away from the need to obtain a labelled dataset.
The LabeledLDA model in TMT needs a LabeledLDADocumentParams object, and to create it you need an array of labels. So no, it is not possible to train a Labeled LDA model without labels.
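To make the requirement concrete, here is a minimal sketch of the input shape a Labeled LDA trainer expects: every document paired with at least one label, from which a binary document-by-label matrix can be built. It is plain Python for illustration only, not the TMT API (TMT scripts are written in Scala); the toy corpus and the name build_label_matrix are made up for this example.

```python
# Hypothetical toy corpus: each document is paired with its own labels.
labeled_docs = [
    ("cheap flights to paris", ["travel"]),
    ("python list comprehension error", ["programming"]),
    ("booking a hotel with a python script", ["travel", "programming"]),
]


def build_label_matrix(docs):
    """Return (sorted label vocabulary, binary document-by-label matrix).

    Raises ValueError if any document has no labels, since a supervised
    method such as Labeled LDA cannot use an unlabelled document.
    """
    for i, (_text, labels) in enumerate(docs):
        if not labels:
            raise ValueError(f"document {i} has no labels")

    # Collect every distinct label and give it a fixed column index.
    vocab = sorted({label for _text, labels in docs for label in labels})
    index = {label: j for j, label in enumerate(vocab)}

    # matrix[i][j] == 1 means document i carries label vocab[j].
    matrix = [[0] * len(vocab) for _ in docs]
    for i, (_text, labels) in enumerate(docs):
        for label in labels:
            matrix[i][index[label]] = 1
    return vocab, matrix


label_vocab, label_matrix = build_label_matrix(labeled_docs)
```

A separate list of topics on its own only supplies the columns of this matrix; the rows still need to say which labels apply to which document, and that is the labelling work that cannot be skipped.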
Upvotes: 5