deadpixels

Reputation: 809

Mallet topic modelling, labelling topics

I have a corpus of articles in a single document, and I am applying MALLET's topic modelling algorithm so that I can later build a search function that lets users find articles relevant to their input. The code I'm using comes from the topic modelling section of the MALLET API developer's guide.

I am new to topic modelling, but as far as I understand, it generates a user-specified number of topics, each holding the words relevant to that topic, and the program does not know what each topic is actually about. The name has to be assigned manually by the user, am I right?

My question is: how do I manually set these topic names so I can use them later? For example, a topic output from the algorithm looks like this:

0 bush republican usa immigration mexico control conservatives

where 0 is the name of the topic. What I want is to manually change the name to something like:

Immigration Policy: bush republican usa immigration mexico control conservatives

Any help please?

Upvotes: 1

Views: 581

Answers (1)

Sir Cornflakes

Reputation: 665

I suggest that you keep a separate file that maps each topic number to a manually assigned label, e.g., in the format

0 Immigration_Policy

Then you can relate the topic numbers in all output files from Mallet to the labels.
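A minimal sketch of that lookup, written in Python as a post-processing step rather than with the MALLET API itself. It assumes the label file uses the `number label` format above and that each topic line looks like the one in the question (topic number followed by the words); Mallet's own output files may contain additional columns, so the parsing would need adjusting for them. The function names `parse_labels` and `relabel` are made up for this illustration.

```python
def parse_labels(lines):
    """Parse 'topic_number label' lines, e.g. '0 Immigration_Policy', into a dict."""
    labels = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        topic_id, label = line.split(maxsplit=1)
        # Underscores stand in for spaces in the label file.
        labels[int(topic_id)] = label.replace("_", " ")
    return labels


def relabel(topic_line, labels):
    """Replace the leading topic number of a topic line with its label.

    Assumes the line format shown in the question: '<number> <word> <word> ...'.
    Falls back to the number itself when no label has been assigned.
    """
    topic_id, words = topic_line.strip().split(maxsplit=1)
    name = labels.get(int(topic_id), topic_id)
    return f"{name}: {words}"


# In practice the lines would be read from the label file, e.g. open("labels.txt").
labels = parse_labels(["0 Immigration_Policy"])
print(relabel("0 bush republican usa immigration mexico control conservatives", labels))
```

Keeping the labels in their own file means the model can be retrained or its output regenerated without losing the names, as long as the topic numbers still refer to the same topics.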

Upvotes: 3
