Chedi Bechikh

Reputation: 173

Supervised keyphrase extraction with WEKA or another tool

How can I use WEKA to find keyphrases with a supervised method?

I have to train a model for keyphrase extraction, so I have a training corpus (for every document there is a corresponding file that contains its keyphrases or keywords).

I also have a test corpus for the supervised model (documents without keyphrase files), so the model should output a list of keyphrases for every document.

My question is how to input the documents into WEKA. Should I add an entry for every document, like this:

@attribute doc string

@data
"Docu1............"
"Docu2............"
...
"DocuN............"

Now, how do I input the files that contain the keyphrases for every document, so that the model can learn from them?

Upvotes: 0

Views: 132

Answers (1)

Istvan Nagy

Reputation: 310

First you need to choose which features you want to use. The most basic algorithm is based only on tf-idf values: https://code.google.com/p/kea-algorithm/ You can also extend this feature set with your own task-specific features, for example the position of the first occurrence of the phrase. You can find some possible features in this article: http://www.aclweb.org/anthology/S/S10/S10-1040.pdf Then you have to choose a machine learning algorithm, train it on your training set, and evaluate it on your test set.
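One way to connect the keyphrase files to the documents is to stop treating each document as one instance and instead make every candidate phrase an instance, labeled by whether it appears in that document's keyphrase file. The sketch below is only an illustration of that idea under several assumptions: candidates are plain word n-grams of up to three words, the two features are tf-idf and relative first occurrence, gold keyphrases are lowercase and matched exactly (real systems usually also stem or normalize), and all function names and the relation name are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def candidates(tokens, max_len=3):
    """Map each 1..max_len word n-gram to (count, index of first occurrence)."""
    stats = {}
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            phrase = " ".join(tokens[i:i + n])
            count, first = stats.get(phrase, (0, i))
            stats[phrase] = (count + 1, first)
    return stats


def build_instances(docs, gold):
    """Return one row per candidate: (doc index, phrase, tfidf, first_occ, label).

    docs is a list of document texts; gold is a list of sets holding the
    lowercase keyphrases of each document (assumed exact-match labeling).
    """
    tokenized = [tokenize(d) for d in docs]
    per_doc = [candidates(t) for t in tokenized]
    # Document frequency: in how many documents each candidate occurs.
    df = Counter(p for stats in per_doc for p in stats)
    rows = []
    for idx, stats in enumerate(per_doc):
        length = max(1, len(tokenized[idx]))
        for phrase, (count, first) in stats.items():
            tfidf = (count / length) * math.log(len(docs) / df[phrase])
            label = "yes" if phrase in gold[idx] else "no"
            rows.append((idx, phrase, tfidf, first / length, label))
    return rows


def to_arff(rows):
    """Serialize the numeric features and the class label as ARFF text."""
    lines = [
        "@relation keyphrase_candidates",
        "",
        "@attribute tfidf numeric",
        "@attribute first_occurrence numeric",
        "@attribute class {yes,no}",
        "",
        "@data",
    ]
    for _, _, tfidf, first, label in rows:
        lines.append(f"{tfidf:.6f},{first:.6f},{label}")
    return "\n".join(lines) + "\n"


# Toy corpus standing in for the real training documents and keyphrase files.
docs = [
    "Neural networks learn features. Neural networks need data.",
    "Decision trees split data on features.",
]
gold = [{"neural networks"}, {"decision trees"}]

rows = build_instances(docs, gold)
print(to_arff(rows))
```

The resulting ARFF text can be saved to a file and opened in WEKA, where any classifier that handles numeric attributes and a nominal class could be trained on it. For the test corpus, the same candidate generation would apply, with the class value written as `?` (ARFF's missing-value marker), and the candidates of each document ranked by the predicted probability of `yes`.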

Upvotes: 1
