Weka StringToWordVector Filter - Implementation in Java

Question

I started with the Weka GUI application to work out how to build my text classifier, and I successfully built and saved a model there.

Now, I want to implement the classifier in Java code. But I can't seem to set the stopwords and tokenizer settings of the StringToWordVector filter in code like I did in the GUI. (See the screenshot.)


(Of course, without the stopwords handler set to NULL.)

I am aware that I can load the model I built and saved in the GUI into my code, but I need to implement the filter itself in Java.

I tried the code from the question "Different results in Weka GUI and Weka via Java code", mainly this part (after changing the path, of course):

 String opt = "-W -P 0 -M 5.0 -norm 1.0 -lnorm 2.0 -lowercase -stoplist -stopwords C:\\Users\\Fernando\\workspace\\GPCommentsAnalyzer\\pt-br_stopwords.dat -tokenizer \"weka.core.tokenizers.NGramTokenizer -delimiters ' \r\n\t.,;:\\'\"()?!\' -max 2 -min 1\" -stemmer weka.core.stemmers.NullStemmer";

But it still doesn't work.

I can't find any documentation about this topic anywhere. Any help would be much appreciated!

(I am using Weka version 3.7.12)

Atilla Ozgur · Accepted Answer

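Set up your configuration in the GUI, then use the "Copy configuration to clipboard" option in the context menu. The copied text is the filter's class name followed by its full option string, which you can paste into your Java code.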

[Screenshot: Copy config to clipboard]
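One practical wrinkle when pasting the copied configuration into Java source: the text usually contains double quotes (around the tokenizer specification) and backslashes (in the delimiters and in Windows paths), and both must be escaped inside a string literal. Below is a minimal sketch of that escaping using only the standard library. The class name `ClipboardConfig`, the helper `toJavaLiteral`, and the shortened example string are hypothetical and only for illustration; they are not part of Weka.

```java
// Hypothetical helper, not part of Weka: turns raw clipboard text into
// a Java string literal by escaping backslashes and double quotes.
class ClipboardConfig {

    /**
     * Wraps the raw text in double quotes and prefixes every backslash
     * and double quote with a backslash. Assumes the text is a single
     * line; control characters such as newlines are not handled.
     */
    static String toJavaLiteral(String raw) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : raw.toCharArray()) {
            if (c == '\\' || c == '"') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        // Shortened, made-up example of what the clipboard might contain.
        String raw = "-tokenizer \"weka.core.tokenizers.NGramTokenizer -max 2 -min 1\"";
        // Prints the text as it would have to appear in Java source.
        System.out.println(toJavaLiteral(raw));
    }
}
```

Most IDEs do this escaping automatically when you paste between the quotes of a string literal. Assuming Weka is on the classpath, the resulting string can then be split with `weka.core.Utils.splitOptions` and passed to the filter's `setOptions` method.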
