Jr.

Reputation: 49

Using WEKA Filters in Java - Oversampling and Undersampling

I'm having trouble figuring out how to use WEKA filters in Java code. The help I've found seems a little dated, as I'm using WEKA 3.8.5. I'm running three tests. Test 1: no filter; Test 2: weka.filters.supervised.instance.SpreadSubsample -M 1.0; Test 3: weka.filters.supervised.instance.Resample -B 1.0 -Z 130.3.

If my research is correct, I should import the filters as shown below. What I can't work out is how to specify "-M 1.0" for SpreadSubsample (my undersampling test) and "-B 1.0 -Z 130.3" for Resample (my oversampling test).

import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.filters.Filter;
import weka.filters.supervised.instance.Resample; 
import weka.filters.supervised.instance.SpreadSubsample;

Test 1 (my no-filter test) is coded below:

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.Random;
import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;
import weka.core.Instances;

public class Fraud {
    public static void main(String[] args) {
        try {
            // Create the J48 decision tree classifier
            J48 j48Classifier = new J48();

            // Path to the dataset
            String fraudDataset = "C:\\Users\\Owner\\Desktop\\CreditCard\\CreditCard.arff";
            BufferedReader bufferedReader = new BufferedReader(new FileReader(fraudDataset));

            // Load the dataset instances, then release the file handle
            Instances datasetInstances = new Instances(bufferedReader);
            bufferedReader.close();

            // The class attribute is the last attribute
            datasetInstances.setClassIndex(datasetInstances.numAttributes() - 1);

            Evaluation evaluation = new Evaluation(datasetInstances);

            // Cross-validate the model with 10 folds
            evaluation.crossValidateModel(j48Classifier, datasetInstances, 10, new Random(1));
            System.out.println(evaluation.toSummaryString("\nResults", false));
        } catch (Exception e) {
            // Report any exception raised while loading or evaluating
            System.out.println("Error Occurred!!!! \n" + e.getMessage());
        }

        System.out.print("DT Successfully executed.");
    }
}

The output of my code is:
Results
Correctly Classified Instances      284649               99.9445 %
Incorrectly Classified Instances       158                0.0555 %
Kappa statistic                          0.8257
Mean absolute error                      0.0008
Root mean squared error                  0.0232
Relative absolute error                 24.2995 %
Root relative squared error             55.9107 %
Total Number of Instances           284807     

DT Successfully executed.

Does anyone have an idea how I can add the filters, with the settings I want, to the code for Tests 2 and 3? Any help will be appreciated. I will run the three tests multiple times and compare the results to see which of the three works best.

Upvotes: 0

Views: 297

Answers (1)

fracpete

Reputation: 2608

-M 1.0 and -B 1.0 -Z 130.3 are the options that you supply to the filters from the command-line. These filters implement the weka.core.OptionHandler interface, which offers the setOptions and getOptions methods.
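
To make those values concrete, here is a rough plain-Java sketch of the arithmetic as I understand it from the filters' option descriptions: -M caps every class at the smallest class count times the spread (0 meaning no cap), -B 1.0 pulls the output toward a uniform class distribution, and -Z sets the output size as a percentage of the input. The class name, the method names, and the 1000/50 class counts are hypothetical, and WEKA's exact rounding may differ from this approximation.

```java
import java.util.Arrays;

// Hypothetical helper, not part of WEKA: approximates the class counts
// produced by the two filters for a given set of input class counts.
class SamplingArithmetic {

    /** Approximates SpreadSubsample -M: cap each class at (smallest non-empty class * spread). */
    static int[] spreadSubsampleCounts(int[] classCounts, double spread) {
        int[] result = classCounts.clone();
        int min = Arrays.stream(classCounts).filter(c -> c > 0).min().orElse(0);
        if (spread <= 0 || min == 0) {
            return result; // a spread of 0 means "no maximum spread"
        }
        for (int i = 0; i < result.length; i++) {
            result[i] = (int) Math.min(classCounts[i], min * spread);
        }
        return result;
    }

    /** Approximates Resample -B/-Z: blend each class size with a uniform share, then scale. */
    static int[] resampleCounts(int[] classCounts, double bias, double sizePercent) {
        int total = Arrays.stream(classCounts).sum();
        long nonEmpty = Arrays.stream(classCounts).filter(c -> c > 0).count();
        int[] result = new int[classCounts.length];
        for (int i = 0; i < classCounts.length; i++) {
            if (classCounts[i] == 0) {
                continue; // empty classes stay empty
            }
            double uniformShare = (double) total / nonEmpty;
            double blended = (1 - bias) * classCounts[i] + bias * uniformShare;
            result[i] = (int) (sizePercent / 100.0 * blended);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] counts = {1000, 50}; // hypothetical majority/minority class counts
        System.out.println("-M 1.0:          " + Arrays.toString(spreadSubsampleCounts(counts, 1.0)));
        System.out.println("-B 1.0 -Z 130.3: " + Arrays.toString(resampleCounts(counts, 1.0, 130.3)));
    }
}
```

In other words, -M 1.0 shrinks the majority class down to the minority's size, while -B 1.0 -Z 130.3 grows the dataset to roughly 130.3% of its original size with the classes balanced.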

For example, SpreadSubsample can be instantiated like this:

import weka.filters.supervised.instance.SpreadSubsample;
import weka.core.Utils;
...
SpreadSubsample spread = new SpreadSubsample();
// Utils.splitOptions generates an array from an option string
spread.setOptions(Utils.splitOptions("-M 1.0"));
// alternatively:
// spread.setOptions(new String[]{"-M", "1.0"});
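
The same pattern should carry over to Resample for Test 3. Below is a minimal sketch, assuming the supervised Resample filter's bean setters setBiasToUniformClass and setSampleSizePercent correspond to -B and -Z respectively; the class and method names (ResampleOptionsSketch, fromOptionString, fromSetters) are made up for illustration, and the WEKA jar needs to be on the classpath.

```java
import weka.core.Utils;
import weka.filters.supervised.instance.Resample;

// Hypothetical wrapper class showing two ways to configure the filter.
class ResampleOptionsSketch {

    // Configure the filter from the command-line style option string.
    static Resample fromOptionString() throws Exception {
        Resample resample = new Resample();
        resample.setOptions(Utils.splitOptions("-B 1.0 -Z 130.3"));
        return resample;
    }

    // Configure the filter through its setters instead
    // (assumed to be equivalent to -B and -Z).
    static Resample fromSetters() {
        Resample resample = new Resample();
        resample.setBiasToUniformClass(1.0);
        resample.setSampleSizePercent(130.3);
        return resample;
    }

    public static void main(String[] args) throws Exception {
        // Print the resulting option strings so the two variants can be compared.
        System.out.println(Utils.joinOptions(fromOptionString().getOptions()));
        System.out.println(Utils.joinOptions(fromSetters().getOptions()));
    }
}
```

Printing getOptions() is a quick way to check that the filter picked up the values you intended before running a long evaluation.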

In order to apply the filters, you should use the FilteredClassifier approach. E.g., for SpreadSubsample you would do something like this:

import weka.classifiers.meta.FilteredClassifier;
import weka.classifiers.trees.J48;
import weka.filters.supervised.instance.SpreadSubsample;
import weka.core.Utils;
...
// base classifier
J48 j48 = new J48();
// filter
SpreadSubsample spread = new SpreadSubsample();
spread.setOptions(Utils.splitOptions("-M 1.0"));
// meta-classifier
FilteredClassifier fc = new FilteredClassifier();
fc.setFilter(spread);
fc.setClassifier(j48);

Then evaluate the fc classifier object on your dataset in place of the plain J48 classifier, e.g., by passing fc to crossValidateModel.
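
Putting the pieces together, Tests 2 and 3 could look roughly like the sketch below. It reuses the 10-fold cross-validation from the question; the class name FilteredEvaluationSketch, the helper crossValidate, and reading the ARFF path from args[0] are my own assumptions, so adapt them to your setup.

```java
import java.util.Random;
import weka.classifiers.Evaluation;
import weka.classifiers.meta.FilteredClassifier;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.Utils;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.Filter;
import weka.filters.supervised.instance.Resample;
import weka.filters.supervised.instance.SpreadSubsample;

// Hypothetical driver class for the filtered tests.
class FilteredEvaluationSketch {

    // Wraps J48 and the given filter in a FilteredClassifier and runs
    // 10-fold cross-validation. The data must already have its class index set.
    static Evaluation crossValidate(Instances data, Filter filter) throws Exception {
        FilteredClassifier fc = new FilteredClassifier();
        fc.setFilter(filter);
        fc.setClassifier(new J48());

        Evaluation evaluation = new Evaluation(data);
        evaluation.crossValidateModel(fc, data, 10, new Random(1));
        return evaluation;
    }

    public static void main(String[] args) throws Exception {
        // args[0] is assumed to be the path to the ARFF file
        Instances data = DataSource.read(args[0]);
        data.setClassIndex(data.numAttributes() - 1);

        // Test 2: undersampling
        SpreadSubsample spread = new SpreadSubsample();
        spread.setOptions(Utils.splitOptions("-M 1.0"));
        System.out.println(crossValidate(data, spread)
            .toSummaryString("\nTest 2: SpreadSubsample", false));

        // Test 3: oversampling
        Resample resample = new Resample();
        resample.setOptions(Utils.splitOptions("-B 1.0 -Z 130.3"));
        System.out.println(crossValidate(data, resample)
            .toSummaryString("\nTest 3: Resample", false));
    }
}
```

The reason for going through FilteredClassifier rather than filtering the whole dataset up front is that the filter is then fitted on the training folds only, so the test folds keep their original class distribution and the three tests stay comparable.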

Upvotes: 1
