Adam_G

Reputation: 7879

Weka ArffSaver Not Writing Data

I am attempting to replicate the CSV-to-ARFF instructions found here. My code is copied below. The resulting ARFF file has a correct attributes section, but there is nothing under the "@data" section:

Code:

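import java.io.File;

import weka.core.Instances;
import weka.core.converters.ArffSaver;
import weka.core.converters.CSVLoader;

public class CsvToArff {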
  /**
   * takes 2 arguments:
   * - CSV input file
   * - ARFF output file
   */
  public static void main(String[] args) throws Exception {
    if (args.length != 2) {
      System.out.println("\nUsage: CSV2Arff <input.csv> <output.arff>\n");
      System.exit(1);
    }

    // load CSV
    CSVLoader loader = new CSVLoader();
    loader.setFieldSeparator(";");
    loader.setNominalAttributes("2,5,8,10");
    loader.setNoHeaderRowPresent(false);
    loader.setSource(new File(args[0]));
    loader.getStructure();
    Instances data = loader.getDataSet();

    // save ARFF
    ArffSaver saver = new ArffSaver();
    saver.setInstances(data);
    saver.setFile(new File(args[1]));
    saver.setDestination(new File(args[1]));
    saver.writeBatch();
  }
}

CSV File:

PrevPause;PrevPOS;PrevLength;WordPause;WordPOS;WordLength;NextPause;NextPOS;NextLength;Location
625;"JJ";4;156;"NN";4;1234;"FW";1;"OUT"
156;"NN";4;1234;"FW";1;187;"NN";4;"OUT"
1234;"FW";1;187;"NN";4;188;"VBD";3;"OUT"

Resultant arff:

@relation mwe_pred_debug

@attribute PrevPause numeric
@attribute PrevPOS {JJ,NN,FW}
@attribute PrevLength numeric
@attribute WordPause numeric
@attribute WordPOS {NN,FW}
@attribute WordLength numeric
@attribute NextPause numeric
@attribute NextPOS {FW,NN,VBD}
@attribute NextLength numeric
@attribute Location {OUT}

@data

Any idea why the last section is blank?

Upvotes: 1

Views: 505

Answers (1)

Matthew Spencer

Reputation: 2295

It appears that setFieldSeparator(String) and setNoHeaderRowPresent(boolean) were added to CSVLoader only recently and are not available in the current stable version (3.6). This may be worth raising with the Weka development team.

As an alternative, you could change the semicolons to commas in your CSV and process the file as shown in the tutorial you linked. The sample in your question converted correctly for me using that data and the tutorial source.
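
If you would rather not edit the file by hand, the semicolon-to-comma step can be scripted. The following is only a rough sketch using the standard library: the class name SemicolonToComma and the method convertLine are made up for illustration, and it assumes UTF-8 input, double-quoted strings with no escaped quotes, and no commas inside unquoted fields (true for the sample above, but check your real data).

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of Weka: rewrites a semicolon-separated
// file as a comma-separated one so the default CSVLoader settings apply.
public class SemicolonToComma {

  /** Replaces every ';' that is outside double quotes with ','. */
  public static String convertLine(String line) {
    StringBuilder out = new StringBuilder(line.length());
    boolean inQuotes = false;
    for (int i = 0; i < line.length(); i++) {
      char c = line.charAt(i);
      if (c == '"') {
        inQuotes = !inQuotes; // assumes quotes are never escaped
      }
      out.append(c == ';' && !inQuotes ? ',' : c);
    }
    return out.toString();
  }

  public static void main(String[] args) throws IOException {
    if (args.length != 2) {
      System.out.println("Usage: SemicolonToComma <input.csv> <output.csv>");
      System.exit(1);
    }
    try (BufferedReader in =
             Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8);
         BufferedWriter out =
             Files.newBufferedWriter(Paths.get(args[1]), StandardCharsets.UTF_8)) {
      String line;
      while ((line = in.readLine()) != null) {
        out.write(convertLine(line));
        out.newLine();
      }
    }
  }
}
```

Run it on the original file first, then pass its output to the tutorial code unchanged. A semicolon inside a quoted value is left alone, so quoted text is not split into extra columns.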

Hope this helps!

Upvotes: 1
