Reputation: 7879
I am trying to reproduce the CSV-to-ARFF instructions found here. My code is copied below. The resulting ARFF file has a correct attributes section, but there is nothing under the "@data" section:
Code:
import java.io.File;

import weka.core.Instances;
import weka.core.converters.ArffSaver;
import weka.core.converters.CSVLoader;

public class CsvToArff {
    /**
     * takes 2 arguments:
     * - CSV input file
     * - ARFF output file
     */
    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.out.println("\nUsage: CsvToArff <input.csv> <output.arff>\n");
            System.exit(1);
        }
        // load CSV: semicolon-separated, header row present,
        // columns 2, 5, 8 and 10 treated as nominal
        CSVLoader loader = new CSVLoader();
        loader.setFieldSeparator(";");
        loader.setNominalAttributes("2,5,8,10");
        loader.setNoHeaderRowPresent(false);
        loader.setSource(new File(args[0]));
        loader.getStructure();
        Instances data = loader.getDataSet();
        // save ARFF
        ArffSaver saver = new ArffSaver();
        saver.setInstances(data);
        saver.setFile(new File(args[1]));
        saver.setDestination(new File(args[1]));
        saver.writeBatch();
    }
}
CSV File:
PrevPause;PrevPOS;PrevLength;WordPause;WordPOS;WordLength;NextPause;NextPOS;NextLength;Location
625;"JJ";4;156;"NN";4;1234;"FW";1;"OUT"
156;"NN";4;1234;"FW";1;187;"NN";4;"OUT"
1234;"FW";1;187;"NN";4;188;"VBD";3;"OUT"
Resulting ARFF:
@relation mwe_pred_debug
@attribute PrevPause numeric
@attribute PrevPOS {JJ,NN,FW}
@attribute PrevLength numeric
@attribute WordPause numeric
@attribute WordPOS {NN,FW}
@attribute WordLength numeric
@attribute NextPause numeric
@attribute NextPOS {FW,NN,VBD}
@attribute NextLength numeric
@attribute Location {OUT}
@data
Any idea why the last section is blank?
Upvotes: 1
Views: 505
Reputation: 2295
It appears that setFieldSeparator(String) and setNoHeaderRowPresent(boolean) were added to CSVLoader only recently and are not in the current stable version (3.6). Perhaps this is something that could be raised with the Weka development team.
As an alternative, you could change the semicolons to commas in your CSV file and process it as shown in the tutorial you linked. With that change, the data sample from your question appeared to convert correctly using the tutorial source.
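If you would rather not edit the file by hand, the separator swap can be scripted. Below is a minimal sketch in plain Java with no Weka dependency. The class name SemicolonToComma and its convertLine method are made up for illustration and are not part of Weka. It assumes that any semicolon inside double quotes belongs to the field and should be kept, and that no unquoted field contains a comma of its own.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper (not part of Weka): rewrites a semicolon-separated
// file as a comma-separated one so a comma-only CSV loader can read it.
class SemicolonToComma {

    // Replaces every ';' that sits outside double quotes with ','.
    // Assumption: unquoted fields never contain a literal comma.
    static String convertLine(String line) {
        StringBuilder out = new StringBuilder(line.length());
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes; // toggle on every quote character
            }
            out.append(c == ';' && !inQuotes ? ',' : c);
        }
        return out.toString();
    }

    // Usage: java SemicolonToComma <input.csv> <output.csv>
    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("Usage: SemicolonToComma <input.csv> <output.csv>");
            System.exit(1);
        }
        try (BufferedReader in = Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(Paths.get(args[1]), StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.write(convertLine(line));
                out.newLine();
            }
        }
    }
}
```

Run it on your input first, then pass the converted file to the tutorial code without the setFieldSeparator and setNoHeaderRowPresent calls.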
Hope this helps!
Upvotes: 1