Reputation: 960
Is there a package for loading .arff files into MATLAB? The .arff format is used by Weka for running machine learning algorithms.
Upvotes: 6
Views: 14883
Reputation: 2315
If the methods mentioned above do not work and you need the header information, load the ARFF file in Weka, choose the "Save as" option, and save the data in CSV format.
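Once the data has been exported, the CSV can be read back with its header row intact. The following is only a sketch: it assumes a MATLAB release that provides readtable (R2013b or later), and the file contents and column names are made up so that the snippet runs on its own.

```matlab
% Sketch: read a CSV that has a header row (as exported from Weka).
% The file written here is a hypothetical stand-in for the real export.
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'sepallength,sepalwidth,label\n');
fprintf(fid, '5.1,3.5,Iris-setosa\n');
fprintf(fid, '7.0,3.2,Iris-versicolor\n');
fclose(fid);

T = readtable(fname);                     % header row becomes the variable names
delete(fname);                            % clean up the temporary file

attributeNames = T.Properties.VariableNames;
X = T{:, 1:end-1};                        % numeric attributes as a matrix
classNames = T{:, end};                   % class labels as a cell array of strings
```

For a real export, replace the temporary file with the path of the CSV saved from Weka.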
Upvotes: -1
Reputation: 21
M = importdata('filename.arff');
This is very slow for large files, but it works (tested in MATLAB R2010b).
Upvotes: 2
Reputation: 341
If you only want to load a file stored in ARFF format into MATLAB and don't need any other functionality from Weka, remove the header part of the file (the attribute definitions), replace the class values with numeric equivalents, and save the result as a CSV file. You can then read it with MATLAB's built-in csvread function, which handles numeric data only. This way there is no need for a third-party package.
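The same idea can also be done without editing the file by hand. The sketch below skips everything up to the @data line and reads the rest with textscan. It assumes a dense (non-sparse) ARFF file with two numeric attributes, a nominal class in the last column, and no quoted commas; the toy file is hypothetical and is only written so that the snippet is self-contained.

```matlab
% Sketch: read the @data section of a simple ARFF file without Weka.
% Write a tiny example file (hypothetical data) so the snippet runs on its own.
fname = [tempname '.arff'];
fid = fopen(fname, 'w');
fprintf(fid, '@relation toy\n');
fprintf(fid, '@attribute x numeric\n');
fprintf(fid, '@attribute y numeric\n');
fprintf(fid, '@attribute class {a,b}\n');
fprintf(fid, '@data\n');
fprintf(fid, '1.0,2.0,a\n');
fprintf(fid, '3.5,4.5,b\n');
fclose(fid);

% Skip header lines until the @data marker has been consumed.
fid = fopen(fname, 'r');
tline = fgetl(fid);
while ischar(tline) && ~strncmpi(strtrim(tline), '@data', 5)
    tline = fgetl(fid);
end

% Read the remaining rows: two numeric columns and one string column.
% Lines starting with '%' are ARFF comments and are ignored.
C = textscan(fid, '%f %f %s', 'Delimiter', ',', 'CommentStyle', '%');
fclose(fid);
delete(fname);

X = [C{1} C{2}];                  % numeric attributes
[labels, ~, y] = unique(C{3});    % y holds 1-based numeric codes for the class
```

The format string has to match the attribute list of the actual file, so this only suits files whose layout is known in advance.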
Upvotes: 3
Reputation: 124543
Since Weka is a Java library, you can directly use the API it exposes to read ARFF files:
%## paths
WEKA_HOME = 'C:\Program Files\Weka-3-7';
javaaddpath([WEKA_HOME '\weka.jar']);
fName = [WEKA_HOME '\data\iris.arff'];
%## read file
loader = weka.core.converters.ArffLoader();
loader.setFile( java.io.File(fName) );
D = loader.getDataSet();
D.setClassIndex( D.numAttributes()-1 );
%## dataset
relationName = char(D.relationName);
numAttr = D.numAttributes;
numInst = D.numInstances;
%## attributes
%# attribute names
attributeNames = arrayfun(@(k) char(D.attribute(k).name), 0:numAttr-1, 'Uni',false);
%# attribute types
types = {'numeric' 'nominal' 'string' 'date' 'relational'};
attributeTypes = arrayfun(@(k) D.attribute(k-1).type, 1:numAttr);
attributeTypes = types(attributeTypes+1);
%# nominal attribute values
nominalValues = cell(numAttr,1);
for i=1:numAttr
    if strcmpi(attributeTypes{i},'nominal')
        nominalValues{i} = arrayfun(@(k) char(D.attribute(i-1).value(k-1)), ...
            1:D.attribute(i-1).numValues, 'Uni',false);
    end
end
%## instances
data = zeros(numInst,numAttr);
for i=1:numAttr
    data(:,i) = D.attributeToDoubleArray(i-1);
end
%## visualize data
parallelcoords(data(:,1:end-1), ...
'Group',nominalValues{end}(data(:,end)+1), ...
'Labels',attributeNames(1:end-1))
title(relationName)
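Note that the +1 in the 'Group' argument above is there because Weka stores nominal values as zero-based indices, while MATLAB indexing starts at 1. A minimal illustration of that mapping, using made-up codes rather than values read from a file:

```matlab
% Sketch: map Weka's zero-based nominal codes to their labels.
% classValues and codes are hypothetical stand-ins for nominalValues{end}
% and data(:,end) from the code above.
classValues = {'Iris-setosa', 'Iris-versicolor', 'Iris-virginica'};
codes = [0; 2; 1; 0];              % zero-based indices as returned by Weka
labels = classValues(codes + 1);   % shift by one for MATLAB's 1-based indexing
```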
You can even use Weka's functionality directly from MATLAB. For example:
%## classification
classifier = weka.classifiers.trees.J48();
classifier.buildClassifier( D );
fprintf('Classifier: %s %s\n%s', ...
char(classifier.getClass().getName()), ...
char(weka.core.Utils.joinOptions(classifier.getOptions())), ...
char(classifier.toString()) )
The resulting C4.5 decision tree:
Classifier: weka.classifiers.trees.J48 -C 0.25 -M 2
J48 pruned tree
------------------
petalwidth <= 0.6: Iris-setosa (50.0)
petalwidth > 0.6
| petalwidth <= 1.7
| | petallength <= 4.9: Iris-versicolor (48.0/1.0)
| | petallength > 4.9
| | | petalwidth <= 1.5: Iris-virginica (3.0)
| | | petalwidth > 1.5: Iris-versicolor (3.0/1.0)
| petalwidth > 1.7: Iris-virginica (46.0/1.0)
Number of Leaves : 5
Size of the tree : 9
Upvotes: 8
Reputation: 1518
Yes, there are a few MATLAB interfaces for Weka files on the MATLAB File Exchange. I normally use this one: http://www.mathworks.com/matlabcentral/fileexchange/21204-matlab-weka-interface which provides saveARFF() and loadARFF() functions.
Upvotes: 4
Reputation: 121057
Searching the MATLAB Central File Exchange reveals some possibilities. In particular, the results from Durga Lal Shrestha and Gerald Augusto Corzo Perez look promising, though I haven't tried either.
Upvotes: -1