Learner

Reputation: 960

How to load an .arff file into MATLAB

Is there a package for loading .arff files into MATLAB? The .arff format is used by Weka for running machine learning algorithms.

Upvotes: 6

Views: 14883

Answers (6)

akashrajkn

Reputation: 2315

If the methods mentioned in the other answers do not work and you need the header information, load the ARFF file in Weka, choose the "Save as" option, and save the data in CSV format.
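Once the CSV has been saved, reading it back in MATLAB might look like the sketch below. It is only an illustration: the sample rows are made up, the layout (four numeric columns followed by one text class column, with attribute names in the first row) is an assumption modeled on the iris data, and it assumes the export leaves the values unquoted.

```matlab
% Write a tiny stand-in for a CSV exported from Weka (made-up rows, iris-like layout).
csvName = [tempname '.csv'];
fid = fopen(csvName, 'w');
fprintf(fid, 'sepallength,sepalwidth,petallength,petalwidth,class\n');
fprintf(fid, '5.1,3.5,1.4,0.2,Iris-setosa\n');
fprintf(fid, '7.0,3.2,4.7,1.4,Iris-versicolor\n');
fprintf(fid, '6.3,3.3,6.0,2.5,Iris-virginica\n');
fclose(fid);

% Read the header row to recover the attribute names.
fid = fopen(csvName, 'r');
headerLine = fgetl(fid);
attributeNames = regexp(headerLine, ',', 'split');

% Read the remaining rows: four numeric columns, then one text column.
C = textscan(fid, '%f%f%f%f%s', 'Delimiter', ',');
fclose(fid);
delete(csvName);

X = [C{1:4}];                       % numeric attributes, one row per instance
[classNames, ~, y] = unique(C{5});  % y holds a numeric code for each class label
```

The format string has to match the column types of the actual file, so it needs adjusting for any dataset that is not four numbers plus a label.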

Upvotes: -1

Václav Gerla

Reputation: 21

M = importdata('filename.arff');

Very slow for large files, but it works (tested in MATLAB R2010b).

Upvotes: 2

user58419

Reputation: 341

If you only want to load a file stored in ARFF format into MATLAB and don't need any other functionality from Weka, just remove the header part of the file (the attribute definitions), replace the class values with numeric equivalents, and save the result as a CSV file. Then use MATLAB's built-in csvread function. This way there is no need to find a third-party package.
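A variation on the same idea is to skip the header in code rather than editing the file by hand. The sketch below does not use csvread; it reads line by line and parses each data row with sscanf. It assumes every attribute is numeric (class values already replaced by numbers), with no missing values ('?') and no sparse-format rows. The sample file and its contents are made up for illustration.

```matlab
% Write a tiny numeric-only ARFF file to parse (made-up data).
arffName = [tempname '.arff'];
fid = fopen(arffName, 'w');
fprintf(fid, '%% sample file\n@relation demo\n');
fprintf(fid, '@attribute x numeric\n@attribute y numeric\n@attribute label numeric\n');
fprintf(fid, '@data\n1.5,2.0,0\n3.25,4.0,1\n');
fclose(fid);

fid = fopen(arffName, 'r');
% Skip the header: everything up to and including the @data line.
txt = fgetl(fid);
while ischar(txt) && ~strncmpi(strtrim(txt), '@data', 5)
    txt = fgetl(fid);
end

% Parse each remaining non-empty, non-comment line as comma-separated numbers.
rows = {};
txt = fgetl(fid);
while ischar(txt)
    txt = strtrim(txt);
    if ~isempty(txt) && txt(1) ~= '%'
        rows{end+1} = sscanf(txt, '%f,').'; %#ok<AGROW>
    end
    txt = fgetl(fid);
end
fclose(fid);
delete(arffName);

data = vertcat(rows{:});   % one row per instance
```

For files with nominal or string attributes this is not enough, and going through the Weka API or a dedicated loader is the safer route.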

Upvotes: 3

Amro

Reputation: 124543

Since Weka is a Java library, you can directly use the API it exposes to read ARFF files:

%## paths
WEKA_HOME = 'C:\Program Files\Weka-3-7';
javaaddpath([WEKA_HOME '\weka.jar']);
fName = [WEKA_HOME '\data\iris.arff'];

%## read file
loader = weka.core.converters.ArffLoader();
loader.setFile( java.io.File(fName) );
D = loader.getDataSet();
D.setClassIndex( D.numAttributes()-1 );

%## dataset
relationName = char(D.relationName);
numAttr = D.numAttributes;
numInst = D.numInstances;

%## attributes
%# attribute names
attributeNames = arrayfun(@(k) char(D.attribute(k).name), 0:numAttr-1, 'Uni',false);

%# attribute types
types = {'numeric' 'nominal' 'string' 'date' 'relational'};
attributeTypes = arrayfun(@(k) D.attribute(k-1).type, 1:numAttr);
attributeTypes = types(attributeTypes+1);

%# nominal attribute values
nominalValues = cell(numAttr,1);
for i=1:numAttr
    if strcmpi(attributeTypes{i},'nominal')
        nominalValues{i} = arrayfun(@(k) char(D.attribute(i-1).value(k-1)), 1:D.attribute(i-1).numValues, 'Uni',false);
    end
end

%## instances
data = zeros(numInst,numAttr);
for i=1:numAttr
    data(:,i) = D.attributeToDoubleArray(i-1);
end

%## visualize data
parallelcoords(data(:,1:end-1), ...
    'Group',nominalValues{end}(data(:,end)+1), ...
    'Labels',attributeNames(1:end-1))
title(relationName)

[Figure: parallel coordinates plot]

You can even directly use its functionality from MATLAB. An example:

%## classification
classifier = weka.classifiers.trees.J48();
classifier.buildClassifier( D );
fprintf('Classifier: %s %s\n%s', ...
    char(classifier.getClass().getName()), ...
    char(weka.core.Utils.joinOptions(classifier.getOptions())), ...
    char(classifier.toString()) )

The output C4.5 decision tree:

Classifier: weka.classifiers.trees.J48 -C 0.25 -M 2
J48 pruned tree
------------------

petalwidth <= 0.6: Iris-setosa (50.0)
petalwidth > 0.6
|   petalwidth <= 1.7
|   |   petallength <= 4.9: Iris-versicolor (48.0/1.0)
|   |   petallength > 4.9
|   |   |   petalwidth <= 1.5: Iris-virginica (3.0)
|   |   |   petalwidth > 1.5: Iris-versicolor (3.0/1.0)
|   petalwidth > 1.7: Iris-virginica (46.0/1.0)

Number of Leaves  :     5

Size of the tree :  9

Upvotes: 8

Matteo De Felice

Reputation: 1518

Yes, there are a few MATLAB interfaces for WEKA files on the MATLAB File Exchange. I normally use this one: http://www.mathworks.com/matlabcentral/fileexchange/21204-matlab-weka-interface which provides saveARFF() and loadARFF() functions.

Upvotes: 4

Richie Cotton

Reputation: 121057

Searching the MATLAB Central File Exchange reveals some possibilities. In particular, the results from Durga Lal Shrestha and Gerald Augusto Corzo Perez look promising, though I haven't tried either.

Upvotes: -1
