Peeyush

Reputation: 6464

How to train an SVM in scikit-learn from training data in a CSV file

I have training data in a CSV file. In each row, the first element is the result (the target) and the remaining elements make up the feature vector.

I have been using Weka to train and test various algorithms on this training data. Now I want to reuse the trained model to make predictions for feature vectors that are not part of the training data, and I do not know how to do that. I think scikit-learn may let me do this. Any help would be appreciated.

Upvotes: 0

Views: 3740

Answers (1)

ogrisel

Reputation: 40159

Just load the CSV and slice it: the first column is the target and the remaining columns are the features. For instance, for a classification problem:

>>> import numpy as np
>>> from sklearn.ensemble import ExtraTreesClassifier

>>> data_train = np.loadtxt('data_train.csv', delimiter=',')
>>> X = data_train[:, 1:]
>>> y = data_train[:, 0].astype(int)  # np.int was removed in NumPy 1.24; use the builtin int
>>> clf = ExtraTreesClassifier(n_estimators=100).fit(X, y)

Then make predictions on the test data, which does not have the target label in the first column:

>>> data_test = np.loadtxt('data_test.csv', delimiter=',')
>>> print(clf.predict(data_test))
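Since the question asks specifically about an SVM and about reusing the trained model several times, here is a minimal sketch of the same slicing approach with `sklearn.svm.SVC`, with the fitted model saved to disk and loaded back later. The toy data, the file names, the linear kernel, and the use of `joblib` for persistence are all assumptions made for illustration, not something taken from the question. `ndmin=2` is passed to `np.loadtxt` so that a test file containing a single row still yields a 2-D array.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data: label in the first column, two features after it.
TRAIN_CSV = (
    "0,0.0,0.0\n0,1.0,0.5\n0,0.5,1.0\n"
    "1,10.0,10.0\n1,9.0,9.5\n1,9.5,9.0\n"
)
# Test rows carry features only (no label column).
TEST_CSV = "0.5,0.5\n9.5,9.5\n"

# Write the toy files to a temporary directory so the sketch is self-contained.
workdir = tempfile.mkdtemp()
train_path = os.path.join(workdir, "data_train.csv")
test_path = os.path.join(workdir, "data_test.csv")
model_path = os.path.join(workdir, "svm_model.joblib")
with open(train_path, "w") as f:
    f.write(TRAIN_CSV)
with open(test_path, "w") as f:
    f.write(TEST_CSV)

# Train once: slice off the label column, fit, and save the fitted model.
data_train = np.loadtxt(train_path, delimiter=",", ndmin=2)
X = data_train[:, 1:]
y = data_train[:, 0].astype(int)
clf = SVC(kernel="linear").fit(X, y)
joblib.dump(clf, model_path)

# Later (possibly in another process): load the model and predict.
restored = joblib.load(model_path)
data_test = np.loadtxt(test_path, delimiter=",", ndmin=2)
predictions = restored.predict(data_test)
print(predictions)
```

Once saved, the model file can be loaded as many times as needed without retraining. A file written by `joblib.dump` should only be loaded from a trusted source and with a compatible scikit-learn version.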

Upvotes: 5
