S Williams

Reputation: 151

Loading labels and data from csv into sklearn

I have a csv file with rows of classifications/labels followed by the data associated with them:

  cat, 0, 1, 45, 23, ...
  dog, 1, 5, 75, 23, ...
  cat, 3, 4, 63, 24, ...
  cat, 0, 1, 44, 23, ...
  dog, 7, 3, 25, 4, ...

How can I load the csv file into sklearn?

Edit: or do I need to replace the labels with number equivalents? I.e. dog = 1, cat = 2, etc.

Upvotes: 0

Views: 547

Answers (1)

Fortunato

Reputation: 567

* Edited based on Vivek's comment

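You could use pandas to read the file. scikit-learn classifiers accept string labels directly, so you do not need to replace them with numbers. Here is an example of feeding the data into a simple random forest classifier: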

import pandas as pd
from sklearn.ensemble import RandomForestClassifier

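# The file has no header row, so pass header=None; columns are then numbered 0, 1, 2, ...
# skipinitialspace=True drops the space that follows each comma.
data = pd.read_csv('/path/to/data', header=None, skipinitialspace=True)

Y = data[0]  # labels as a 1-D Series (data[[0]] would give a 2-D DataFrame)
X = data.drop(columns=[0])  # features: every column except the label column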

clf = RandomForestClassifier()
clf.fit(X, Y)
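
To try this without a file on disk, here is a minimal sketch that reads the sample rows from the question out of an in-memory string. The `csv_text` variable is a hypothetical stand-in for the real file, and the elided `...` columns are left out rather than guessed. It also shows `LabelEncoder`, in case you do want integer labels (for example, for a library that requires them); for scikit-learn classifiers that step is optional.

```python
import io

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for the csv file: only the columns shown in the question.
csv_text = """cat, 0, 1, 45, 23
dog, 1, 5, 75, 23
cat, 3, 4, 63, 24
cat, 0, 1, 44, 23
dog, 7, 3, 25, 4
"""

# header=None: the first row is data, not column names.
data = pd.read_csv(io.StringIO(csv_text), header=None, skipinitialspace=True)

y = data[0]                 # string labels: 'cat' / 'dog'
X = data.drop(columns=[0])  # numeric feature columns

# random_state is fixed only to make the sketch reproducible.
clf = RandomForestClassifier(n_estimators=10, random_state=0)
clf.fit(X, y)
print(clf.classes_)         # the classifier keeps the string labels

# Optional: map the labels to integers (classes are sorted alphabetically).
encoder = LabelEncoder()
y_encoded = encoder.fit_transform(y)
print(dict(zip(encoder.classes_, encoder.transform(encoder.classes_))))
```

`clf.predict(...)` then returns the original strings, and `encoder.inverse_transform(...)` converts integer labels back if you went the encoded route.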

Upvotes: 2
