giorgio79

Reputation: 4189

How to create a scikit-learn dataset?

I have an array where the first column contains the classes (in integer form), and the rest of the columns are features.

Something like this:

1,0,34,23,2
0,0,21,11,0
3,11,2,11,1

How can I turn this into a scikit-learn compatible dataset, so that I can call something like mydataset = datasets.load_mydataset()?

Upvotes: 2

Views: 8183

Answers (1)

Sagar Waghmode

Reputation: 777

You can simply use pandas. For example, if you have copied your dataset to a file named temp.csv, add a header row that labels the columns appropriately, then read it and split off the label column:

In [1]: import pandas as pd

In [2]: df = pd.read_csv('temp.csv')

In [3]: df
Out[3]: 
   Label  f1  f2  f3  f4
0      1   0  34  23   2
1      0   0  21  11   0
2      3  11   2  11   1

In [4]: y_train = df['Label']

In [5]: x_train = df.drop('Label', axis=1)

In [6]: x_train
Out[6]: 
   f1  f2  f3  f4
0   0  34  23   2
1   0  21  11   0
2  11   2  11   1

In [7]: y_train
Out[7]: 
0    1
1    0
2    3
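
If you want something closer to the datasets.load_mydataset() call from the question, one option is to wrap the same split in a small function that returns a Bunch, the container that scikit-learn's own loaders return. This is only a rough sketch: load_mydataset and the f1..f4 feature names are made up for illustration, and it assumes the data has no header row and that the first column is the class, as in the question.

```python
import io

import pandas as pd
from sklearn.utils import Bunch


def load_mydataset(source):
    """Hypothetical loader: first column is the class, the rest are features."""
    # header=None because the sample in the question has no header row
    df = pd.read_csv(source, header=None)
    target = df.iloc[:, 0].to_numpy()   # class labels
    data = df.iloc[:, 1:].to_numpy()    # feature matrix
    # Made-up feature names, numbered from 1
    feature_names = [f"f{i}" for i in range(1, data.shape[1] + 1)]
    return Bunch(data=data, target=target, feature_names=feature_names)


# The three rows from the question, used here in place of a real file path
sample = io.StringIO("1,0,34,23,2\n0,0,21,11,0\n3,11,2,11,1\n")
mydataset = load_mydataset(sample)
print(mydataset.data.shape)  # (3, 4)
print(mydataset.target)      # [1 0 3]
```

After that, mydataset.data and mydataset.target can be passed to an estimator's fit() the same way as with the built-in datasets. Passing a file path instead of the StringIO object should work too, since pd.read_csv accepts both.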

Upvotes: 5
