Reputation: 247

Fitting sklearn's SVM classifier with data from a pandas DataFrame

I'm trying to use an SVM, but I don't know how to fit the model when my data is in a pandas DataFrame. My data looks like this:

df = pd.DataFrame({"x": ['011', '100', '111'] , "y": [0,1,0]})
df.x = df.x.apply(lambda x: np.array(list(map(int,x))))
>>> df
    x           y
0   [0, 1, 1]   0
1   [1, 0, 0]   1
2   [1, 1, 1]   0

If I try to fit the model this way:

clf = svm.SVC().fit(df.x, df.y)

I am getting this error:

ValueError: setting an array element with a sequence.

What is the correct way to fit the SVM using this data frame?

Upvotes: 5

Answers (3)

Daksh Thakur

Reputation: 1

import numpy as np
from sklearn.svm import SVC

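# Stack the per-row arrays into a 2-D feature matrix of shape (n_samples, n_features).
# df['x'].to_numpy() on its own is a 1-D object array and raises the same ValueError.
features = np.stack(df['x'].to_numpy())
labels = df['y'].to_numpy()

# Feed both into the classifier
SVC().fit(features, labels)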
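
To see why a column of arrays trips up the classifier, it can help to compare shapes before and after stacking. This is only a small sketch that reuses the question's data; the names `raw` and `stacked` are illustrative and do not come from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": ["011", "100", "111"], "y": [0, 1, 0]})
df["x"] = df["x"].apply(lambda s: np.array(list(map(int, s))))

# The column is a 1-D object array: each element is itself an array
raw = df["x"].to_numpy()

# Stacking the elements gives a 2-D numeric array, one row per sample
stacked = np.stack(raw)

print(raw.dtype, raw.shape)
print(stacked.shape)
```

The stacked array has one row per sample and one column per feature, which is the layout `fit` expects for its first argument.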

Upvotes: 0

Keiku

Reputation: 8813

Another solution is to expand each string into one column per character, as in the code below.

import pandas as pd
import numpy as np

from sklearn.svm import SVC

df = pd.DataFrame({"x": ['011', '100', '111'] , "y": [0,1,0]})
# Convert each character to int so the resulting columns are numeric, not strings
x = df.x.apply(lambda s: pd.Series(list(map(int, s))))
x
# Out[2]:
#    0  1  2
# 0  0  1  1
# 1  1  0  0
# 2  1  1  1

SVC().fit(x, df.y)
# Out[3]:
# SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
#   decision_function_shape=None, degree=3, gamma='auto', kernel='rbf',
#   max_iter=-1, probability=False, random_state=None, shrinking=True,
#   tol=0.001, verbose=False)
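
The same wide layout can probably also be built with a single DataFrame constructor call instead of a per-row `apply`. The following is a sketch under that assumption; the column names `f0`, `f1`, `f2` and the variable names `rows` and `features` are made up for illustration.

```python
import pandas as pd
from sklearn.svm import SVC

df = pd.DataFrame({"x": ["011", "100", "111"], "y": [0, 1, 0]})

# One list of integers per sample
rows = [[int(ch) for ch in s] for s in df["x"]]

# Hypothetical column names; the index is kept aligned with the original frame
features = pd.DataFrame(rows, columns=["f0", "f1", "f2"], index=df.index)

clf = SVC().fit(features, df["y"])
```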

Upvotes: 2

cs95

Reputation: 402932

df = pd.DataFrame({"x": ['011', '100', '111'] , "y": [0,1,0]})
df.x = df.x.apply(lambda x: list(map(int,x)))

df
           x  y
0  [0, 1, 1]  0
1  [1, 0, 0]  1
2  [1, 1, 1]  0

df.x is a column of lists stored with object dtype, not a 2-D numeric array. This probably isn't the best way to store the data, and sklearn cannot interpret it directly, because it expects features with shape (n_samples, n_features). It is simpler to convert the column to a list of lists and pass that to SVC. Try this:

x = df.x.tolist()
print(x)
# [[0, 1, 1], [1, 0, 0], [1, 1, 1]]
SVC().fit(x, df.y)
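
For completeness, here is a minimal end-to-end sketch of the list-of-lists approach, including a prediction step. The `new_sample` value is invented for illustration, and with only three training rows the predicted label is not meaningful.

```python
import numpy as np
import pandas as pd
from sklearn.svm import SVC

df = pd.DataFrame({"x": ["011", "100", "111"], "y": [0, 1, 0]})
df["x"] = df["x"].apply(lambda s: list(map(int, s)))

# Convert the column of lists into a 2-D array, one row per sample
X = np.array(df["x"].tolist())
y = df["y"].to_numpy()

clf = SVC().fit(X, y)

# New data must use the same 2-D layout, even for a single sample
new_sample = np.array([[1, 0, 1]])
pred = clf.predict(new_sample)
print(pred)
```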

Upvotes: 7
