Reputation: 31
I am trying to train a classifier on my dataset by following the steps at the link. Link here
I keep getting errors like
ValueError: Expected 2D array, got 1D array instead: array=[0. 0. 0. and so on..].
This is the code that I am using to train and test with scikit-learn:
import numpy as np
from sklearn.svm import LinearSVC
import os
import cv2
import joblib
# Generate training set
TRAIN_PATH = 'Try/'
list_folder = os.listdir(TRAIN_PATH)
trainset = []
for folder in list_folder:
    flist = os.listdir(os.path.join(TRAIN_PATH, folder))
    for f in flist:
        im = cv2.imread(os.path.join(TRAIN_PATH, folder, f))
        im = cv2.cvtColor(im, cv2.COLOR_RGB2GRAY )
        im = cv2.resize(im, (36,36))
        trainset.append(im)
# Labeling for trainset
train_label = []
for i in range(0,1): #in the dataset i currently have a 0 folder and a 1 folder
    temp = 400*[i] #400 images in 1 folder
    train_label += temp
# Generate testing set
TEST_PATH = 'Test/'
list_folder = os.listdir(TEST_PATH)
testset = []
test_label = []
for folder in list_folder:
    flist = os.listdir(os.path.join(TEST_PATH, folder))
    for f in flist:
        im = cv2.imread(os.path.join(TEST_PATH, folder, f))
        im = cv2.cvtColor(im, cv2.COLOR_RGB2GRAY )
        im = cv2.resize(im, (36,36))
        testset.append(im)
        test_label.append(folder)
trainset = np.reshape(trainset, (800, -1)) #800 because total number of images
# Create a linear SVM object
clf = LinearSVC()
# Perform the training
clf.fit(train_label, testset)
print("Training finished successfully")
# Testing
testset = np.reshape(testset, (len(testset), -1))
y = clf.predict(testset)
print("Testing accuracy: " + str(clf.score(testset, test_label)))
This is the error message:
Traceback (most recent call last):
  File "C:\Users\hp\AppData\Local\Programs\Python\Python37-32\Thesis\Test.py", line 43, in <module>
    clf.fit(train_label, testset)
  File "C:\Users\hp\AppData\Local\Programs\Python\Python37-32\lib\site-packages\sklearn\svm\classes.py", line 229, in fit
    accept_large_sparse=False)
  File "C:\Users\hp\AppData\Local\Programs\Python\Python37-32\lib\site-packages\sklearn\utils\validation.py", line 756, in check_X_y
    estimator=estimator)
  File "C:\Users\hp\AppData\Local\Programs\Python\Python37-32\lib\site-packages\sklearn\utils\validation.py", line 552, in check_array
    "if it contains a single sample.".format(array))
ValueError: Expected 2D array, got 1D array instead: array=[0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]. Reshape your data either using array.reshape(-1, 1) if your data has a single feature or
array.reshape(1, -1) if it contains a single sample.
Upvotes: 1
Views: 907
Reputation: 88226
Hard to tell with so little information, but it most likely has to do with the arguments clf.fit() is receiving. It looks like you are passing the labels train_label first, in the position where the actual training data X is expected.
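To see why swapped arguments give this exact message, here is a minimal sketch using made-up random data (the array sizes and the names X, y and swapped_error are assumptions for illustration, not your images). scikit-learn validates the first argument of fit() as a 2D feature matrix, so a flat sequence of labels in that position is rejected:

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 20 fake flattened 36x36 images and their labels.
X = rng.random((20, 36 * 36))        # 2D: (n_samples, n_features)
y = np.array([0] * 10 + [1] * 10)    # 1D: (n_samples,)

clf = LinearSVC()
try:
    clf.fit(y, X)                    # wrong order: labels passed where X belongs
    swapped_error = None
except ValueError as exc:
    swapped_error = str(exc)         # keep the message for inspection

print(swapped_error)
```

The array printed in the message is the 1D label list, which is why your traceback shows a long run of zeros.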
If you have a look at the documentation, the order is as follows:
fit(X, y[, sample_weight]) Fit the SVM model according to the given training data.
Use instead, with the training features first and the labels second:
clf.fit(trainset, train_label)
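A fuller sketch of the corrected training step, assuming the intent is to fit on the training images and their labels. The random arrays below stand in for the cv2 loading loop, and the per-class count of 30 and the names X_train and y_train are placeholders, not values from your dataset. Each label is appended together with its image so the two always have the same length:

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Stand-in for the image loading loop: 2 classes, 30 fake 36x36 images each.
trainset, train_label = [], []
for label in range(2):                     # range(2) yields both 0 and 1
    for _ in range(30):
        im = rng.random((36, 36)) + label  # shift class 1 so the fake classes differ
        trainset.append(im)
        train_label.append(label)          # one label per image, added together

# Flatten each image into one row: (n_samples, n_features).
X_train = np.reshape(trainset, (len(trainset), -1))
y_train = np.array(train_label)

clf = LinearSVC()
clf.fit(X_train, y_train)                  # features first, labels second
print(X_train.shape, y_train.shape)
```

Two other things in the posted code look worth checking. range(0,1) only yields 0, so train_label ends up with 400 entries while trainset is reshaped to 800 rows. And test_label holds the folder names as strings, so they would probably need converting (for example with int(folder)) before being compared with integer predictions in clf.score().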
Upvotes: 2