Rainier Villanueva

Reputation: 31

scikit-learn: Expected 2D array, got 1D array instead

I am trying to train a model on my dataset by following the steps at the link. Link here

I keep getting errors like

ValueError: Expected 2D array, got 1D array instead: array=[0. 0. 0. and so on..]. 

This is the code I am using to train and test with scikit-learn:

import numpy as np
from sklearn.svm import LinearSVC
import os
import cv2
import joblib

# Generate training set
TRAIN_PATH = 'Try/'
list_folder = os.listdir(TRAIN_PATH)
trainset = []
for folder in list_folder:
    flist = os.listdir(os.path.join(TRAIN_PATH, folder))
    for f in flist:
        im = cv2.imread(os.path.join(TRAIN_PATH, folder, f))
        im = cv2.cvtColor(im, cv2.COLOR_RGB2GRAY )
        im = cv2.resize(im, (36,36))
        trainset.append(im)
# Labeling for trainset
train_label = []
for i in range(0,1): #in the dataset i currently have a 0 folder and a 1 folder
    temp = 400*[i] #400 images in 1 folder
    train_label += temp

# Generate testing set
TEST_PATH = 'Test/'
list_folder = os.listdir(TEST_PATH)
testset = []
test_label = []
for folder in list_folder:
    flist = os.listdir(os.path.join(TEST_PATH, folder))
    for f in flist:
        im = cv2.imread(os.path.join(TEST_PATH, folder, f))
        im = cv2.cvtColor(im, cv2.COLOR_RGB2GRAY )
        im = cv2.resize(im, (36,36))
        testset.append(im)
        test_label.append(folder)
trainset = np.reshape(trainset, (800, -1)) #800 because total number of images

# Create an linear SVM object
clf = LinearSVC()

# Perform the training
clf.fit(train_label, testset)
print("Training finished successfully")

# Testing
testset = np.reshape(testset, (len(testset), -1))
y = clf.predict(testset)
print("Testing accuracy: " + str(clf.score(testset, test_label)))

This is the error message:

Traceback (most recent call last):
  File "C:\Users\hp\AppData\Local\Programs\Python\Python37-32\Thesis\Test.py", line 43, in <module>
    clf.fit(train_label, testset)
  File "C:\Users\hp\AppData\Local\Programs\Python\Python37-32\lib\site-packages\sklearn\svm\classes.py", line 229, in fit
    accept_large_sparse=False)
  File "C:\Users\hp\AppData\Local\Programs\Python\Python37-32\lib\site-packages\sklearn\utils\validation.py", line 756, in check_X_y
    estimator=estimator)
  File "C:\Users\hp\AppData\Local\Programs\Python\Python37-32\lib\site-packages\sklearn\utils\validation.py", line 552, in check_array
    "if it contains a single sample.".format(array))
ValueError: Expected 2D array, got 1D array instead: array=[0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]. Reshape your data either using array.reshape(-1, 1) if your data has a single feature or
 array.reshape(1, -1) if it contains a single sample.

Upvotes: 1

Views: 907

Answers (1)

yatu

Reputation: 88226

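Hard to tell with so little information, but it most likely has to do with the arguments clf.fit() is receiving. You are passing the labels train_label first, where fit() expects the training data X. X must be a 2D array of shape (n_samples, n_features), and train_label is a 1D list, which is exactly what the error message complains about.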

If you have a look at the documentation, the order is as follows:

fit(X, y[, sample_weight]) Fit the SVM model according to the given training data.
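To make the expected shapes concrete, here is a minimal sketch that uses random arrays in place of the real images. The sample count and the helper name `to_feature_matrix` are illustrative assumptions, not taken from the question. The point is only that X is 2D with one row per image, and y is 1D with one label per row.

```python
import numpy as np

def to_feature_matrix(images):
    """Flatten equally sized 2D images into an (n_samples, n_features) array."""
    arr = np.asarray(images)
    return arr.reshape(len(arr), -1)

# Stand-ins for 8 grayscale 36x36 images (random data, for illustration only)
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(36, 36), dtype=np.uint8) for _ in range(8)]

X = to_feature_matrix(images)   # 2D: one row of 36*36 pixel values per image
y = np.repeat([0, 1], 4)        # 1D: one label per row of X

print(X.shape, y.shape)
```

If the two shapes do not agree on the number of samples, fit() will also reject the input, so checking them before training is a cheap sanity test.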

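Use instead (features first, then labels; the features should be the reshaped trainset, since train_label describes the training images):

clf.fit(trainset, train_label)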
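As a rough end-to-end sketch of the corrected call, the snippet below trains on synthetic "images" rather than the files from the question. It assumes the features given to fit() should be the flattened training images (trainset in the question), so that each row lines up with one entry of train_label. The class sizes and pixel ranges are made up for illustration. It also assumes two classes are intended: the loop `range(0,1)` in the question only yields label 0, and `range(2)` would be needed to label both folders.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_per_class = 20  # illustrative; the question mentions 400 images per folder

# Synthetic stand-ins: class 0 is dark, class 1 is bright (assumption for the demo)
dark = rng.integers(0, 100, size=(n_per_class, 36, 36))
bright = rng.integers(150, 256, size=(n_per_class, 36, 36))

# Flatten each 36x36 image into one row and scale pixel values to [0, 1]
trainset = np.concatenate([dark, bright]).reshape(2 * n_per_class, -1) / 255.0
train_label = np.repeat([0, 1], n_per_class)  # one label per row of trainset

clf = LinearSVC()
clf.fit(trainset, train_label)  # X (2D features) first, y (1D labels) second

predictions = clf.predict(trainset)
print("Training accuracy:", clf.score(trainset, train_label))
```

The test set would need the same flattening before predict() or score(), which the question's code already does with np.reshape(testset, (len(testset), -1)).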

Upvotes: 2
