lakshmi lakshmi

Reputation: 1

How to fix the incorrect dimension of a NumPy array

I am working on a binary image classification problem using supervised machine learning, with an SVM classifier. First I created a NumPy array of normalized color images in a variable X, whose shape is (17500, 32, 32, 3). After splitting the data, X_train has shape (14000, 32, 32, 3), so 4 dimensions, and y_train has shape (14000, 2), so 2 dimensions.

clf.fit(X_train,y_train)

After running this code I got a ValueError saying that an array of dimension 4 was found, but the estimator expects dimension <= 2.

Thanks in advance!

Upvotes: 0

Views: 510

Answers (2)

avr_dude

Reputation: 242

The technique is called dimensionality reduction: mapping data from a high-dimensional space into a lower-dimensional one. The most commonly used technique is Principal Component Analysis (PCA). You can learn about them through the following links:
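As a rough illustration of the PCA idea, here is a minimal sketch. Note that scikit-learn's PCA itself takes 2D input of shape (n_samples, n_features), so the images are flattened first. The random data, the sample count of 200, and the choice of 50 components are placeholders for illustration only, not values from the question.

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in for the real images: 200 random 32x32 RGB arrays (assumption).
rng = np.random.default_rng(0)
X = rng.random((200, 32, 32, 3))

# PCA expects a 2D (n_samples, n_features) array, so flatten each image.
X_flat = X.reshape(len(X), -1)  # shape (200, 3072)

# Project onto 50 principal components (an arbitrary choice for this sketch).
pca = PCA(n_components=50)
X_reduced = pca.fit_transform(X_flat)

print(X_flat.shape, X_reduced.shape)
```

The reduced array is 2D, so it can be passed to a classifier's fit method. Flattening alone already resolves the dimension error; PCA only reduces the number of features further.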

Upvotes: 0

shaivikochar

Reputation: 440

If you are using the scikit-learn SVM classifier, its fit function expects the training data as a 2D array of shape (n_samples, n_features).

The dataset you are passing in is a 4D array, so you need to reshape it into a 2D array.

Example:

from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# To apply a classifier, we need to flatten each image so that the
# data becomes a (n_samples, n_features) matrix.
# Assuming data is a NumPy array of shape (17500, 32, 32, 3), this converts it to shape (17500, 3072).
n_samples = len(data)
data_reshape = data.reshape((n_samples, -1))

# Split data into train and test subsets
X_train, X_test, y_train, y_test = train_test_split(data_reshape, labels, 
                                                    test_size=0.2)
# Create the classifier and fit it on the flattened training data
clf = SVC()
clf.fit(X_train, y_train)
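One more thing to check: the question reports y_train with shape (14000, 2), while SVC's fit expects a 1D array of class labels. If those two columns are a one-hot encoding (an assumption, since the question does not say), np.argmax along axis 1 recovers the class indices. A self-contained sketch with random stand-in data, where the sample count and the names class_ids and labels_one_hot are made up for illustration:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Stand-in data: 200 random 32x32 RGB images with one-hot labels (assumption).
rng = np.random.default_rng(0)
n_samples = 200
data = rng.random((n_samples, 32, 32, 3))
class_ids = rng.integers(0, 2, size=n_samples)
labels_one_hot = np.eye(2)[class_ids]  # shape (200, 2)

# Flatten images to (n_samples, n_features) and labels to (n_samples,).
X = data.reshape(n_samples, -1)
y = np.argmax(labels_one_hot, axis=1)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

clf = SVC()
clf.fit(X_train, y_train)
print(X_train.shape, y_train.shape)
```

With the real data, the same two reshaping steps would be applied to the (17500, 32, 32, 3) image array and its label array before splitting.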

Upvotes: 2
