Cerin

Reputation: 64820

scikit-learn fit() returning error "Expected 2D array, got 1D array instead"

I'm trying to use some scikit-learn classifiers to classify a dataset like:

X = [
    [2.2580403973917715e-06, 7.637149660560025e-07, 2.57156353293851e-07, 1.0411090477468541e-06, 2.966875927679557e-07, 2.8407977825637697e-08, 1.0829349857886486e-09, 6.361589208033874e-08],
    [6.270357208864301e-10, 1.4976703048023198e-08, 1.6851513665698196e-08, 2.8145617066687096e-08, 1.2313171858773658e-07, 1.3792673608349409e-07, 4.78989580237296e-07, 4.7187858402735833e-07],
    ...
]
y = ['A', 'B', ...]

So I'm doing:

from sklearn.neighbors import KNeighborsClassifier
clf = KNeighborsClassifier()
clf.fit(X, y)

but this is throwing the exception:

ValueError: Expected 2D array, got 1D array instead:
array=[list([3.6368636083818036e-14, 3.392522830189899e-12, 9.89834613253366e-15, 1.2248850677983348e-12, 1.3550523047368496e-15, 3.5267405094168456e-13, 1.0774177675897417e-15, 2.7373831147145047e-13])
 list([1.2702936994261183e-07, 1.1302366112093968e-08, 3.601247032753103e-07, 1.2610271278453322e-06, 3.8544108074754034e-07, 3.7388057888913323e-07, 1.0565146699133778e-06, 3.4632712456302423e-07])
 list([6.663127039198243e-09, 1.592855782829962e-08, 2.5189316052885216e-09, 8.71344955078468e-08, 1.560205602695966e-08, 1.8025334695989781e-07, 1.1457211528200937e-06, 5.950518479674942e-07])
 ...
 list([1.8681641214886276e-08, 2.205503463622159e-08, 9.327746218326714e-10, 3.1025040738394077e-09, 2.5949152371447647e-09, 1.512181130670229e-11, 5.786442161657287e-10, 1.4137420397921863e-10])
 list([1.173538500657531e-05, 2.0979757955014606e-05, 0.00041950915503583496, 0.0005116279917528409, 0.0003344955721068041, 0.0003088865534818541, 0.0009895311928082445, 0.0008824911646422847])
 list([2.8243472997407836e-06, 4.503534390316763e-07, 6.076586515867012e-07, 3.1616543548189114e-07, 3.145241529895121e-07, 1.7389540865566907e-07, 1.7234379120699387e-07, 3.603187075520089e-08])].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

What does this mean? This question/error has been posted here before, but I don't see why my code is triggering this error. My X is a 2D array, not a 1D array, so I don't understand what I'm supposed to reshape. My data is essentially the same as the example here.

What am I doing wrong?

Upvotes: 0

Views: 662

Answers (1)

Cerin

Reputation: 64820

The problem turned out to be that a single row at the end of my X array was not the same length as the other rows, which prevented sklearn/NumPy from converting X to a 2D NumPy array. Instead, it was converted to a 1D NumPy array of lists, resulting in the error.
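To illustrate the effect, here is a minimal sketch with made-up data (the values and the names X_ragged and X_even are hypothetical, not from my real dataset). It passes dtype=object explicitly because recent NumPy versions refuse to build an array from ragged lists without it, whereas older versions did the conversion silently, which is presumably what happened inside fit():

```python
import numpy as np

# Hypothetical data: the last row has 2 values instead of 3.
X_ragged = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0]]

# With ragged rows, NumPy can only build a 1D array whose elements are lists.
ragged_arr = np.array(X_ragged, dtype=object)
print(ragged_arr.shape)  # one dimension only

# With the short row removed, the same data converts to a regular 2D array.
X_even = X_ragged[:2]
even_arr = np.array(X_even, dtype=float)
print(even_arr.shape)  # rows x columns
```

The 1D shape of the first array is what the "Expected 2D array, got 1D array instead" message is complaining about, and the list([...]) entries in the error output are the giveaway.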

I stripped out the bad row, and that caused my X array to be converted to a proper 2D Numpy array.
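For anyone hitting the same thing, a rough sketch of how the odd rows could be located and dropped is below. The helper find_odd_rows and the sample data are hypothetical, and it assumes the most common row length is the correct one. The matching labels are dropped from y as well, so that X and y stay the same length:

```python
from collections import Counter

def find_odd_rows(X):
    """Return (expected_length, indices of rows with a different length)."""
    lengths = [len(row) for row in X]
    # Assumption: the most common row length is the intended one.
    expected = Counter(lengths).most_common(1)[0][0]
    bad = [i for i, n in enumerate(lengths) if n != expected]
    return expected, bad

# Hypothetical data: the last row is one value short.
X = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8]]
y = ['A', 'B', 'A']

expected, bad = find_odd_rows(X)
print(expected, bad)

# Keep only the well-formed rows and their labels.
bad_set = set(bad)
X_clean = [row for i, row in enumerate(X) if i not in bad_set]
y_clean = [label for i, label in enumerate(y) if i not in bad_set]
```

After this, X_clean and y_clean can be passed to clf.fit() as in the question.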

Upvotes: 0
