Claire_L

Reputation: 13

Python Numpy ValueError: Must pass 2-d input. shape=(2, 2, 2)

This is my numpy array:

array([[['a','c'], [1,3]],
       [['b','d'], [2, 4]]], dtype=object)

Expected dataframe:

column1      column2
['a','c']    [1, 3]
['b','d']    [2, 4]

I get this error with `pd.DataFrame((c), columns=['q','s'])`:

ValueError: Must pass 2-d input. shape=(2, 2, 2)

Upvotes: 0

Views: 6503

Answers (2)

Feodoran

Reputation: 1822

The problem is that you think of your numpy array as a 2-by-2 matrix with lists as its elements. What actually happens is that numpy converts the inner lists into an additional dimension, so the array ends up with shape (2, 2, 2).
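A quick way to confirm this is to inspect the array's shape. The snippet below is only an illustration built from the array in the question; the names `shape` and `first_cell` are my own.

```python
import numpy as np

# The array from the question: every inner list has the same length,
# so numpy unpacks them into a third axis even with dtype=object.
c = np.array([[['a', 'c'], [1, 3]],
              [['b', 'd'], [2, 4]]], dtype=object)

shape = c.shape        # three axes instead of the two that were intended
first_cell = c[0, 0]   # a 1-d object array, not a Python list
print(shape, type(first_cell))
```

Because `c` has three axes, pandas refuses it with the "Must pass 2-d input" error.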

I don't know if there is a more direct solution, but here is one way to trick numpy into the desired structure:

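>>> import numpy
>>> import pandas as pd
>>> # dtype=object is required for ragged input on NumPy 1.24 and later
>>> c=numpy.array([[['a', 'c'], [1]], [['b', 'd'], [2,4]]], dtype=object)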
>>> c
array([[list(['a', 'c']), list([1])],
       [list(['b', 'd']), list([2, 4])]], dtype=object)
>>> c[0,1].append(3)
>>> c
array([[list(['a', 'c']), list([1, 3])],
       [list(['b', 'd']), list([2, 4])]], dtype=object)
>>> pd.DataFrame(c, columns=['q','s'])
        q       s
0  [a, c]  [1, 3]
1  [b, d]  [2, 4]

In the first step, one of the inner lists has a different length, so numpy cannot stack the inner lists into a third dimension. Instead it returns the desired 2-dimensional array with lists as its elements. Next I simply append the missing value.
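For comparison, here is a sketch of a more explicit route, assuming the target shape is known in advance: allocate an empty 2-by-2 object array and fill it cell by cell, which gives numpy no chance to unpack the inner lists. The names `rows` and `cell` are hypothetical and only serve the example.

```python
import numpy as np
import pandas as pd

# Hypothetical input: the nested lists from the question.
rows = [[['a', 'c'], [1, 3]],
        [['b', 'd'], [2, 4]]]

# Pre-allocate a 2-d object array, then store each inner list as one cell.
c = np.empty((2, 2), dtype=object)
for i, row in enumerate(rows):
    for j, cell in enumerate(row):
        c[i, j] = cell  # scalar indexing stores the list itself

df = pd.DataFrame(c, columns=['q', 's'])
print(df)
```

This avoids the temporary ragged list, at the cost of an explicit loop.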

An alternative could be to simply skip the numpy array and pass the list of lists of lists directly to pandas:

>>> c=[[['a', 'c'], [1, 3]], [['b', 'd'], [2,4]]]
>>> pd.DataFrame(c, columns=['q','s'])
        q       s
0  [a, c]  [1, 3]
1  [b, d]  [2, 4]
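If the data already exists as a 3-dimensional array, as in the question, `ndarray.tolist()` should turn it back into nested Python lists that can be passed to pandas the same way. A minimal sketch, assuming the (2, 2, 2) object array from the question:

```python
import numpy as np
import pandas as pd

# The 3-d object array from the question.
c = np.array([[['a', 'c'], [1, 3]],
              [['b', 'd'], [2, 4]]], dtype=object)

# tolist() converts every axis back into plain Python lists,
# so pandas sees two rows that each hold two list-valued cells.
df = pd.DataFrame(c.tolist(), columns=['q', 's'])
print(df)
```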

Upvotes: 1

Chaos_Is_Harmony

Reputation: 516

EDIT 2:

import pandas as pd


a = pd.Series([['a','c'],[1,3]])
b = pd.Series([['b','d'],[2,4]])

df = pd.DataFrame([a,b])

print(df)


EDIT: Nope, you need those extra brackets, my bad. But if you do this, it runs without the error:

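import numpy as np
import pandas as pd

data = np.array([[['a','c'], [1,3]],
                 [['b','d'], [2, 4]]], dtype=object)

# data[0] is the 2-d slice [['a','c'], [1,3]], so only the first block is used
df = pd.DataFrame(data[0])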

Original:

I think you might have one set of square brackets too many and thus, it sees three dimensions instead of the two you're seeking.

You can try:

np.array([['a','c'], [1,3]],
         [['b','d'], [2, 4]], dtype=object)

Upvotes: 0
