Reputation: 13
This is my numpy array:
array([[['a','c'], [1,3]],
[['b','d'], [2, 4]]], dtype=object)
Expected dataframe:
column1 | column2 |
---|---|
['a','c'] | [1, 3] |
['b','d'] | [2, 4] |
Getting this error
ValueError: Must pass 2-d input. shape=(2, 2, 2)
with `pd.DataFrame(c, columns=['q','s'])`
Upvotes: 0
Views: 6503
Reputation: 1822
The problem is that you think of your numpy array as a 2-by-2 matrix whose elements are lists. What actually happens is that numpy converts these equal-length lists into an additional dimension, so the array ends up with shape (2, 2, 2).
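To make this concrete, here is a minimal sketch that rebuilds the array from the question and inspects its shape. The variable name `c` is taken from the question's `pd.DataFrame` call; the rest is only an illustration.

```python
import numpy as np

# Same data as in the question: every inner list has length 2,
# so NumPy treats the inner lists as a third axis.
c = np.array([[['a', 'c'], [1, 3]],
              [['b', 'd'], [2, 4]]], dtype=object)

print(c.shape)  # (2, 2, 2)
print(c.ndim)   # 3
```

Because the array is 3-dimensional, pandas rejects it with the "Must pass 2-d input" error shown in the question.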
I don't know if there is a more direct solution, but here is one way to trick numpy into the desired structure:
>>> c=numpy.array([[['a', 'c'], [1]], [['b', 'd'], [2,4]]], dtype=object)
>>> c
array([[list(['a', 'c']), list([1])],
[list(['b', 'd']), list([2, 4])]], dtype=object)
>>> c[0,1].append(3)
>>> c
array([[list(['a', 'c']), list([1, 3])],
[list(['b', 'd']), list([2, 4])]], dtype=object)
>>> pd.DataFrame(c, columns=['q','s'])
q s
0 [a, c] [1, 3]
1 [b, d] [2, 4]
In the first step, one of the inner lists has a different length, so numpy cannot build a regular 3-dimensional array and instead returns the desired 2-dimensional array with lists as its elements (recent numpy versions require an explicit dtype=object for such ragged input). Next, I simply append the missing value.
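A possibly more direct route, which is not part of the trick above and is offered only as a sketch, is to allocate an empty 2-D object array first and then fill it cell by cell. The names `rows` and `arr` are hypothetical.

```python
import numpy as np
import pandas as pd

rows = [[['a', 'c'], [1, 3]],
        [['b', 'd'], [2, 4]]]

# Allocate the 2-D object array up front, so NumPy never gets the
# chance to interpret the inner lists as an extra dimension.
arr = np.empty((2, 2), dtype=object)
for i, row in enumerate(rows):
    for j, cell in enumerate(row):
        arr[i, j] = cell  # store the list itself as a single element

df = pd.DataFrame(arr, columns=['q', 's'])
print(df)
```

This avoids the deliberately ragged input and the later `append`, at the cost of an explicit loop.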
An alternative could be to simply skip the numpy array and pass the list of lists of lists directly to pandas:
>>> c=[[['a', 'c'], [1, 3]], [['b', 'd'], [2,4]]]
>>> pd.DataFrame(c, columns=['q','s'])
q s
0 [a, c] [1, 3]
1 [b, d] [2, 4]
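If the 3-dimensional array already exists and cannot be rebuilt from the original lists, one option (my assumption, not something stated above) is to convert it back to nested lists with `ndarray.tolist()` and pass those to pandas, as in this short sketch:

```python
import numpy as np
import pandas as pd

# The 3-D object array from the question.
c = np.array([[['a', 'c'], [1, 3]],
              [['b', 'd'], [2, 4]]], dtype=object)

# tolist() yields a list of lists of lists, which pandas reads as
# two rows with two list-valued cells each.
df = pd.DataFrame(c.tolist(), columns=['q', 's'])
print(df)
```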
Upvotes: 1
Reputation: 516
EDIT 2:
import pandas as pd
a = pd.Series([['a','c'],[1,3]])
b = pd.Series([['b','d'],[2,4]])
df = pd.DataFrame([a,b])
print(df)
EDIT: Nope you need those extra brackets, my bad. But if you do this, it works:
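import numpy as np
data = np.array([[['a','c'], [1,3]],
[['b','d'], [2, 4]]], dtype=object)
# data[0] is the 2-D slice [['a','c'], [1,3]], so only that part ends up in the frame
df = pd.DataFrame(data[0])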
Original:
I think you might have one set of square brackets too many and thus, it sees three dimensions instead of the two you're seeking.
You can try:
np.array([['a','c'], [1,3]],
[['b','d'], [2, 4]], dtype=object)
Upvotes: 0