Reputation: 951
I have a list of NumPy arrays that I'm trying to convert to a DataFrame. Each array should be a row of the DataFrame.
Using pd.DataFrame() isn't working. It always gives the error: ValueError: Must pass 2-d input.
Is there a better way to do this?
This is my current code:
import numpy as np
import pandas as pd
list_arrays = [np.array([[0, 0, 0, 1, 0, 0, 0, 0, 0]], dtype='uint8'),
               np.array([[0, 0, 3, 2, 0, 0, 0, 0, 0]], dtype='uint8')]
d = pd.DataFrame(list_arrays)
ValueError: Must pass 2-d input
Upvotes: 23
Views: 77321
Reputation: 323226
You can use pd.Series:
pd.Series(list_arrays).apply(lambda x: pd.Series(x[0]))
Out[294]:
   0  1  2  3  4  5  6  7  8
0  0  0  0  1  0  0  0  0  0
1  0  0  3  2  0  0  0  0  0
Upvotes: 4
Reputation: 210832
Option 1:
In [143]: pd.DataFrame(np.concatenate(list_arrays))
Out[143]:
   0  1  2  3  4  5  6  7  8
0  0  0  0  1  0  0  0  0  0
1  0  0  3  2  0  0  0  0  0
Option 2:
In [144]: pd.DataFrame(list(map(np.ravel, list_arrays)))
Out[144]:
   0  1  2  3  4  5  6  7  8
0  0  0  0  1  0  0  0  0  0
1  0  0  3  2  0  0  0  0  0
Why do I get ValueError: Must pass 2-d input?
I think pd.DataFrame() tries to convert the list to a NumPy ndarray, like this:
In [148]: np.array(list_arrays)
Out[148]:
array([[[0, 0, 0, 1, 0, 0, 0, 0, 0]],
       [[0, 0, 3, 2, 0, 0, 0, 0, 0]]], dtype=uint8)
In [149]: np.array(list_arrays).shape
Out[149]: (2, 1, 9) # <----- NOTE: 3D array
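To make the shape issue concrete, here is a minimal sketch that compares the dimensions before and after joining the arrays. The names `stacked` and `flat` are my own, not from the question, and the data is just the question's example.

```python
import numpy as np
import pandas as pd

list_arrays = [np.array([[0, 0, 0, 1, 0, 0, 0, 0, 0]], dtype='uint8'),
               np.array([[0, 0, 3, 2, 0, 0, 0, 0, 0]], dtype='uint8')]

# Each element has shape (1, 9), so wrapping the list gives a 3-D array.
stacked = np.array(list_arrays)

# Concatenating along axis 0 joins the (1, 9) pieces into one 2-D array.
flat = np.concatenate(list_arrays)

print(stacked.shape)  # (2, 1, 9)
print(flat.shape)     # (2, 9)

# A 2-D array is something pd.DataFrame() accepts directly.
df = pd.DataFrame(flat)
print(df.shape)       # (2, 9)
```

Checking `.shape` (or `.ndim`) on the input is a quick way to tell whether you are about to hit this error.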
Upvotes: 28
Reputation: 294218
pd.DataFrame(sum(map(list, list_arrays), []))
   0  1  2  3  4  5  6  7  8
0  0  0  0  1  0  0  0  0  0
1  0  0  3  2  0  0  0  0  0
# Note: np.row_stack is an alias of np.vstack and is deprecated as of NumPy 2.0
pd.DataFrame(np.row_stack(list_arrays))
   0  1  2  3  4  5  6  7  8
0  0  0  0  1  0  0  0  0  0
1  0  0  3  2  0  0  0  0  0
Upvotes: 8
Reputation: 164623
Here is one way.
import numpy as np, pandas as pd
lst = [np.array([[0, 0, 0, 1, 0, 0, 0, 0, 0]], dtype=int),
np.array([[0, 0, 3, 2, 0, 0, 0, 0, 0]], dtype=int)]
df = pd.DataFrame(np.vstack(lst))
#    0  1  2  3  4  5  6  7  8
# 0  0  0  0  1  0  0  0  0  0
# 1  0  0  3  2  0  0  0  0  0
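If the list might mix shapes like (1, n) and (n,), a small wrapper can flatten each array first. This is only a sketch: `arrays_to_frame` and `mixed` are hypothetical names of mine, and it assumes every array holds the same number of elements.

```python
import numpy as np
import pandas as pd

def arrays_to_frame(arrays):
    """Hypothetical helper: build a DataFrame with one row per array."""
    # np.ravel flattens (1, n) and (n,) inputs alike to 1-D.
    rows = [np.ravel(a) for a in arrays]
    # np.vstack stacks the equal-length 1-D rows into a 2-D array.
    return pd.DataFrame(np.vstack(rows))

# One 2-D array and one 1-D array, both with four elements.
mixed = [np.array([[0, 0, 0, 1]]), np.array([0, 0, 3, 2])]
df = arrays_to_frame(mixed)
print(df.shape)  # (2, 4)
```

Arrays of different lengths would make np.vstack raise an error, so this only fits the case where every row has the same width.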
Upvotes: 3