Reputation: 23
I want to convert a pandas DataFrame to a 3d array, but I cannot get the actual shape of the 3d array:
df = pd.DataFrame(np.random.rand(4,5), columns = list('abcde'))
df['a'][3:]=1
df['a'][:3]=2
a3d = np.array(list(df.groupby('a').apply(pd.DataFrame.as_matrix)))
a3d.shape
(2,)
But when I set the values like this, I get the shape I expect:
df = pd.DataFrame(np.random.rand(4,5), columns = list('abcde'))
df['a'][2:]=1
df['a'][:2]=2
a3d = np.array(list(df.groupby('a').apply(pd.DataFrame.as_matrix)))
a3d.shape
(2,2,5)
Is there something wrong with the code? Thanks!
Upvotes: 2
Views: 2958
Reputation: 29635
Nothing is wrong with the code. In the first case you simply don't have a 3d array: by definition, an N-d array (here 3d) is rectangular, so every element along a given dimension must have the same size. In the first case:
df = pd.DataFrame(np.random.rand(4,5), columns = list('abcde'))
df['a'][3:]=1
df['a'][:3]=2
a3d = np.array(list(df.groupby('a').apply(pd.DataFrame.as_matrix)))
You have a 1-d array of size 2 (which is what a3d.shape shows you) containing two 2-d arrays of shapes (1,5) and (3,5):
a3d[0].shape
Out[173]: (1, 5)
a3d[1].shape
Out[174]: (3, 5)
so the two elements along the first dimension of what you call a3d do not have the same size, and their shapes can't be treated as further dimensions of this ndarray.
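If it helps, here is a minimal sketch of the unequal case. It assumes a recent pandas, where DataFrame.as_matrix has been removed and to_numpy() does the same job; the names blocks, shapes and ragged are just for illustration. The object array is filled element by element so the sketch does not depend on how a particular NumPy version handles ragged input to np.array.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(4, 5), columns=list("abcde"))
df["a"] = [2.0, 2.0, 2.0, 1.0]  # three rows in one group, one row in the other

# One 2-d block per group; groupby sorts the keys, so a == 1 comes first.
blocks = [group.to_numpy() for _, group in df.groupby("a")]
shapes = [b.shape for b in blocks]
print(shapes)  # [(1, 5), (3, 5)]

# The blocks have no common rectangular shape, so the most NumPy can hold
# is a 1-d container of objects (illustrative; filled one element at a time).
ragged = np.empty(len(blocks), dtype=object)
for i, block in enumerate(blocks):
    ragged[i] = block
print(ragged.shape)  # (2,)
```

Checking the list of per-group shapes like this is a quick way to see whether a 3d array is possible at all.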
While in the second case,
df = pd.DataFrame(np.random.rand(4,5), columns = list('abcde'))
df['a'][2:]=1
df['a'][:2]=2
a3d = np.array(list(df.groupby('a').apply(pd.DataFrame.as_matrix)))
a3d[0].shape
Out[176]: (2, 5)
a3d[1].shape
Out[177]: (2, 5)
both elements of your first dimension have the same shape, so a3d is a 3-d array.
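As a sketch of the equal case, you could also build the array with np.stack, which refuses to combine blocks of different shapes instead of silently giving you a 1-d result. This assumes a recent pandas with to_numpy(); stack_groups is a hypothetical helper name, not something from pandas or NumPy.

```python
import numpy as np
import pandas as pd

def stack_groups(df, key):
    """Stack one 2-d block per group into a 3-d array (hypothetical helper).

    np.stack raises ValueError when the groups differ in row count.
    """
    return np.stack([group.to_numpy() for _, group in df.groupby(key)])

df = pd.DataFrame(np.random.rand(4, 5), columns=list("abcde"))

df["a"] = [2.0, 2.0, 1.0, 1.0]  # two rows per group
a3d = stack_groups(df, "a")
print(a3d.shape)  # (2, 2, 5)

df["a"] = [2.0, 2.0, 2.0, 1.0]  # unequal groups, as in the first case
try:
    stack_groups(df, "a")
    stacked_ok = True
except ValueError:
    stacked_ok = False
print(stacked_ok)  # False
```

The explicit error makes the size mismatch visible right where the array is built.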
Upvotes: 1