user2567719

Reputation: 23

shape of pandas dataframe to 3d array

I want to convert a pandas DataFrame to a 3d array, but I cannot get the expected 3d shape:

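import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(4,5), columns = list('abcde'))
# split the rows into one group of 1 row and one group of 3 rows
df['a'][3:]=1
df['a'][:3]=2
# note: DataFrame.as_matrix was removed in pandas 1.0 (DataFrame.to_numpy replaces it)
a3d = np.array(list(df.groupby('a').apply(pd.DataFrame.as_matrix)))
a3d.shape
(2,)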

But when I split the rows like this instead, I do get a 3d shape:

df = pd.DataFrame(np.random.rand(4,5), columns = list('abcde'))
df['a'][2:]=1
df['a'][:2]=2
a3d = np.array(list(df.groupby('a').apply(pd.DataFrame.as_matrix)))
a3d.shape
(2,2,5)

Is there something wrong with the code? Thanks!

Upvotes: 2

Views: 2958

Answers (1)

Ben.T

Reputation: 29635

Nothing is wrong with the code; in the first case you simply don't have a 3d array. By the definition of an N-d array (here 3d), every element along a given dimension must have the same size. In the first case:

df = pd.DataFrame(np.random.rand(4,5), columns = list('abcde'))
df['a'][3:]=1
df['a'][:3]=2
a3d = np.array(list(df.groupby('a').apply(pd.DataFrame.as_matrix))) 

You have a 1-d array of size 2 (that is what a3d.shape shows you), which contains two 2-d arrays of shape (1,5) and (3,5):

a3d[0].shape
Out[173]: (1, 5)
a3d[1].shape
Out[174]: (3, 5)

So the two elements along the first dimension of what you call a3d do not have the same size, and their rows and columns cannot be treated as further dimensions of this ndarray.
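One way to spot this mismatch before building the array is to collect each group's block and compare the shapes. The following is only a minimal sketch, assuming pandas 0.24 or later (where DataFrame.to_numpy replaces the older as_matrix) and fixed data instead of np.random.rand; the names blocks, shapes and stacked_ok are illustrative and not part of the original code.

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for the random frame in the question (assumption).
df = pd.DataFrame(np.arange(20, dtype=float).reshape(4, 5), columns=list("abcde"))
# Same split as the first case: three rows with a == 2, one row with a == 1.
df["a"] = [2.0, 2.0, 2.0, 1.0]

# One 2-d block per group; groupby sorts the keys, so a == 1 comes first.
blocks = [group.to_numpy() for _, group in df.groupby("a")]
shapes = [block.shape for block in blocks]
print(shapes)

# np.stack refuses to combine blocks whose shapes differ.
try:
    np.stack(blocks)
    stacked_ok = True
except ValueError:
    stacked_ok = False
print(stacked_ok)
```

If the shapes in the list are not all identical, the groups cannot form a regular 3-d array.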

While in the second case,

df = pd.DataFrame(np.random.rand(4,5), columns = list('abcde'))
df['a'][2:]=1
df['a'][:2]=2
a3d = np.array(list(df.groupby('a').apply(pd.DataFrame.as_matrix)))

a3d[0].shape
Out[176]: (2, 5)
a3d[1].shape
Out[177]: (2, 5)

Both elements along the first dimension have the same size, so a3d is a 3-d array.
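For illustration, here is a hedged sketch of both situations using np.stack, which only succeeds when every block has the same shape. The helper pad_and_stack is hypothetical (it is not a pandas or NumPy function), and padding short groups with NaN rows is just one possible way to force a regular 3-d shape, assuming NaN is an acceptable filler for your data.

```python
import numpy as np
import pandas as pd


def pad_and_stack(blocks, fill_value=np.nan):
    """Hypothetical helper: pad 2-d blocks with filler rows to a common height, then stack."""
    max_rows = max(block.shape[0] for block in blocks)
    n_cols = blocks[0].shape[1]
    out = np.full((len(blocks), max_rows, n_cols), fill_value, dtype=float)
    for i, block in enumerate(blocks):
        out[i, : block.shape[0], :] = block
    return out


# Deterministic stand-in for the random frame in the question (assumption).
df = pd.DataFrame(np.arange(20, dtype=float).reshape(4, 5), columns=list("abcde"))

# Second case: two groups of two rows each, so the blocks stack cleanly.
df["a"] = [2.0, 2.0, 1.0, 1.0]
equal_blocks = [group.to_numpy() for _, group in df.groupby("a")]
a3d = np.stack(equal_blocks)
print(a3d.shape)

# First case: groups of 1 and 3 rows; pad the short group to get a 3-d array.
df["a"] = [2.0, 2.0, 2.0, 1.0]
ragged_blocks = [group.to_numpy() for _, group in df.groupby("a")]
padded = pad_and_stack(ragged_blocks)
print(padded.shape)
```

Whether padding is appropriate depends on what the downstream code does with the filler rows; keeping a plain list of 2-d arrays is often simpler when the groups genuinely differ in size.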

Upvotes: 1
