puk789

Reputation: 332

How to select all data from a DataFrame?

I want to select all the data inside a DataFrame (everything except the index, the column labels, and the right-most column; see the image below) and store it in a Series. This might be obvious, but I cannot get anything working. For example, I tried a = nai_data.ix[0:19], but it returns a new DataFrame, again with all the indices, and I need just a Series of the data. So I tried a = pd.Series(nai_data.ix[0:19]), but that didn't help either. I am sure there must be a simple way to do this, but I can't find it. Any help appreciated.

Upvotes: 0

Views: 304

Answers (1)

unutbu

Reputation: 880199

Perhaps you are looking for stack(), which can be thought of as moving the column index into the row index:

In [10]: import numpy as np

In [11]: import pandas as pd

In [12]: np.random.seed(2015)

In [13]: df = pd.DataFrame(np.random.randint(10, size=(3,4)))

In [14]: df
Out[14]: 
   0  1  2  3
0  2  2  9  6
1  8  5  7  8
2  0  6  7  8

In [15]: df.stack()
Out[15]: 
0  0    2
   1    2
   2    9
   3    6
1  0    8
   1    5
   2    7
   3    8
2  0    0
   1    6
   2    7
   3    8
dtype: int64
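
To make the "column index moves into the row index" idea concrete, here is a small sketch. It assumes the same values as the frame above, typed in by hand rather than generated randomly, and the variable name stacked is only for illustration. Each element of the stacked Series is addressed by a (row label, column label) pair:

```python
import pandas as pd

# Same values as the example frame above, entered literally
# (an assumption made so the sketch does not depend on the random seed).
df = pd.DataFrame([[2, 2, 9, 6],
                   [8, 5, 7, 8],
                   [0, 6, 7, 8]])

stacked = df.stack()

# The stacked Series has a two-level index: (row label, column label).
print(stacked.index.nlevels)

# Looking up (row 1, column 2) gives the same value as df.loc[1, 2].
print(stacked.loc[(1, 2)], df.loc[1, 2])
```

The row-by-row ordering of the result follows from the row label being the outer index level.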

If you don't want the MultiIndex, call reset_index():

In [16]: df.stack().reset_index(drop=True)
Out[16]: 
0     2
1     2
2     9
3     6
4     8
5     5
6     7
7     8
8     0
9     6
10    7
11    8
dtype: int64

To select all but the last column, you could use df.iloc:

In [17]: df.iloc[:, :-1]
Out[17]: 
   0  1  2
0  2  2  9
1  8  5  7
2  0  6  7

In [18]: df.iloc[:, :-1].stack()
Out[18]: 
0  0    2
   1    2
   2    9
1  0    8
   1    5
   2    7
2  0    0
   1    6
   2    7
dtype: int64
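
Putting the pieces together, one possible way to get the flat Series the question asks for is to chain the three calls. This is only a sketch: the helper name flatten_without_last_column is hypothetical, and the frame is again entered by hand with the values shown above.

```python
import pandas as pd


def flatten_without_last_column(frame: pd.DataFrame) -> pd.Series:
    """Return every value except the last column as a flat Series, row by row.

    Hypothetical helper for illustration; it simply chains the calls shown above.
    """
    return frame.iloc[:, :-1].stack().reset_index(drop=True)


# Same values as the example frame above, entered literally.
df = pd.DataFrame([[2, 2, 9, 6],
                   [8, 5, 7, 8],
                   [0, 6, 7, 8]])

flat = flatten_without_last_column(df)
print(flat.tolist())  # [2, 2, 9, 8, 5, 7, 0, 6, 7]
```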

Another way would be to slice and flatten the underlying NumPy array:

In [21]: df.values
Out[21]: 
array([[2, 2, 9, 6],
       [8, 5, 7, 8],
       [0, 6, 7, 8]])

In [22]: df.values[:, :-1]
Out[22]: 
array([[2, 2, 9],
       [8, 5, 7],
       [0, 6, 7]])

In [23]: df.values[:, :-1].ravel()
Out[23]: array([2, 2, 9, 8, 5, 7, 0, 6, 7])

and then just build the Series using this data:

In [24]: pd.Series(df.values[:, :-1].ravel())
Out[24]: 
0    2
1    2
2    9
3    8
4    5
5    7
6    0
7    6
8    7
dtype: int64
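
As a rough cross-check, the sketch below compares the NumPy route with the stack() route on the same hand-entered values. It uses df.to_numpy(), which the pandas documentation recommends over df.values, and also shows ravel(order="F") in case a column-by-column ordering is wanted instead; the variable names are illustrative only.

```python
import pandas as pd

# Same values as the example frame above, entered literally.
df = pd.DataFrame([[2, 2, 9, 6],
                   [8, 5, 7, 8],
                   [0, 6, 7, 8]])

# Drop the last column on the underlying array.
values = df.to_numpy()[:, :-1]

# Default ravel() is row-major: it walks the array row by row.
row_major = pd.Series(values.ravel())

# order="F" walks the array column by column instead.
col_major = pd.Series(values.ravel(order="F"))

# The stack()-based route for comparison.
via_stack = df.iloc[:, :-1].stack().reset_index(drop=True)

print(row_major.tolist())  # [2, 2, 9, 8, 5, 7, 0, 6, 7]
print(col_major.tolist())  # [2, 8, 0, 2, 5, 6, 9, 7, 7]
print(row_major.tolist() == via_stack.tolist())  # True
```

For a frame with no missing values, as here, the two routes give the same numbers in the same order; the NumPy route simply skips building the intermediate MultiIndex.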

Upvotes: 1
