I have a Pandas DataFrame that looks like the following:
data
date          signal
2012-11-01 a    0.04
           b    0.03
2012-12-01 a   -0.01
           b    0.00
2013-01-01 a   -0.00
           b   -0.01
I am trying to select only the rows for the last value of the first level of the MultiIndex, which is date in this case:
2013-01-01 a   -0.00
           b   -0.01
The first index level is a datetime. What would be the most elegant way to select these last rows?
Upvotes: 5
Views: 7336
Reputation: 375585
One way is to access the MultiIndex's levels directly (and use the last one):
In [11]: df.index.levels
Out[11]: [Index([bar, baz, foo, qux], dtype=object), Index([one, two], dtype=object)]
In [12]: df.index.levels[0][-1]
Out[12]: 'qux'
And select these rows with ix:
In [13]: df.ix[df.index.levels[0][-1]]
Out[13]:
            0         1         2         3
one  1.225973 -0.703952  0.265889  1.069345
two -1.521503  0.024696  0.109501 -1.584634
In [14]: df.ix[df.index.levels[0][-1]:]
Out[14]:
                0         1         2         3
qux one  1.225973 -0.703952  0.265889  1.069345
    two -1.521503  0.024696  0.109501 -1.584634
(Using @Jeff's example DataFrame.)
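Note that ix was deprecated in pandas 0.20 and removed in 1.0, so on current versions the same selection would be written with loc. Below is a minimal sketch, assuming a frame rebuilt from the values in the question; the names last_date, last_rows and last_rows_with_date are illustrative only.

```python
import pandas as pd

# Rebuild the frame from the question: a (date, letter) MultiIndex
# with a single "signal" column.
index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2012-11-01", "2012-12-01", "2013-01-01"]), ["a", "b"]],
    names=["date", None],
)
data = pd.DataFrame(
    {"signal": [0.04, 0.03, -0.01, 0.00, -0.00, -0.01]}, index=index
)

# Levels built this way are stored sorted, so the last entry of
# level 0 is the latest date.
last_date = data.index.levels[0][-1]

# A scalar key drops the date level from the result ...
last_rows = data.loc[last_date]
# ... while a one-element list keeps the full MultiIndex.
last_rows_with_date = data.loc[[last_date]]
```

Whether to pass the key as a scalar or as a one-element list depends on whether the date should remain in the index of the result.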
Perhaps a more elegant way is to use tail (if you know there will always be two rows in the last group):
In [15]: df.tail(2)
Out[15]:
                0         1         2         3
qux one  1.225973 -0.703952  0.265889  1.069345
    two -1.521503  0.024696  0.109501 -1.584634
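If the size of the last group is not known in advance, tail(2) will return too few or too many rows. One possible alternative is a boolean mask on the first index level. This is only a sketch: the frame below is a made-up variant of the example with three qux rows, and first_level and last_group are illustrative names.

```python
import numpy as np
import pandas as pd

# Example frame where the last group ("qux") has three rows, not two.
index = pd.MultiIndex.from_arrays([
    ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux", "qux"],
    ["one", "two", "one", "two", "one", "two", "one", "two", "three"],
])
df = pd.DataFrame(np.arange(36).reshape(9, 4), index=index)

# First-level label of every row, in row order.
first_level = df.index.get_level_values(0)

# Keep every row whose first-level label matches that of the final row.
last_group = df[first_level == first_level[-1]]
```

This keeps the full MultiIndex on the result and does not depend on the group size.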
Upvotes: 9
Reputation: 17570
If you have a DataFrame df with a MultiIndex already defined, then:
df2 = df.ix[df.index[len(df.index)-1][0]]
would also work.
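Reading the key from the last index entry can give a different answer from reading it from index.levels: a MultiIndex keeps unused level values after rows are filtered out, so the last level entry may no longer be present in the data. The sketch below illustrates this on a sliced copy of the example frame (sub, stale_key and actual_key are illustrative names), using loc and negative indexing in place of ix and len(df.index)-1.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_arrays([
    ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
    ["one", "two", "one", "two", "one", "two", "one", "two"],
])
df = pd.DataFrame(np.arange(32).reshape(8, 4), index=index)

# Drop the "qux" rows; the levels still list "qux" as an unused entry.
sub = df.iloc[:6]
stale_key = sub.index.levels[0][-1]

# The last index entry is a tuple; its first element is the
# first-level label of the final row actually present.
actual_key = sub.index[-1][0]

last_rows = sub.loc[actual_key]
```

Calling remove_unused_levels() on the index is another way to bring the levels back in line with the data.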
Upvotes: 0
Reputation: 128988
In 0.11 (coming this week), this is a reasonable way to do it:
In [50]: arrays = [np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux']),
.....: np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two'])]
In [51]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays)
In [52]: df
Out[52]:
                0         1         2         3
bar one -1.798562  0.852583 -0.148094 -2.107990
    two -1.091486 -0.748130  0.519758  2.621751
baz one -1.257548  0.210936 -0.338363 -0.141486
    two -0.810674  0.323798 -0.030920 -0.510224
foo one -0.427309  0.933469 -1.259559 -0.771702
    two -2.060524  0.795388 -1.458060 -1.762406
qux one -0.574841  0.023691 -1.567137  0.462715
    two  0.936323  0.346049 -0.709112  0.045066
In [53]: df.loc['qux'].iloc[[-1]]
Out[53]:
            0         1         2         3
two  0.936323  0.346049 -0.709112  0.045066
This will work in 0.10.1:
In [63]: df.ix['qux'].ix[-1]
Out[63]:
0    0.936323
1    0.346049
2   -0.709112
3    0.045066
Name: two, dtype: float64
Another way, which also works in 0.10.1:
In [59]: df.xs(('qux','two'))
Out[59]:
0    0.936323
1    0.346049
2   -0.709112
3    0.045066
Name: (qux, two), dtype: float64
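The xs call above selects a single row by its full key. If the whole last group is wanted instead, as in the question, xs also accepts a first-level key on its own. A rough sketch, with integer data standing in for the random values and last_key, group and row as illustrative names:

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_arrays([
    ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
    ["one", "two", "one", "two", "one", "two", "one", "two"],
])
df = pd.DataFrame(np.arange(32).reshape(8, 4), index=index)

# First-level label of the final row.
last_key = df.index[-1][0]

# All rows under that label; drop_level=False keeps both index levels.
group = df.xs(last_key, level=0, drop_level=False)

# A full (level 0, level 1) key returns one row as a Series.
row = df.xs(("qux", "two"))
```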
Upvotes: 2