Reputation: 1161
I have been trying to figure out the best way to work with a MultiIndex, especially when I want to select values by a label on its second level. For example:
import numpy as np
import pandas as pd
df = pd.DataFrame([np.random.randint(3,size=20),np.random.rand(20)]).T
df.columns = ['A','B']
g = df.groupby('A').describe()
Let's say I'm trying to look at the mean values of this output. I can do something like this:
idx = [a[1] == 'mean' for a in g.index.tolist()]
g.loc[idx,:]
It works, but there must be a better way to do this. Is there a cleaner way to select rows by the second level of a MultiIndex?
Upvotes: 2
Views: 3908
Reputation: 1827
You should read over the pandas documentation on MultiIndex data frames. pd.IndexSlice is the way to handle this. Something like this should work:
import pandas as pd
idx = pd.IndexSlice
g.loc[idx[:,"mean"],:]
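To make the behaviour easy to check, here is a minimal self-contained sketch. The frame is a hand-built stand-in for the describe() output, assuming the statistics sit on the second row level; the values and the level name "stat" are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the grouped output: rows indexed by (A, statistic).
# The numbers and the level name "stat" are illustrative assumptions.
index = pd.MultiIndex.from_product(
    [[0.0, 1.0, 2.0], ["count", "mean", "std"]], names=["A", "stat"]
)
g = pd.DataFrame({"B": [7, 0.38, 0.29, 6, 0.45, 0.31, 7, 0.50, 0.27]}, index=index)

idx = pd.IndexSlice
# Every label on the first level, only "mean" on the second level.
means = g.loc[idx[:, "mean"], :]
print(means)
```

Note that the result keeps both index levels, so each row is still labelled (A, "mean").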
Upvotes: 3
Reputation: 7994
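You can also do this when the statistics sit in the column MultiIndex (as in the output below), selecting the column by its (column, statistic) tuple: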
g.loc[:, ("B", "mean")]
A
0.0 0.381882
1.0 0.450356
2.0 0.497692
Name: (B, mean), dtype: float64
Check out "Advanced indexing with hierarchical index" in the pandas documentation.
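A small sketch of the column-level case, assuming a frame whose columns are a two-level MultiIndex. It uses agg with a list of functions rather than describe() so the layout does not depend on the pandas version; the data is made up.

```python
import pandas as pd

# Illustrative data, not taken from the question.
df = pd.DataFrame({"A": [0, 0, 1, 1, 2, 2],
                   "B": [0.25, 0.75, 0.5, 1.0, 0.0, 0.5]})

# Aggregating with a list of functions yields columns (B, mean) and (B, max).
g = df.groupby("A").agg(["mean", "max"])

# Select one column by its full tuple key: returns a Series.
one_col = g.loc[:, ("B", "mean")]

# Select by the second column level only: returns a DataFrame with column "B".
by_level = g.xs("mean", axis=1, level=1)
print(one_col)
print(by_level)
```

The tuple form needs the full key; the xs form picks up "mean" for every top-level column, which is handy when more than one column was aggregated.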
Upvotes: 1
Reputation: 1161
I found a couple of easy answers:
g.xs('mean', level=1)
Another one:
idx = pd.IndexSlice
g.loc[idx[:,'mean'],:]
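The two differ in what the result's index looks like. A quick sketch on a hand-built frame (values and the level name "stat" are made up, and it assumes the statistics are on the second row level):

```python
import pandas as pd

# Hypothetical stand-in for the grouped output: rows indexed by (A, statistic).
index = pd.MultiIndex.from_product(
    [[0.0, 1.0, 2.0], ["count", "mean", "std"]], names=["A", "stat"]
)
g = pd.DataFrame({"B": [7, 0.38, 0.29, 6, 0.45, 0.31, 7, 0.50, 0.27]}, index=index)

# xs drops the selected level by default, leaving a plain index on A.
dropped = g.xs("mean", level=1)

# drop_level=False keeps the (A, "mean") labels, like the IndexSlice version.
kept = g.xs("mean", level=1, drop_level=False)
print(dropped)
print(kept)
```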
Upvotes: 4
Reputation: 57033
You can swap the order of the levels in the MultiIndex:
g.reorder_levels([1,0]).loc['mean']
#             B
# A
# 0.0  0.515745
# 1.0  0.451534
# 2.0  0.483014
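For a two-level index, swaplevel() should give the same result. A minimal sketch on a hand-built frame (made-up values, hypothetical level name "stat", statistics assumed to be on the second row level):

```python
import pandas as pd

# Hypothetical stand-in for the grouped output: rows indexed by (A, statistic).
index = pd.MultiIndex.from_product(
    [[0.0, 1.0, 2.0], ["count", "mean", "std"]], names=["A", "stat"]
)
g = pd.DataFrame({"B": [7, 0.38, 0.29, 6, 0.45, 0.31, 7, 0.50, 0.27]}, index=index)

# Move the statistic to the outer level, then select it with a plain label.
reordered = g.reorder_levels(["stat", "A"]).loc["mean"]

# With only two levels, swaplevel() with no arguments does the same swap.
swapped = g.swaplevel().loc["mean"]
print(swapped)
```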
Upvotes: 1