cs95

Reputation: 402523

Filtering rows by a particular index level in a MultiIndex dataframe

Given a MultiIndex dataframe:

import numpy as np
import pandas as pd

mux = pd.MultiIndex.from_arrays([
    list('aaaabbbbbccdddddd'),
    list('tuvwlmnopxyfghijk')
], names=['one', 'two'])

df = pd.DataFrame({'col': np.arange(len(mux))}, mux)

df

         col
one two     
a   t      0
    u      1
    v      2
    w      3
b   l      4
    m      5
    n      6
    o      7
    p      8
c   x      9
    y     10
d   f     11
    g     12
    h     13
    i     14
    j     15
    k     16

Is it possible to keep only the rows corresponding to the first i values of the 0th level of the dataframe?

For i = 2, my expected output is:

         col
one two     
a   t      0
    u      1
    v      2
    w      3
b   l      4
    m      5
    n      6
    o      7
    p      8

Note that only rows pertaining to a and b are retained, everything else is dropped. I hope the problem is clear, but if it isn't, please feel free to ask for clarifications.

I tried:

idx = pd.IndexSlice
df.iloc[(idx[:2], slice(None))]

But that gives me only the first two rows in the entire df, not all rows of the first two values in the 0th level.

Upvotes: 3

Views: 1758

Answers (1)

johnchase

Reputation: 13705

One way to go about this is to return the index values for the 0th level and then index into the original data frame with those:

df.loc[df.index.levels[0][:2].values]

         col
one two     
a   t      0
    u      1
    v      2
    w      3
b   l      4
    m      5
    n      6
    o      7
    p      8

As mentioned in the comments, this works only for the 0th level and not the 1st. There may be a more generalizable solution that works with other levels.
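One possible way to generalize to any level is to build a boolean mask from that level's labels instead of passing labels to .loc. The sketch below is only an illustration: keep_first_values is a hypothetical helper name, and it assumes "first i values" means the first i distinct labels in order of appearance (index.levels, by contrast, holds the sorted unique labels, which happens to be the same order for this example).

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
mux = pd.MultiIndex.from_arrays([
    list('aaaabbbbbccdddddd'),
    list('tuvwlmnopxyfghijk')
], names=['one', 'two'])
df = pd.DataFrame({'col': np.arange(len(mux))}, mux)


def keep_first_values(frame, level, i):
    """Hypothetical helper: keep rows whose label on `level` is among
    the first `i` distinct labels, in order of appearance."""
    labels = frame.index.get_level_values(level)
    first = labels.unique()[:i]           # Index.unique() preserves order
    return frame[labels.isin(first)]      # boolean mask over the rows


by_one = keep_first_values(df, 'one', 2)  # rows for 'a' and 'b'
by_two = keep_first_values(df, 'two', 3)  # rows for 't', 'u' and 'v'
```

Because the mask is computed from get_level_values, the same call should work for either level, by name or by position (for example level=1 instead of 'two').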

Upvotes: 1
