Dropping redundant levels from a pandas multiindex

Question

I have a pandas DataFrame with a MultiIndex that is filtered (interactively). The resulting filtered frame has redundant levels in the index, i.e. levels where every entry has the same value.

Is there a way to drop these levels from the index?

Having a data frame like:

>>> df = pd.DataFrame([[1,2],[3,4]], columns=["a", "b"], index=pd.MultiIndex.from_tuples([("a", "b", "c"), ("d", "b", "e")], names=["one", "two", "three"]))
>>> df
               a  b
one two three
a   b   c      1  2
d   b   e      3  4

I would like to drop level "two", but without naming it explicitly, since I don't know beforehand which level will be redundant.

Something like this (made-up method):

>>> df.index = df.index.drop_redundant()
>>> df
           a  b
one three
a   c      1  2
d   e      3  4

Code Different · Accepted Answer

You can convert the index to a dataframe, then count the number of unique values per level. Levels with nunique == 1 are the redundant ones and can be dropped:

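# One column per index level; nunique() counts the distinct values in each
nunique = df.index.to_frame().nunique()
# Names of the levels holding a single value, as a plain list for droplevel
to_drop = nunique.index[nunique == 1].tolist()
df = df.droplevel(to_drop)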
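
For a quick sanity check, the snippet above can be run end to end against the frame from the question. This is only a sketch: the name `result` is illustrative, and converting the selection to a plain list before calling `droplevel` is a cautious choice on my part rather than something the answer requires.

```python
import pandas as pd

# The example frame from the question
df = pd.DataFrame(
    [[1, 2], [3, 4]],
    columns=["a", "b"],
    index=pd.MultiIndex.from_tuples(
        [("a", "b", "c"), ("d", "b", "e")],
        names=["one", "two", "three"],
    ),
)

# Count distinct values per index level
nunique = df.index.to_frame().nunique()
# Levels with exactly one distinct value are redundant
to_drop = nunique.index[nunique == 1].tolist()

result = df.droplevel(to_drop)
print(list(result.index.names))  # ['one', 'three']
```

Note that `droplevel` returns a new frame, so `df` itself is left untouched here.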

If you do this a lot, you can monkey-patch it onto the DataFrame class:

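import pandas as pd

def drop_redundant(df: pd.DataFrame, inplace=False):
    """Drop index levels that hold a single unique value."""
    if not isinstance(df.index, pd.MultiIndex):
        # A flat index has no levels to drop
        return df

    # One column per index level; count the distinct values in each
    nunique = df.index.to_frame().nunique()
    to_drop = nunique.index[nunique == 1].tolist()

    # Note: droplevel raises ValueError if every level is redundant
    return df.set_index(df.index.droplevel(to_drop), inplace=inplace)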

# The monkey patching
pd.DataFrame.drop_redundant = drop_redundant

# Usage
df = df.drop_redundant()        # chaining
df.drop_redundant(inplace=True) # in-place
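
If patching `pd.DataFrame` feels too invasive, a plain function called through `DataFrame.pipe` still chains. The sketch below is a variant under stated assumptions, not part of the answer above: `drop_constant_levels` is a made-up name, it selects levels by position rather than by name, and keeping the last level when every level is constant is just one possible design choice (needed because `droplevel` refuses to remove all levels).

```python
import pandas as pd


def drop_constant_levels(df: pd.DataFrame) -> pd.DataFrame:
    """Return df without index levels that hold a single unique value.

    Hypothetical helper; the name and the edge-case handling are illustrative.
    """
    if not isinstance(df.index, pd.MultiIndex):
        return df

    # Positions of the levels whose values are all identical
    to_drop = [
        i
        for i in range(df.index.nlevels)
        if df.index.get_level_values(i).nunique() == 1
    ]

    # droplevel cannot remove every level, so keep the last one in that case
    if len(to_drop) == df.index.nlevels:
        to_drop = to_drop[:-1]

    return df.droplevel(to_drop)


df = pd.DataFrame(
    [[1, 2], [3, 4]],
    columns=["a", "b"],
    index=pd.MultiIndex.from_tuples(
        [("a", "b", "c"), ("d", "b", "e")],
        names=["one", "two", "three"],
    ),
)

# Usage: chains like a method, without touching the DataFrame class
filtered = df.pipe(drop_constant_levels)
print(list(filtered.index.names))  # ['one', 'three']
```

This version has no in-place mode; reassigning the result covers the same use case.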
