Reputation: 93
I have a pandas DataFrame with a MultiIndex that gets filtered interactively. The filtered frame can end up with redundant index levels, i.e. levels whose value is the same in every row.
Is there a way to drop these levels from the index?
Having a data frame like:
>>> df = pd.DataFrame([[1,2],[3,4]], columns=["a", "b"], index=pd.MultiIndex.from_tuples([("a", "b", "c"), ("d", "b", "e")], names=["one", "two", "three"]))
>>> df
               a  b
one two three
a   b   c      1  2
d   b   e      3  4
I would like to drop level "two", but without naming it explicitly, since I don't know beforehand which level will be redundant.
Something like this (made-up function):
>>> df.index = df.index.drop_redundant()
>>> df
           a  b
one three
a   c      1  2
d   e      3  4
Upvotes: 2
Views: 57
Reputation: 25353
Another possible solution, based on janitor.drop_constant_columns:
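# pip install pyjanitor
import pandas as pd
import janitor
# Turn the index levels into columns, drop the constant ones, rebuild the index
df.index = pd.MultiIndex.from_frame(
    janitor.drop_constant_columns(df.index.to_frame()))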
Output:
           a  b
one three
a   c      1  2
d   e      3  4
Upvotes: 1
Reputation: 93181
You can convert the index to a data frame and count the number of unique values per level. Levels with nunique == 1 can then be dropped:
nunique = df.index.to_frame().nunique()
to_drop = nunique.index[nunique == 1]
df = df.droplevel(to_drop)
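For reference, here is the same idea as a self-contained sketch applied to the frame from the question. The guard for the case where every level is constant is my own addition, not part of the snippet above: droplevel refuses to remove all levels, so the sketch keeps the last one in that situation.

```python
import pandas as pd

# Frame from the question: level "two" is "b" in every row
df = pd.DataFrame(
    [[1, 2], [3, 4]],
    columns=["a", "b"],
    index=pd.MultiIndex.from_tuples(
        [("a", "b", "c"), ("d", "b", "e")], names=["one", "two", "three"]
    ),
)

# One column per index level, then count distinct values per level
nunique = df.index.to_frame().nunique()
to_drop = list(nunique.index[nunique == 1])

# Assumption: if every level is constant, keep the last one so that
# droplevel still has a level left to work with
if len(to_drop) == df.index.nlevels:
    to_drop = to_drop[:-1]

result = df.droplevel(to_drop)
print(list(result.index.names))  # ['one', 'three']
```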
If you do this a lot, you can monkey-patch it onto the DataFrame class:
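def drop_redundant(df: pd.DataFrame, inplace=False):
    # Nothing to drop on a flat index; return None when inplace, as pandas does
    if not isinstance(df.index, pd.MultiIndex):
        return None if inplace else df
    # Count distinct values per index level
    nunique = df.index.to_frame().nunique()
    to_drop = nunique.index[nunique == 1]
    # set_index returns a new frame, or None when inplace=True
    return df.set_index(df.index.droplevel(to_drop), inplace=inplace)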
# The monkey patching
pd.DataFrame.drop_redundant = drop_redundant
# Usage
df = df.drop_redundant() # chaining
df.drop_redundant(inplace=True) # in-place
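If you would rather not patch pandas, a plain function used through DataFrame.pipe is one alternative. The sketch below is only an illustration: the name drop_constant_levels is made up, and it selects levels by position instead of by name, so it assumes the level names are not themselves integers (pandas would then read the numbers as names). It also keeps one level when all of them are constant.

```python
import pandas as pd


def drop_constant_levels(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: drop index levels that hold a single value."""
    idx = frame.index
    if not isinstance(idx, pd.MultiIndex):
        return frame
    # Positions of levels whose values are identical in every row
    positions = [
        i for i in range(idx.nlevels)
        if idx.get_level_values(i).nunique() == 1
    ]
    # droplevel cannot remove every level, so keep the last one
    if len(positions) == idx.nlevels:
        positions = positions[:-1]
    return frame.droplevel(positions)


# Works with unnamed levels too
unnamed = pd.DataFrame(
    [[1, 2], [3, 4]],
    columns=["a", "b"],
    index=pd.MultiIndex.from_tuples([("a", "b", "c"), ("d", "b", "e")]),
)
trimmed = unnamed.pipe(drop_constant_levels)

# A single-row frame has only constant levels; one level is kept
single = pd.DataFrame(
    [[1, 2]],
    columns=["a", "b"],
    index=pd.MultiIndex.from_tuples([("a", "b")]),
)
kept = drop_constant_levels(single)
```

Since the function returns a new frame and leaves its input alone, it chains in the same way as the patched method.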
Upvotes: 1