Tobias

Reputation: 176

Drop all rows of higher level multi-index if certain lower level index doesn't exist

Suppose I have a DataFrame with a two-level MultiIndex, such as:

     col1
A x     1
  y     2
B x     3
  y     4
C x     5

Now, if a level-0 label lacks either of the level-1 labels 'x' or 'y', I want to drop all rows under that label. In this example I want to drop all rows of 'C' because there is no 'y' under 'C'. The result should be:

     col1
A x     1
  y     2
B x     3
  y     4

Is there a nice/clean way to do this?
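
For reproducibility, here is a minimal sketch that builds the example frame above. The variable name `df` is an assumption, chosen to match the name the answers use.

```python
import pandas as pd

# Level 0 is A/B/C, level 1 is x/y; 'C' deliberately has no 'y' row.
index = pd.MultiIndex.from_tuples(
    [("A", "x"), ("A", "y"), ("B", "x"), ("B", "y"), ("C", "x")]
)
df = pd.DataFrame({"col1": [1, 2, 3, 4, 5]}, index=index)
print(df)
```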

Upvotes: 2

Views: 193

Answers (2)

Anurag Dabas

Reputation: 24314

Try:

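# Pivot level 1 into columns; a level-0 label that lacks 'x' or 'y' gets a NaN in its row.
# axis must be passed by keyword: any(1) is rejected by recent pandas versions.
mask = df.unstack().isna().any(axis=1)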

Finally:

# mask[mask].index holds the level-0 labels that have a missing entry;
# keep only the rows whose level-0 label is not among them.
df = df.loc[~df.index.get_level_values(0).isin(mask[mask].index)]

output of df:

        col1
A   x   1
    y   2
B   x   3
    y   4

Upvotes: 2

BENY

Reputation: 323226

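Try with transform. It counts the non-null rows in each level-0 group and keeps only the groups that have both entries (hence the 2):

out = df[df.groupby(level=0)['col1'].transform('count') == 2]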
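
If the hard-coded 2 is a concern, here is a sketch of two possible generalizations. The names `n_required` and `required` are hypothetical, and the frame is rebuilt here only so the snippet runs on its own. The first variant derives the count from the data; the second checks explicitly for the labels 'x' and 'y' using `groupby(...).filter`.

```python
import pandas as pd

# Rebuild the example frame from the question (assumed construction).
index = pd.MultiIndex.from_tuples(
    [("A", "x"), ("A", "y"), ("B", "x"), ("B", "y"), ("C", "x")]
)
df = pd.DataFrame({"col1": [1, 2, 3, 4, 5]}, index=index)

# Variant 1: require as many rows per group as there are distinct level-1 labels.
# Assumes no duplicate (level-0, level-1) pairs and no NaN in col1.
n_required = df.index.get_level_values(1).nunique()
out_count = df[df.groupby(level=0)["col1"].transform("count") == n_required]

# Variant 2: keep a group only if it contains every required level-1 label.
required = {"x", "y"}
out_filter = df.groupby(level=0).filter(
    lambda g: required.issubset(g.index.get_level_values(1))
)
print(out_filter)
```

The filter variant is slower on large frames because it calls a Python function per group, but it states the actual condition rather than relying on a row count.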

Upvotes: 1
