Reputation: 2014
My dataframe contains the index column (Date) and hundreds of columns.
Date (index) | Menu1 | Menu2 | Menu3 | Menu4 |
---|---|---|---|---|
2023_1_1_x | NaN | 5 | 10 | NaN |
2023_1_2_x | NaN | 8 | 9 | NaN |
2023_1_3_x | NaN | 10 | 8 | NaN |
2023_1_1_y | 10 | 5 | NaN | NaN |
2023_1_2_y | 11 | 8 | NaN | NaN |
2023_1_3_y | 12 | 10 | NaN | NaN |
The Date index contains duplicates. That's why they end with x or y.
The dataframe has hundreds of columns. I need to select the columns like Menu3 whose index contains x and the values exist for only those x indexes.
Upvotes: 0
Views: 33
Reputation: 260790
You can group by columns ending in _x or not, then check the NA status and ensure only the _x ones have non-NA:
out = (df.notna()
.groupby(df.index.str.endswith('_x'))
.any()
.pipe(lambda d: d.columns[d.eq(d.index, axis=0).all()])
.tolist()
)
Output: ['Menu3']
Upvotes: 2