Reputation: 55
I am looking to replace column header in one level with a condition on another level. Specifically, if the label of second level column index is 'x', I want to change the first level column index to 'test'. The input and goal output is below along with a mask I have to select the columns that I do want to change. But I cannot rename in them inplace. Any advice would be nice. Thank you
# data I have
idx = pd.MultiIndex.from_tuples(list(zip(*[['18', '13', '42','25'], ['a','b','c','d'],['x','y','z','x']])))
orig_dat = pd.DataFrame (([0, 2, 3,4],[0, 2, 3, 4]), columns = idx)
# desired output
idx = pd.MultiIndex.from_tuples(list(zip(*[['18', '13', '42','25'], ['test','b','c','test'],['x','y','z','x']])))
goal_dat = pd.DataFrame (([0, 2, 3,4],[0, 2, 3, 4]), columns = idx)
# mask to select columns that need changing
mask = orig_dat.loc[:,orig_dat.columns.get_level_values(level=2).str.contains('x')].columns.get_level_values(level=1)
Upvotes: 1
Views: 783
Reputation: 34086
You can use list comprehension
:
# Iterate the orig_dat columns, check if the last element is 'x', then create a new tuple with 2nd element as 'test', otherwise leave as is
In [321]: new_cols = [(i[0], 'test', i[-1]) if i[-1] == 'x' else i for i in orig_dat.columns ]
# Create a MultiIndex object from above list and assign it to columns of orig_dat
In [330]: orig_dat.columns = pd.MultiIndex.from_tuples(new_cols)
In [331]: orig_dat
Out[331]:
18 13 42 25
test b c test
x y z x
0 0 2 3 4
1 0 2 3 4
EDIT: After OP's comment to check if x
is contained:
In [338]: new_cols = [(i[0], 'test', i[-1]) if 'x' in i else i for i in orig_dat.columns]
In [330]: orig_dat.columns = pd.MultiIndex.from_tuples(new_cols)
In [331]: orig_dat
Out[331]:
18 13 42 25
test b c test
x y z x
0 0 2 3 4
1 0 2 3 4
Upvotes: 1