Reputation: 303
I've just spent several hours trying to get this to work, and I'm starting to think I want the impossible, though I'm pretty sure it can be done. I have a pandas DataFrame with a MultiIndex header (read from an Excel spreadsheet with 3 header rows). I can see the column when I inspect the DataFrame, so I know it exists, yet when I try to rename it according to the official documentation, I'm told the column name can't be found.
The table looks like this:
Test | Test1 | Test2
| abc | xyz | abc | xyz
geo1 | geo2 | geo1 | geo2 | geo1 | geo2
------------------------------------------------
a | x | 1 | 0.5 | 1 | 0.5
b | y | 2 | 0.2 | 2 | 0.2
c | z | 3 | 0.4 | 3 | 0.3
I simply want to change "Test" into "Boom", for example. "Test" is the first value of the column names in level 0, yet renaming it doesn't work. I tried each of these:
df.rename(columns={df.columns[0][0]: 'Boom'}, inplace=True, errors='raise')
df.rename(columns={df.columns[0][0]: 'Boom'}, level=0, inplace=True, errors='raise')
df.rename(columns={df.columns.values[0][0]: 'Boom'}, inplace=True, errors='raise')
Problem is, even if I hard code the column names and level, it still doesn't work! This should do the trick as it works in other scripts of mine (2 levels, not 3):
df.rename(columns={'Test': 'Boom'}, level=0, inplace=True, errors='raise')
The error is funny, as it's telling me it can't find the "Test" column (it's literally telling me it can't find the column it just names...). What am I doing wrong??
Thank you all!
Upvotes: 3
Views: 1295
Reputation: 68176
This combination of parameters works for me:
import pandas

index = pandas.MultiIndex.from_tuples([('A', 'X'), ('B', 'Y'), ('C', 'Z')], names=['id1', 'id2'])
columns = pandas.MultiIndex.from_tuples([('Test1', 'a', 'x')], names=['col1', 'col2', 'col3'])
df = pandas.DataFrame(
    data=[1, 2, 3],
    index=index,
    columns=columns,
)
# Rename the label in the level named 'col1' only; returns a new DataFrame
df.rename(columns={'Test1': 'Boom!'}, level='col1')
which returns a new DataFrame:
col1 Boom!
col2 a
col3 x
id1 id2
A X 1
B Y 2
C Z 3
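The same call can be sketched against a three-level header closer to the one in the question. The column tuples and data below are assumptions made up for illustration (the question's exact layout is not recoverable from the pasted table), and `level=0` is used because the levels here are unnamed:

```python
import pandas as pd

# Hypothetical three-level header loosely modelled on the question's table.
columns = pd.MultiIndex.from_tuples(
    [
        ("Test", "abc", "geo1"),
        ("Test", "xyz", "geo2"),
        ("Test1", "abc", "geo1"),
        ("Test1", "xyz", "geo2"),
    ]
)
df = pd.DataFrame([[1, 0.5, 1, 0.5], [2, 0.2, 2, 0.2]], columns=columns)

# Rename only the label "Test" in the outermost level; the other levels are untouched.
renamed = df.rename(columns={"Test": "Boom"}, level=0)

print(renamed.columns.get_level_values(0).tolist())
```

Since `rename` returns a new DataFrame here, `df` keeps its original header unless the result is assigned back.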
Upvotes: 0
Reputation: 303
I just removed errors='raise' from the function call and it worked. There's no logic in the way pandas works, but this seems to have done the trick. I'm not sure how something can work, yet stop working as soon as you tell it to raise an error if needed. Thanks all for trying anyway. If someone could explain why this is the way it is, for my own sanity, I'd appreciate it!
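One possible explanation, which I have not confirmed for the pandas version involved, is that the errors='raise' check compared the dictionary keys against the full column tuples rather than against the labels of the chosen level, so a bare 'Test' was reported as missing. If the safety check is still wanted, it can be done by hand. The sketch below uses a hypothetical helper, `rename_level_label`, and made-up columns; it is not part of pandas:

```python
import pandas as pd

# Made-up two-column frame with a three-level header, for illustration only.
columns = pd.MultiIndex.from_tuples(
    [("Test", "abc", "geo1"), ("Test1", "xyz", "geo2")]
)
df = pd.DataFrame([[1, 0.5]], columns=columns)


def rename_level_label(frame, old, new, level=0):
    """Rename one label in a column level, raising KeyError if it is absent.

    Hypothetical helper: it does the existence check itself instead of
    relying on errors='raise'.
    """
    if old not in frame.columns.get_level_values(level):
        raise KeyError(f"{old!r} not found in column level {level}")
    return frame.rename(columns={old: new}, level=level)


renamed = rename_level_label(df, "Test", "Boom")
print(renamed.columns.get_level_values(0).tolist())
```

A typo in the old label still fails loudly this way, because the check looks only at the level being renamed.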
Upvotes: 0
Reputation: 4864
# set_levels returns a new MultiIndex (the inplace argument was removed in pandas 2.0), so assign it back
df.columns = df.columns.set_levels(['Boom1', 'Boom2', 'Boom3'], level=0)
If your columns are Boom1, ..., Boom1000, first build the list of names:
ll = [f"Boom{i}" for i in range(1, 1001)]
df.columns = df.columns.set_levels(ll, level=0)
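The calls above replace every label in level 0 at once, so the list has to match the existing labels in number and order. To change only "Test" and leave the rest alone, one option is to build the new list from the current level labels. This is a minimal sketch with made-up columns; `mapping` is a name chosen for illustration:

```python
import pandas as pd

# Made-up three-level header, for illustration only.
columns = pd.MultiIndex.from_tuples(
    [("Test", "abc", "geo1"), ("Test1", "abc", "geo1"), ("Test2", "xyz", "geo2")]
)
df = pd.DataFrame([[1, 2, 3]], columns=columns)

# Old label -> new label; labels not listed are kept as they are.
mapping = {"Test": "Boom"}

# levels[0] holds the unique labels of the outermost level in stored order.
new_labels = [mapping.get(label, label) for label in df.columns.levels[0]]

# set_levels returns a new MultiIndex, so assign it back to the frame.
df.columns = df.columns.set_levels(new_labels, level=0)

print(df.columns.get_level_values(0).tolist())
```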
Upvotes: 1