Reputation: 21
I would like to rename the columns of a Pandas dataframe using the rename function, and to do that I need to split each name (a string) at the uppercase letters within it. For example, my column names are something like 'FooBar' or 'SpamEggs', and one column is called 'Monty-Python'. My goal is column names like 'foo_bar', 'spam_eggs' and 'monty_python'.
I know that
'-'.join(re.findall('[A-Z][a-z]*', 'FooBar'))
will give me
Foo-Bar
But this cannot be included in my rename function:
df.rename(columns=lambda x: x.strip().lower().replace("-", "_"), inplace=True)
(The snippet should go between strip and lower, but putting it there raises a SyntaxError.)
Can anyone help me include the snippet in rename, or suggest a solution other than findall?
Upvotes: 1
Views: 840
Reputation: 402824
Remove the non-word characters, then prepend an underscore (_) to uppercase letters that are not at the start of the string.
df.columns
Index(['FooBar', 'SpamEggs', 'Monty-Python'], dtype='object')
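# regex=True is required on pandas 2.0+, where str.replace treats the pattern as a literal by default
df.columns.str.replace(r'\W', '', regex=True)\
          .str.replace(r'(?<!^)([A-Z])', r'_\1', regex=True)\
          .str.lower()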
Index(['foo_bar', 'spam_eggs', 'monty_python'], dtype='object')
This solution generalises quite nicely. Assign the result back to df.columns.
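If you would rather keep using rename, as in the question, the same two substitutions can be wrapped in a plain function and passed as the columns argument. This is only a minimal sketch: to_snake_case is a hypothetical helper name, not part of pandas, and it uses the standard re module instead of the .str accessor.

```python
import re

def to_snake_case(name):
    # Hypothetical helper (not a pandas function): mirrors the two
    # str.replace steps above using the standard-library re module.
    name = re.sub(r'\W', '', name.strip())          # drop non-word characters such as '-'
    name = re.sub(r'(?<!^)([A-Z])', r'_\1', name)   # underscore before each non-leading capital
    return name.lower()

# Demonstration on the column names from the question
columns = ['FooBar', 'SpamEggs', 'Monty-Python']
renamed = [to_snake_case(c) for c in columns]
print(renamed)
```

With a dataframe this would be used as df.rename(columns=to_snake_case, inplace=True). Note that every non-leading capital gets its own underscore, so a name with consecutive capitals such as 'HTTPServer' would come out as 'h_t_t_p_server'.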
Upvotes: 2