Reputation: 1553
I have heard somewhere that it is possible to pass a YAML file to a Python script to rename columns in a pandas DataFrame. But I have no idea how to do that, and I'm not sure I have found anything useful.
For example, the YAML:
mappings:
  new_column_name1: [old_name_1, old_name_2, old_name_3, old_name_4],
  new_columns_name2: [old_name_5, old_name_6, old_name_7, old_name_8]
df:
old_name_1  old_name_6
1           4
3           6
6           31
Is it possible to use a YAML file like this to rename columns (each column whose name appears in the list [old_name_1, old_name_2, old_name_3, old_name_4] is renamed to new_column_name1), and what is the best way to do that?
I know I haven't provided any code that I have tried, but I really have no idea where to start. Also, any other suggestions about good practice for renaming a large number of columns in multiple dataframes are welcome.
Upvotes: 1
Views: 1107
Reputation: 76386
Your example doesn't seem to be legal YAML. Rather, it should be something like:
mappings:
  new_column_name1:
    - old_name_1
    - old_name_2
    - old_name_3
    - old_name_4
and so on.
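As a quick check, the block-style mapping above can be parsed with PyYAML's `yaml.safe_load`. This is only a minimal sketch: it assumes PyYAML is installed, shortens the lists, and inlines the YAML as strings instead of reading a file. It also suggests that the flow-style lists from the question parse to the same structure once the trailing comma is removed.

```python
import yaml  # provided by the PyYAML package (assumed to be installed)

# Block-style sequence (shortened for illustration)
block_style = """
mappings:
  new_column_name1:
    - old_name_1
    - old_name_2
"""

# The same data in flow style, without the trailing comma from the question
flow_style = """
mappings:
  new_column_name1: [old_name_1, old_name_2]
"""

block_mappings = yaml.safe_load(block_style)["mappings"]
flow_mappings = yaml.safe_load(flow_style)["mappings"]
print(block_mappings)
```

Either way, the result is a plain dict from each new column name to a list of old names.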
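In any case, if you install PyYAML (pip install pyyaml), you can use something like:
import yaml  # provided by the PyYAML package

# Load the mapping of new column name -> list of old column names
with open('foo.yaml') as f:
    d = yaml.safe_load(f)['mappings']

cols = []
for c in df.columns:
    cols.append(c)  # keep the original name unless a mapping matches
    for k, v in d.items():
        if c in v:
            cols[-1] = k  # replace the old name with the new one
            break
df.columns = cols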
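An alternative to the nested loop is to invert the mapping into an old-name to new-name dictionary and pass it to `DataFrame.rename`. The following is a sketch under stated assumptions: the `mappings` dict stands in for what would be loaded from the YAML file, the sample dataframe is made up from the question's example, and `invert_mappings` is a hypothetical helper, not part of pandas.

```python
import pandas as pd

# Stand-in for the dict loaded from the YAML file (assumed structure)
mappings = {
    "new_column_name1": ["old_name_1", "old_name_2", "old_name_3", "old_name_4"],
    "new_columns_name2": ["old_name_5", "old_name_6", "old_name_7", "old_name_8"],
}

def invert_mappings(mappings):
    """Turn {new: [old, ...]} into {old: new} for use with DataFrame.rename."""
    return {old: new for new, olds in mappings.items() for old in olds}

rename_map = invert_mappings(mappings)

# Sample dataframe based on the question's example
df = pd.DataFrame({"old_name_1": [1, 3, 6], "old_name_6": [4, 6, 31]})

# Columns that are not keys of rename_map are left unchanged
renamed = df.rename(columns=rename_map)
print(list(renamed.columns))
```

Since `rename_map` is built once, it can be reused for several dataframes. Note that if one dataframe contains two old names that map to the same new name, the result will have duplicate column labels, so that case may need separate handling.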
Upvotes: 1