Reputation: 3392
I have several pandas DataFrames and I want to align their column names, so that particular columns (not all of them) have the same names in every DataFrame.
My real data sets have many columns, but below I provide a simplified example with 3 DataFrames. They all have the same content, purely to keep the example simple.
df1 =
col1 col2 col3
111 123 abc
122 331 zzz
df2 =
colA colB col3
111 123 abc
122 331 zzz
df3 =
col_1 col_2 col3
111 123 abc
122 331 zzz
Then I have the following dictionary that specifies similar columns (in reality the dictionary is bigger):
col_names = {
"col1": ["colA", "col_1"],
"col2": ["colB", "col_2"]
}
It means that the columns colA and col_1 should be renamed as col1, and the columns colB and col_2 should be renamed to col2.
I know how to rename columns one by one in a pandas DataFrame:
df.rename(columns={"colA": "col1"}, inplace=True)
However, I am not sure how to use the dictionary to rename the columns flexibly.
Upvotes: 2
Views: 4088
Reputation: 16876
# df1 already uses the target names; df2 uses the first alias in each list, df3 the second.
df2.rename(columns={col_names[key][0]: key for key in col_names}, inplace=True)
df3.rename(columns={col_names[key][1]: key for key in col_names}, inplace=True)
If the order of the values in the dictionary is not fixed, or you are not sure which columns each DataFrame contains, you can invert the dictionary and apply the same mapping to every DataFrame:
import pandas as pd

df1 = pd.DataFrame({'col1': [1]*3, 'col2': [2]*3, 'col3': [3]*3})
df2 = pd.DataFrame({'colA': [11]*3, 'colB': [22]*3, 'col3': [33]*3})
df3 = pd.DataFrame({'col_1': ['a']*3, 'col_2': ['b']*3, 'col3': ['c']*3})
col_names = {
    "col1": ["colA", "col_1"],
    "col2": ["colB", "col_2"]
}
# Invert the dictionary: map every alternative name to its target name.
cols = {}
for key, value in col_names.items():
    for v in value:
        cols[v] = key
# Apply the same mapping to every DataFrame.
for df in [df1, df2, df3]:
    df.rename(columns=cols, inplace=True)
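A single mapping can serve all three frames because `DataFrame.rename` by default ignores mapping keys that are not present in the frame (`errors="ignore"`). Below is a minimal sketch of the same idea that returns new DataFrames instead of modifying them in place. The helper name `invert_aliases` and the variable `alias_map` are hypothetical, and the data is the example from the question.

```python
import pandas as pd


def invert_aliases(col_names):
    """Turn {target: [alias, ...]} into {alias: target} (hypothetical helper)."""
    return {
        alias: target
        for target, aliases in col_names.items()
        for alias in aliases
    }


col_names = {
    "col1": ["colA", "col_1"],
    "col2": ["colB", "col_2"],
}

df1 = pd.DataFrame({"col1": [111, 122], "col2": [123, 331], "col3": ["abc", "zzz"]})
df2 = pd.DataFrame({"colA": [111, 122], "colB": [123, 331], "col3": ["abc", "zzz"]})
df3 = pd.DataFrame({"col_1": [111, 122], "col_2": [123, 331], "col3": ["abc", "zzz"]})

alias_map = invert_aliases(col_names)

# rename() skips keys that a frame does not contain, so one mapping fits all frames.
# Without inplace=True it returns a renamed copy and leaves the input untouched.
df1, df2, df3 = (df.rename(columns=alias_map) for df in (df1, df2, df3))
```

Returning copies avoids surprises when the same DataFrame is referenced elsewhere, at the cost of rebinding the names.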
Upvotes: 4
Reputation: 2032
Try:
df.columns = pd.Series(df.columns.to_list()).replace({'colA':'col1'}).to_list()
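That one-liner handles a single pair of names. To cover the whole dictionary, one option is to rebuild the column list with a plain list comprehension (swapped in here for `Series.replace`). This is only a sketch: `alias_map` is an illustrative name for the inverted dictionary, and the sample frame is the df2 from the question.

```python
import pandas as pd

col_names = {
    "col1": ["colA", "col_1"],
    "col2": ["colB", "col_2"],
}

# Map every alternative name to its target name.
alias_map = {
    alias: target
    for target, aliases in col_names.items()
    for alias in aliases
}

df = pd.DataFrame({"colA": [111, 122], "colB": [123, 331], "col3": ["abc", "zzz"]})

# Names without an entry in alias_map (such as col3) are kept unchanged.
df.columns = [alias_map.get(name, name) for name in df.columns]
```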
Upvotes: 0