Renaming dataframe column in Python with a string value in another dataframe by matching column/index names

Question

Major edit:

Apparently my question was difficult to understand, so I'll do my best to make it concrete.

I have two dataframes, "df1" and "df2". The real ones are considerably larger than the samples in the code blocks below, so I want to automate the renaming process in order to anonymize the names.

The first, df1, contains names in its index as:

                                 Yearly energy consumption    Size rating
Almtunaskolan                                322149.324250  Medium school
Almunge skola                                383479.065917  Medium school
Bergaskolan (Videskolan)                     296916.405000  Medium school
Danmarks skola                                84884.857333   Small school
Domarringens skola                           463568.627250   Large school
Ekuddens skola                               177668.365000   Small school

In this dataframe each user name (which is in the index) has a size rating in the second column "Size rating". I want to use this rating to rename the user names in another dataframe, df2.

The other, df2, has fewer names, as some have been filtered out compared to the index of df1. However, every column name in df2 exists in the index of df1. The difference is that here the user names are the column labels, as shown below for df2:

                 datetime  Almtunaskolan  Almunge skola  ... Real user name ... Real user name ... \
24    2017-01-02 00:00:00       0.001268       0.000579   
25    2017-01-02 01:00:00       0.001257       0.000591   
26    2017-01-02 02:00:00       0.001257       0.000583   
27    2017-01-02 03:00:00       0.001257       0.000587   
28    2017-01-02 04:00:00       0.001268       0.000583

Now to the question: How do I use the "size rating" in df1 to rename the columns in df2 for each user?

For instance, in df1 we have "Almtunaskolan" in the first row, which is also the user in the first data column of df2. So I want to rename "Almtunaskolan" in df2 to "Medium school", and so on for every other user.

That is, I want it to look like this:

                 datetime  Medium school  Medium school  ... Small school ... Large school... \
24    2017-01-02 00:00:00       0.001268       0.000579   
25    2017-01-02 01:00:00       0.001257       0.000591   
26    2017-01-02 02:00:00       0.001257       0.000583   
27    2017-01-02 03:00:00       0.001257       0.000587   
28    2017-01-02 04:00:00       0.001268       0.000583

Note that df2 contains fewer users, i.e., df2 has fewer columns than df1 has index entries.

How can I achieve this? I'm far from a pro at Pandas, and things like this are difficult to even get started with...

I've tried different df.rename options, for-loops that look for df2.columns == df1.index, and some dicts or mappings, but I cannot make them work.

user19077881 · Accepted Answer

You can produce a Dictionary from df1 and then use this to change the column names in df2 using map() as the simplified example below shows:

import pandas as pd

df1= pd.DataFrame({'x': ['a','b'],
                 'y': [1, 2],
                 'z': ['name1', 'name2']
                  })

df1 = df1.set_index('x')

df2 = pd.DataFrame({'a': [1, 2, 3],
                   'b': [3, 4, 5]
                   })
print(df1, '\n')
print(df2, '\n')

change = df1['z'].to_dict()
df2.columns = df2.columns.map(change)

print(df2)

which prints

   y      z
x          
a  1  name1
b  2  name2 

   a  b
0  1  3
1  2  4
2  3  5 

   name1  name2
0      1      3
1      2      4
2      3      5
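
One thing to watch for with the data in the question: df2 also has a "datetime" column that is not in the index of df1. As far as I know, Index.map() with a dictionary yields NaN for any label missing from the dictionary, so that column would lose its name. A minimal sketch of an alternative, assuming the column names shown in the question, is to pass the same dictionary to DataFrame.rename(), which leaves unmatched labels untouched. The small frames here are stand-ins built from a few of the rows in the question, not the real data:

```python
import pandas as pd

# Stand-in for df1: school names in the index (values copied from the question).
df1 = pd.DataFrame(
    {
        "Yearly energy consumption": [322149.324250, 383479.065917, 84884.857333],
        "Size rating": ["Medium school", "Medium school", "Small school"],
    },
    index=["Almtunaskolan", "Almunge skola", "Danmarks skola"],
)

# Stand-in for df2: a datetime column plus a subset of the schools as columns.
df2 = pd.DataFrame(
    {
        "datetime": pd.to_datetime(["2017-01-02 00:00:00", "2017-01-02 01:00:00"]),
        "Almtunaskolan": [0.001268, 0.001257],
        "Almunge skola": [0.000579, 0.000591],
    }
)

# Lookup from school name (index of df1) to its size rating.
change = df1["Size rating"].to_dict()

# rename() only replaces labels found in the mapping, so "datetime" is kept.
renamed = df2.rename(columns=change)

print(renamed.columns.tolist())
# ['datetime', 'Medium school', 'Medium school']
```

Note that the result has duplicate column names, which is what the question asks for, but selecting renamed["Medium school"] will then return every column with that label rather than a single one.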
