Reputation: 21
Major edit:
My question was apparently hard to understand, so I'll try to make it more concrete.
I have two dataframes, "df1" and "df2". They are much larger than the excerpts in the code blocks below, so I want to automate the renaming process in order to anonymize the names.
The first, df1, contains the names in its index:
                          Yearly energy consumption    Size rating
Almtunaskolan                         322149.324250  Medium school
Almunge skola                         383479.065917  Medium school
Bergaskolan (Videskolan)              296916.405000  Medium school
Danmarks skola                         84884.857333   Small school
Domarringens skola                    463568.627250   Large school
Ekuddens skola                        177668.365000   Small school
In this dataframe, each user name (in the index) has a size rating in the second column, "Size rating". I want to use this rating to rename the users in another dataframe, df2.
The other dataframe, df2, has fewer names because some have been filtered out, but every user column in df2 also exists in the index of df1. The difference is that here the user names are the column labels instead, as shown below for df2:
datetime Almtunaskolan Almunge skola ... Real user name ... Real user name ... \
24 2017-01-02 00:00:00 0.001268 0.000579
25 2017-01-02 01:00:00 0.001257 0.000591
26 2017-01-02 02:00:00 0.001257 0.000583
27 2017-01-02 03:00:00 0.001257 0.000587
28 2017-01-02 04:00:00 0.001268 0.000583
Now to the question: how do I use the "Size rating" column in df1 to rename the columns in df2 for each user?
For instance, df1 has "Almtunaskolan" in its first row, which is also the first user column in df2. So I want to rename the "Almtunaskolan" column in df2 to "Medium school", and so on.
That is, I want df2 to look like this:
datetime Medium school Medium school ... Small school ... Large school... \
24 2017-01-02 00:00:00 0.001268 0.000579
25 2017-01-02 01:00:00 0.001257 0.000591
26 2017-01-02 02:00:00 0.001257 0.000583
27 2017-01-02 03:00:00 0.001257 0.000587
28 2017-01-02 04:00:00 0.001268 0.000583
Note that df2 has fewer users, i.e., the number of columns in df2 is smaller than the number of index entries in df1.
How can I achieve this? I'm far from a pro at pandas, and I find things like this difficult to even get started with.
I've tried different df.rename options, for-loops that look for df2.columns == df1.index, and some dicts or mappings, but I cannot make them work.
Upvotes: 0
Views: 41
Reputation: 5430
You can build a dictionary from df1 and then use it to change the column names in df2 with map(),
as the simplified example below shows:
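import pandas as pd

# df1: the old names end up in the index, the new names are in column 'z'
df1 = pd.DataFrame({'x': ['a', 'b'],
                    'y': [1, 2],
                    'z': ['name1', 'name2']
                    })
df1 = df1.set_index('x')

# df2: the old names are the column labels
df2 = pd.DataFrame({'a': [1, 2, 3],
                    'b': [3, 4, 5]
                    })
print(df1, '\n')
print(df2, '\n')

# Build {old name: new name} from df1 and map every column label through it
change = df1['z'].to_dict()
df2.columns = df2.columns.map(change)
print(df2)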
which prints
   y      z
x
a  1  name1
b  2  name2

   a  b
0  1  3
1  2  4
2  3  5

   name1  name2
0      1      3
1      2      4
2      3      5
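One caveat when applying this to the frames in the question: Index.map() with a dictionary returns NaN for any label that is not a key, so a column such as "datetime" would lose its name. A minimal sketch of an alternative using DataFrame.rename(columns=...), which leaves unmatched labels untouched, is below. It assumes "datetime" is an ordinary column of df2 and uses a cut-down copy of the question's data; the numbering step at the end is an optional, hypothetical addition for when the repeated "Medium school" labels become awkward to work with.

```python
import pandas as pd

# Cut-down stand-ins for the question's frames (a subset of the rows and columns shown there).
df1 = pd.DataFrame(
    {
        "Yearly energy consumption": [322149.324250, 383479.065917, 84884.857333],
        "Size rating": ["Medium school", "Medium school", "Small school"],
    },
    index=["Almtunaskolan", "Almunge skola", "Danmarks skola"],
)
df2 = pd.DataFrame(
    {
        # Assumption: datetime is a regular column, not the index.
        "datetime": pd.to_datetime(["2017-01-02 00:00:00", "2017-01-02 01:00:00"]),
        "Almtunaskolan": [0.001268, 0.001257],
        "Almunge skola": [0.000579, 0.000591],
    }
)

# {user name: size rating}; users missing from df2 are simply ignored by rename.
mapping = df1["Size rating"].to_dict()
renamed = df2.rename(columns=mapping)
print(list(renamed.columns))

# Optional: number the ratings so that each column label stays unique.
ratings = set(mapping.values())
seen = {}
unique_labels = []
for col in renamed.columns:
    if col in ratings:
        seen[col] = seen.get(col, 0) + 1
        unique_labels.append(f"{col} {seen[col]}")
    else:
        unique_labels.append(col)

numbered = renamed.copy()
numbered.columns = unique_labels
print(list(numbered.columns))
```

With duplicate labels, selecting renamed["Medium school"] returns a DataFrame holding every matching column rather than a single Series, which is why the numbered variant may be more convenient.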
Upvotes: 1