Reputation: 663
I have a Python dictionary that looks like this:
hits = {'comp_1': {'property_4': 24, 'property_2': 45, 'property_3': 124, 'missing': 39, 'property_1': 16},
'comp_2': {'property_4': 23, 'property_2': 49, 'property_3': 126, 'property_1': 16, 'missing': 38},
'comp_3': {'property_4': 24, 'property_2': 43, 'property_1': 19, 'missing': 30, 'property_3': 116}}
When I load it into a pandas DataFrame and print it, it looks as follows:
df = pd.DataFrame.from_dict(hits, orient='index')
print(df)
Output:
        missing  property_1  property_2  property_3  property_4
comp_1       39          16          45         124          24
comp_2       38          16          49         126          23
comp_3       30          19          43         116          24
Now, I want to rename the columns, so I try:
df = pd.DataFrame.from_dict(hits, orient='index', columns=reversed(['Missing', 'P1', 'P2', 'P3', 'P4']))
which yields an empty DataFrame (I assume because these keys do not exist in the dictionary?):
Empty DataFrame
Columns: []
Index: []
If I try this instead:
df = pd.DataFrame.from_dict(hits, orient='index')
columns = reversed(['Missing', 'P1', 'P2', 'P3', 'P4'])
df.columns=columns
the columns are renamed, but the order is not preserved, so each time I run the code the numbers do not correspond to the right column, e.g.:
        P4  P3   P2  P1  Missing
comp_1  16  24  124  45       39
comp_2  16  23  126  49       38
comp_3  19  24  116  43       30
and:
        P4  P3  P2   P1  Missing
comp_1  24  16  39  124       45
comp_2  23  16  38  126       49
comp_3  24  19  30  116       43
I'm guessing I need to somehow provide the keys from the nested dictionary when I load the data into the DataFrame, but I'm not sure how to do it. Or is it something else I need to do?
Edit: I also tried renaming the columns with a dictionary as follows:
df.rename({'missing': 'Missing', 'property_1': 'P1', 'property_2': 'P2', 'property_3': 'P3',
'property_4': 'P4'})
but the old names are still shown.
Upvotes: 2
Views: 8033
Reputation: 36
You can pass a dict to the columns argument of rename, which gives you a 1:1 mapping from old names to new names. Adding inplace=True renames the columns in the existing DataFrame; omitting it returns a new DataFrame with the new column names.
df.rename(columns = {
'property_1': 'P1',
'property_2': 'P2',
'property_3': 'P3',
'property_4': 'P4',
'missing': 'Missing'
}, inplace = True)
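To make the difference between the two modes concrete, here is a minimal sketch that assumes the hits dictionary from the question. The names cmap, renamed and original_columns are only illustrative.

```python
import pandas as pd

hits = {
    'comp_1': {'property_4': 24, 'property_2': 45, 'property_3': 124, 'missing': 39, 'property_1': 16},
    'comp_2': {'property_4': 23, 'property_2': 49, 'property_3': 126, 'property_1': 16, 'missing': 38},
    'comp_3': {'property_4': 24, 'property_2': 43, 'property_1': 19, 'missing': 30, 'property_3': 116},
}
df = pd.DataFrame.from_dict(hits, orient='index')

# Old name -> new name; the mapping is by label, so column order does not matter.
cmap = {'property_1': 'P1', 'property_2': 'P2', 'property_3': 'P3',
        'property_4': 'P4', 'missing': 'Missing'}

# Without inplace, rename returns a new DataFrame and leaves df untouched.
renamed = df.rename(columns=cmap)
original_columns = list(df.columns)  # still the property_* names here

# With inplace=True, df itself is modified and the call returns None.
df.rename(columns=cmap, inplace=True)
```

Because the mapping is keyed by the old labels, each value stays attached to its column no matter what order the columns come out in.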
Upvotes: 1
Reputation: 402263
Unfortunately, the columns parameter in from_dict only specifies the columns you want to select. For example,
pd.DataFrame.from_dict(hits, orient='index', columns=['property_4'])
        property_4
comp_1          24
comp_2          23
comp_3          24
This selects only the "property_4" column and ignores everything else. Of course, this makes sense, because dictionaries do not guarantee any key order (before Python 3.7), so names cannot be matched to columns by position. Your only option is to rename the keys or rename the columns using DataFrame.rename().
cmap = {'property_1': 'P1', 'property_2': 'P2', 'property_3': 'P3',
'property_4': 'P4', 'missing': 'Missing'}
df = df.rename(columns=cmap)
df
        P4  P2   P3  Missing  P1
comp_1  24  45  124       39  16
comp_2  23  49  126       38  16
comp_3  24  43  116       30  19
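The other option, renaming the keys before building the frame, could look roughly like this. It is only a sketch and assumes the hits dictionary from the question; renamed_hits and order are made-up names. The final selection with a list is an extra step that pins the column order explicitly instead of relying on dictionary order.

```python
import pandas as pd

hits = {
    'comp_1': {'property_4': 24, 'property_2': 45, 'property_3': 124, 'missing': 39, 'property_1': 16},
    'comp_2': {'property_4': 23, 'property_2': 49, 'property_3': 126, 'property_1': 16, 'missing': 38},
    'comp_3': {'property_4': 24, 'property_2': 43, 'property_1': 19, 'missing': 30, 'property_3': 116},
}
cmap = {'property_1': 'P1', 'property_2': 'P2', 'property_3': 'P3',
        'property_4': 'P4', 'missing': 'Missing'}

# Rename the inner keys first, so the frame is built with the new names.
renamed_hits = {
    comp: {cmap[key]: value for key, value in props.items()}
    for comp, props in hits.items()
}
df = pd.DataFrame.from_dict(renamed_hits, orient='index')

# Select the columns by label in the order you want them displayed.
order = ['Missing', 'P1', 'P2', 'P3', 'P4']
df = df[order]
```

Selecting by label keeps every value under its own name, so the result is the same on every run regardless of how the dictionary was ordered.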
Upvotes: 4