G.M

Reputation: 663

Can't change column names in pandas data

I have a python dictionary that looks like this:

hits = {'comp_1': {'property_4': 24, 'property_2': 45, 'property_3': 124, 'missing': 39, 'property_1': 16},
 'comp_2': {'property_4': 23, 'property_2': 49, 'property_3': 126, 'property_1': 16, 'missing': 38}, 
 'comp_3': {'property_4': 24, 'property_2': 43, 'property_1': 19, 'missing': 30, 'property_3': 116}}

When I load it into a pandas DataFrame and print it, it looks as follows:

import pandas as pd

df = pd.DataFrame.from_dict(hits, orient='index')
print(df)

Output:

        missing  property_1  property_2  property_3  property_4
comp_1       39          16          45         124          24
comp_2       38          16          49         126          23
comp_3       30          19          43         116          24

Now, I want to rename the columns, so I try:

df = pd.DataFrame.from_dict(hits, orient='index', columns=reversed(['Missing', 'P1', 'P2', 'P3', 'P4']))

which yields an empty DataFrame (I assume because these keys do not exist in the dictionary?):

Empty DataFrame
Columns: []
Index: []

If I try this instead:

df = pd.DataFrame.from_dict(hits, orient='index')
columns = reversed(['Missing', 'P1', 'P2', 'P3', 'P4'])
df.columns=columns

The columns are renamed, but the order is not preserved, so each time I run the code the numbers do not correspond to the columns, e.g.:

        P4  P3   P2  P1  Missing
comp_1  16  24  124  45       39
comp_2  16  23  126  49       38
comp_3  19  24  116  43       30

and:

        P4  P3  P2   P1  Missing
comp_1  24  16  39  124       45
comp_2  23  16  38  126       49
comp_3  24  19  30  116       43

I'm guessing I need to somehow provide the keys from the nested dictionary when I load the data into the DataFrame, but I'm not sure how to do it. Or is there something else I need to do?

Edit: I also tried renaming the columns with a dictionary, as follows:

df.rename({'missing': 'Missing', 'property_1': 'P1', 'property_2': 'P2', 'property_3': 'P3',
           'property_4': 'P4'})

but the old names are still shown.

Upvotes: 2

Views: 8033

Answers (2)

Subsum44

Reputation: 36

You can pass a dict to rename() to get a 1:1 mapping from old to new column names. Adding inplace=True renames the columns in the existing DataFrame; omitting it returns a new DataFrame with the new column names.

df.rename(columns = {
    'property_1': 'P1',
    'property_2': 'P2',
    'property_3': 'P3',
    'property_4': 'P4',
    'missing': 'Missing'
    }, inplace = True)
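
As a minimal sketch of why the attempt in the question's edit shows the old names (trimmed to two rows of the question's data; the variable names unchanged and renamed are only for illustration): without columns=, the mapping is applied to the index labels, and without inplace=True or an assignment the original DataFrame is left as it was.

```python
import pandas as pd

hits = {
    'comp_1': {'property_4': 24, 'property_2': 45, 'property_3': 124,
               'missing': 39, 'property_1': 16},
    'comp_2': {'property_4': 23, 'property_2': 49, 'property_3': 126,
               'property_1': 16, 'missing': 38},
}
cmap = {'property_1': 'P1', 'property_2': 'P2', 'property_3': 'P3',
        'property_4': 'P4', 'missing': 'Missing'}

df = pd.DataFrame.from_dict(hits, orient='index')

# Without `columns=`, the mapping targets the index labels (comp_1, comp_2);
# none of them match a key in cmap, so the column names stay the same.
unchanged = df.rename(cmap)

# With `columns=`, rename() returns a renamed copy. `df` itself keeps the
# old names unless the result is assigned back or inplace=True is passed.
renamed = df.rename(columns=cmap)

print(list(unchanged.columns))
print(list(renamed.columns))
```

Because the mapping is keyed by the old names, each value stays attached to its own column regardless of the order the columns happen to come out in.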

Upvotes: 1

cs95

Reputation: 402263

Unfortunately, the columns parameter of from_dict only specifies which columns you want to select. For example,

pd.DataFrame.from_dict(hits, orient='index', columns=['property_4'])

        property_4
comp_1          24
comp_2          23
comp_3          24

This selects only the "property_4" column and ignores everything else. Of course, this makes sense because dictionaries are not inherently ordered (insertion order is guaranteed only from Python 3.7 onwards). Your only options are to rename the keys or to rename the columns using DataFrame.rename().

cmap = {'property_1': 'P1', 'property_2': 'P2', 'property_3': 'P3', 
        'property_4': 'P4', 'missing': 'Missing'}
df = df.rename(columns=cmap)
df

        P4  P2   P3  Missing  P1
comp_1  24  45  124       39  16
comp_2  23  49  126       38  16
comp_3  24  43  116       30  19
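
The other option, renaming the keys before loading, could look roughly like the sketch below. It uses only built-in dict operations; the names order and renamed_hits are hypothetical, and it assumes every inner dict contains all five keys, as in the question's data.

```python
hits = {
    'comp_1': {'property_4': 24, 'property_2': 45, 'property_3': 124,
               'missing': 39, 'property_1': 16},
    'comp_2': {'property_4': 23, 'property_2': 49, 'property_3': 126,
               'property_1': 16, 'missing': 38},
    'comp_3': {'property_4': 24, 'property_2': 43, 'property_1': 19,
               'missing': 30, 'property_3': 116},
}
cmap = {'property_1': 'P1', 'property_2': 'P2', 'property_3': 'P3',
        'property_4': 'P4', 'missing': 'Missing'}

# The order in which the old keys should be visited (an assumption about
# the layout you want, not something pandas requires).
order = ['missing', 'property_1', 'property_2', 'property_3', 'property_4']

# Rebuild every inner dict with the new key names, looking each value up
# by its old key so values cannot drift to the wrong name.
renamed_hits = {
    comp: {cmap[key]: props[key] for key in order}
    for comp, props in hits.items()
}

print(renamed_hits['comp_1'])
```

Passing renamed_hits to pd.DataFrame.from_dict(renamed_hits, orient='index') should then give the new names directly. Whether the column order follows the key order depends on the Python and pandas versions in use, so selecting the columns explicitly, e.g. df[['Missing', 'P1', 'P2', 'P3', 'P4']], is the safer way to pin it.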

Upvotes: 4
