Uppercase precedence in pandas column name assignment

Question

Is there a reason why column name assignment in pandas favours uppercase over lowercase?

Example:

dframe = DataFrame({'city':['Alma','Brian Head', 'Fox Park'],
                    'altitude':[3158,3000,2762]})

returns a DataFrame with columns in the order altitude, city.

Whereas:

dframe = DataFrame({'City':['Alma','Brian Head', 'Fox Park'],
                    'altitude':[3158,3000,2762]})

returns a DataFrame with columns in the order City, altitude.

Is this pandas-specific or general Python behaviour?

JohnE · Accepted Answer

You didn't actually ask this, but I'm assuming there is an implied question about how to preserve the original ordering. If so, here are three ways:

1) Same basic dictionary constructor, but wrap in collections.OrderedDict (thanks to @shx2 for the correction):

import pandas as pd
from collections import OrderedDict

# An OrderedDict keeps the keys in the order they were inserted
df1 = pd.DataFrame( OrderedDict([ ('city',['Alma','Brian Head', 'Fox Park']),
                                  ('altitude',[3158,3000,2762]) ]))

2) Non-dictionary constructor where you specify the data array and the column names separately. Note that this constructor expects row-wise data rather than column-wise data as with the dictionary constructor, so the column lists are transposed with zip first:

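lst = [['Alma','Brian Head','Fox Park'],
       [3158,3000,2762]]
# zip(*lst) transposes columns into rows; list() is needed on Python 3,
# where map returns an iterator rather than a list
df2 = pd.DataFrame( list(map(list, zip(*lst))),
                    columns = ['city','altitude'] )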

3) Simplest way is probably just to specify the order after you create the dataframe (thanks to @EdChum for catching the error in the original post):

# dframe is the DataFrame from the question
df3 = dframe[['city','altitude']]
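
Another option, offered here as a sketch rather than as part of the original answer: when the data is a dict, the `columns` argument of the DataFrame constructor selects the keys and fixes their order. The names `data` and `df4` are made up for illustration.

```python
import pandas as pd

data = {'city': ['Alma', 'Brian Head', 'Fox Park'],
        'altitude': [3158, 3000, 2762]}

# With dict input, `columns` picks which keys to use and in what order
df4 = pd.DataFrame(data, columns=['city', 'altitude'])

print(list(df4.columns))
```

This keeps the column-wise dict input and avoids a separate reordering step.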

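Test that the results are the same for all three. Use DataFrame.equals here, because the built-in all() applied to a DataFrame only iterates over the column labels and so returns True regardless of the values:

In [149]: df1.equals(df2)
Out[149]: True

In [150]: df1.equals(df3)
Out[150]: True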
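
As for the question itself, the orderings shown there match plain lexical sorting of the dict keys, and that part is general Python behaviour: strings compare by code point, and every uppercase ASCII letter has a lower code point than any lowercase one. As far as I know, pandas before 0.23 (or any pandas on Python older than 3.6) sorted dict keys this way when building columns, while later versions keep the dict's insertion order. A minimal sketch using only built-ins (the variable names are mine):

```python
# Strings sort by code point, and 'A'-'Z' come before 'a'-'z'
upper_c = ord('C')
lower_a = ord('a')
print(upper_c, lower_a)

# Keys from the first example in the question
lower_keys = sorted(['city', 'altitude'])
print(lower_keys)

# Keys from the second example: 'City' now sorts ahead of 'altitude'
mixed_keys = sorted(['City', 'altitude'])
print(mixed_keys)
```

So the "uppercase precedence" comes from string comparison, not from anything pandas does with capital letters specifically.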
