Srinivas

Reputation: 656

Name of one specific column in Pandas is not changing

I am trying to rename a few columns in my DataFrame. The code below renames all of them except one. The name of the misbehaving column ('Tot Cases/1M pop') has no leading or trailing whitespace, and I cannot figure out what the problem is. Any suggestions would be appreciated.

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 216 entries, 0 to 215
Data columns (total 12 columns):
Country,Other       216 non-null object
TotalCases          216 non-null int64
NewCases            139 non-null object
TotalDeaths         178 non-null float64
NewDeaths           91 non-null object
TotalRecovered      207 non-null float64
ActiveCases         216 non-null int64
Serious,Critical    137 non-null float64
Tot Cases/1M pop    214 non-null float64
Deaths/1M pop       176 non-null float64
TotalTests          179 non-null float64
Tests/ 1M pop       179 non-null float64
dtypes: float64(7), int64(2), object(3)
memory usage: 20.4+ KB

df = df.rename(columns={'Country,Other': 'Country_or_Other',
                        'Serious,Critical': 'Serious_or_Critical',
                        'Tot Cases/1M pop': 'Cases_1M_pop',
                        'Deaths/1M pop': 'Deaths_per_1M_pop',
                        'Tests/ 1M pop': 'Tests_per_1M_pop'})

df.head(3)
    Country_or_Other    TotalCases  NewCases    TotalDeaths     NewDeaths   TotalRecovered  ActiveCases     Serious_or_Critical     Tot Cases/1M pop    Deaths_per_1M_pop   TotalTests  Tests_per_1M_pop
0   World   3481349     83255.0     244663.0    5215.0  1120908.0   2115778     50860.0     447.0   31.4    NaN     NaN
1   China   82875   1.0     4633.0  NaN     77685.0     557     37.0    58.0    3.0     NaN     NaN
2   USA     1160774     29744.0     67444.0     1691.0  173318.0    920012  16475.0     3507.0  204.0   6931132.0   20940.0

for col in df.columns:
    print(col, len(col))
Country_or_Other 16
TotalCases 10
NewCases 8
TotalDeaths 11
NewDeaths 9
TotalRecovered 14
ActiveCases 11
Serious_or_Critical 19
Tot Cases/1M pop 16
Deaths_per_1M_pop 17
TotalTests 10
Tests_per_1M_pop 16

print(df.columns.tolist())
['Country_or_Other',
 'TotalCases',
 'NewCases',
 'TotalDeaths',
 'NewDeaths',
 'TotalRecovered',
 'ActiveCases',
 'Serious_or_Critical',
 'Tot\xa0Cases/1M pop',
 'Deaths_per_1M_pop',
 'TotalTests',
 'Tests_per_1M_pop']

print([(i, hex(ord(i))) for i in df.columns[8]])
    [('T', '0x54'), ('o', '0x6f'), ('t', '0x74'), ('\xa0', '0xa0'), ('C', '0x43'), ('a', '0x61'), ('s', '0x73'), ('e', '0x65'), ('s', '0x73'), ('/', '0x2f'), ('1', '0x31'), ('M', '0x4d'), (' ', '0x20'), ('p', '0x70'), ('o', '0x6f'), ('p', '0x70')]

Upvotes: 1

Views: 299

Answers (2)

Pete

Reputation: 130

You could also rename that one column by position, assigning to its entry in the columns array directly:

df.columns.values[8] = "New name"
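Assigning into df.columns.values writes to the Index's backing array in place, which pandas does not document as a supported way to rename. A variant that still works by position but goes through rename is sketched below. The small stand-in frame, the position 2, and the new name are illustrative assumptions, not the data from the question:

```python
import pandas as pd

# Stand-in frame (hypothetical); the third column mimics the problematic
# name from the question, with a non-breaking space after "Tot".
df = pd.DataFrame(
    [[1, 2, 3]],
    columns=["TotalCases", "Serious,Critical", "Tot\xa0Cases/1M pop"],
)

# Look the current name up by position, so hidden characters in it
# do not have to be typed out.
position = 2  # would be 8 in the question's frame
df = df.rename(columns={df.columns[position]: "Cases_per_1M_pop"})

print(df.columns.tolist())
```

Because the old name is read from the frame itself, it matches exactly regardless of which kind of space it contains.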

Upvotes: 1

jezrael

Reputation: 863721

You can see the \xa0 character in the column name in the output of print(df.columns.tolist()).

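\xa0 is the non-breaking space in Latin-1 (ISO 8859-1), i.e. chr(160). It looks like an ordinary space when printed but is a different character, so the key 'Tot Cases/1M pop' in your rename call does not match the actual column name. Either use \xa0 in the key or replace it with an ordinary space first.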
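To see why the two names do not match, the comparison can be reproduced with plain strings. This is only a sketch; the variable names typed, actual, and cleaned are made up for illustration:

```python
import unicodedata

typed = "Tot Cases/1M pop"       # key as typed in rename(), ordinary space
actual = "Tot\xa0Cases/1M pop"   # name as stored in df.columns

print(typed == actual)                # False: they differ at index 3
print(unicodedata.name(actual[3]))    # NO-BREAK SPACE
print(ord(actual[3]))                 # 160

# Replacing the non-breaking space makes the strings identical.
cleaned = actual.replace("\xa0", " ")
print(cleaned == typed)               # True
```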

So change the key for the problematic column like this:

df = df.rename(columns={'Country,Other': 'Country_or_Other',
                       'Serious,Critical': 'Serious_or_Critical',
                       'Tot\xa0Cases/1M pop':'Cases_1M_pop',
                       'Deaths/1M pop':'Deaths_per_1M_pop',
                       'Tests/ 1M pop':'Tests_per_1M_pop'})
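If other column names might carry the same hidden character, another option is to normalize all names before renaming. The following is a minimal sketch on a stand-in frame (the sample values and the reduced set of columns are assumptions for illustration):

```python
import pandas as pd

# Stand-in frame (hypothetical) with the three per-million columns.
df = pd.DataFrame(
    [[447.0, 31.4, 20940.0]],
    columns=["Tot\xa0Cases/1M pop", "Deaths/1M pop", "Tests/ 1M pop"],
)

# Replace every non-breaking space in the column names with an ordinary space.
df.columns = df.columns.str.replace("\xa0", " ", regex=False)

# The keys can now be typed with ordinary spaces.
df = df.rename(columns={
    "Tot Cases/1M pop": "Cases_1M_pop",
    "Deaths/1M pop": "Deaths_per_1M_pop",
    "Tests/ 1M pop": "Tests_per_1M_pop",
})

print(df.columns.tolist())
```

With this approach the rename call from the question works unchanged, at the cost of one extra line.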

Upvotes: 3
