Rajarshi Bhadra

Reputation: 1944

Float to integer for column name in pandas

I have a pandas DataFrame that looks like this:

    Data Source   World Development Indicators  Unnamed: 2                         Unnamed: 3        Unnamed: 4        Unnamed: 5
    Country Name         Country Code         Indicator Name                     Indicator Code     1.960000e+03      1.961000e+03  
    Aruba                    ABW         GDP at market prices (constant 2010 US$)   NY.GDP.MKTP.KD           NaN             NaN    

To use the first row as the column headers, I am using this code:

data.columns = data.iloc[0]

As a result, the data frame is modified into:

    Country Name    Country Code    Indicator Name                              Indicator Code    1960.0          1961.0          1962.0
    Country Name    Country Code    Indicator Name                              Indicator Code    1.960000e+03    1.961000e+03
    Aruba           ABW             GDP at market prices (constant 2010 US$)    NY.GDP.MKTP.KD    NaN             NaN

My main problem is that for the columns with years as headers I am getting 1960.0, which I want to be an integer, i.e. 1960. Any help on this will be greatly appreciated.

Upvotes: 4

Views: 2626

Answers (2)

jezrael

Reputation: 863741

Another possible solution is to pass the skiprows or header parameter to read_csv, if the DataFrame is created from a CSV file:

import pandas as pd
import numpy as np
from io import StringIO  # pandas.compat.StringIO is not available in recent pandas versions

temp=u"""Data Source;World Development Indicators;Unnamed: 2;Unnamed: 3;Unnamed: 4;Unnamed: 5
Country Name;Country Code;Indicator Name;Indicator Code;1960;1961
Aruba;ABW;GDP at market prices (constant 2010 US$);NY.GDP.MKTP.KD;NaN;NaN"""
# after testing, replace 'StringIO(temp)' with 'filename.csv'
df = pd.read_csv(StringIO(temp), sep=";", skiprows=1)
print (df)
  Country Name Country Code                            Indicator Name  \
0        Aruba          ABW  GDP at market prices (constant 2010 US$)   

   Indicator Code  1960  1961  
0  NY.GDP.MKTP.KD   NaN   NaN 

df = pd.read_csv(StringIO(temp), sep=";", header=1)
print (df)
  Country Name Country Code                            Indicator Name  \
0        Aruba          ABW  GDP at market prices (constant 2010 US$)   

   Indicator Code  1960  1961  
0  NY.GDP.MKTP.KD   NaN   NaN  

If that is not possible, see MaxU's solution and add df = df[1:] to remove the first row from the data.
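For the case where the file cannot be re-read, here is a minimal sketch of promoting the first row to the header, turning whole-number float labels into integers, and then dropping that row with df[1:]. The frame `raw` and the helper `to_int_label` are hypothetical names made up for illustration; this is not necessarily the approach taken in the other answer.

```python
import numpy as np
import pandas as pd

# Hypothetical frame mimicking the question: the real header sits in row 0
# and the year labels have been read as floats.
raw = pd.DataFrame(
    [
        ["Country Name", "Country Code", 1960.0, 1961.0],
        ["Aruba", "ABW", np.nan, np.nan],
    ],
    columns=["Data Source", "World Development Indicators", "Unnamed: 2", "Unnamed: 3"],
)


def to_int_label(label):
    """Return whole-number floats as ints; leave every other label unchanged."""
    if isinstance(label, float) and label.is_integer():
        return int(label)
    return label


df = raw.copy()
df.columns = [to_int_label(c) for c in df.iloc[0]]  # promote row 0 to the header
df = df[1:].reset_index(drop=True)                  # drop the old header row

print(df.columns.tolist())
```

The year columns keep their float dtype for the data; only the labels change.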

Upvotes: 1

piRSquared

Reputation: 294546

option 1

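def rn(x):
    # format numeric labels without decimals; leave other labels untouched
    try:
        return '{:0.0f}'.format(x)
    except (ValueError, TypeError):
        return x

# rename(index=...) applies rn to each label (older pandas accepted rename_axis(rn) here)
df.T.set_index(0).rename(index=rn).T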
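A small runnable sketch of this option on made-up data, assuming a recent pandas where labels are altered with `rename(index=...)`. The two-row frame is a hypothetical stand-in for the question's data with the header still in row 0.

```python
import numpy as np
import pandas as pd


def rn(x):
    # format numeric labels without decimals; leave other labels untouched
    try:
        return '{:0.0f}'.format(x)
    except (ValueError, TypeError):
        return x


# Hypothetical data: default integer columns, real header in row 0
df = pd.DataFrame([
    ["Country Name", "Indicator Code", 1960.0, 1961.0],
    ["Aruba", "NY.GDP.MKTP.KD", np.nan, np.nan],
])

# Transpose, use the old row 0 as the index, clean the labels, transpose back
out = df.T.set_index(0).rename(index=rn).T

print(out.columns.tolist())
```

Note that the cleaned year labels are strings ('1960'), not integers; return int(x) from a float-only branch instead if integer labels are needed.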


Upvotes: 1
