Reputation: 1944
I have a pandas data frame that looks like this:
Data Source World Development Indicators Unnamed: 2 Unnamed: 3 Unnamed: 4 Unnamed: 5
Country Name Country Code Indicator Name Indicator Code 1.960000e+03 1.961000e+03
Aruba ABW GDP at market prices (constant 2010 US$) NY.GDP.MKTP.KD NaN NaN
To use the first row as the column headers, I am using this code:
data.columns = data.iloc[0]
As a result, the data frame is modified into:
Country Name Country Code Indicator Name Indicator Code 1960.0 1961.0 1962.0
Country Name Country Code Indicator Name Indicator Code 1.960000e+03 1.961000e+03
Aruba ABW GDP at market prices (constant 2010 US$) NY.GDP.MKTP.KD NaN NaN
Now my main problem is that for the columns with years as headers I am getting 1960.0, which I want to be an integer, i.e. 1960. Any help on this will be greatly appreciated.
Upvotes: 4
Views: 2626
Reputation: 863741
Another possible solution is to pass the skiprows or header parameter to read_csv, if you create the DataFrame from a CSV file:
import pandas as pd
import numpy as np
from io import StringIO
temp=u"""Data Source;World Development Indicators;Unnamed: 2;Unnamed: 3;Unnamed: 4;Unnamed: 5
Country Name;Country Code;Indicator Name;Indicator Code;1960;1961
Aruba;ABW;GDP at market prices (constant 2010 US$);NY.GDP.MKTP.KD;NaN;NaN"""
# after testing, replace 'StringIO(temp)' with 'filename.csv'
df = pd.read_csv(StringIO(temp), sep=";", skiprows=1)
print (df)
Country Name Country Code Indicator Name \
0 Aruba ABW GDP at market prices (constant 2010 US$)
Indicator Code 1960 1961
0 NY.GDP.MKTP.KD NaN NaN
df = pd.read_csv(StringIO(temp), sep=";", header=1)
print (df)
Country Name Country Code Indicator Name \
0 Aruba ABW GDP at market prices (constant 2010 US$)
Indicator Code 1960 1961
0 NY.GDP.MKTP.KD NaN NaN
If that is not possible, see MaxU's solution and add df = df[1:] to remove the first row from the data.
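For the case where the frame is already loaded and re-reading is not an option, the following is a minimal sketch of promoting the first row to the header, converting whole-number float labels to integers, and then dropping that row with data[1:]. The small frame and the helper name to_int_label are made up for illustration and are not taken from MaxU's answer.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the frame in the question: the real header sits in row 0.
data = pd.DataFrame([
    ["Country Name", "Country Code", "Indicator Name", "Indicator Code", 1960.0, 1961.0],
    ["Aruba", "ABW", "GDP at market prices (constant 2010 US$)", "NY.GDP.MKTP.KD", np.nan, np.nan],
])


def to_int_label(label):
    """Return whole-number floats as int; leave every other label unchanged."""
    if isinstance(label, float) and label.is_integer():
        return int(label)
    return label


# Promote row 0 to the header, converting 1960.0 to 1960 on the way.
data.columns = [to_int_label(value) for value in data.iloc[0]]

# Drop the row that is now duplicated in the header and renumber the index.
data = data[1:].reset_index(drop=True)

print(data.columns.tolist())
```

The isinstance check keeps the string labels (and any NaN) untouched, so only the year columns change.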
Upvotes: 1
Reputation: 294546
option 1
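def rn(x):
    # format floats such as 1960.0 as '1960'; leave other labels unchanged
    try:
        return '{:0.0f}'.format(x)
    except (ValueError, TypeError):
        return x

# rename applies rn to each label; recent pandas rejects a function in rename_axis
df.T.set_index(0).rename(index=rn).T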
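A self-contained run of this idea on a small made-up frame may help. It is only a sketch: the sample data is hypothetical, and it uses rename(index=rn) on the assumption that the installed pandas version no longer accepts a function in rename_axis.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the frame in the question: the real header sits in row 0.
df = pd.DataFrame([
    ["Country Name", "Country Code", "Indicator Name", "Indicator Code", 1960.0, 1961.0],
    ["Aruba", "ABW", "GDP at market prices (constant 2010 US$)", "NY.GDP.MKTP.KD", np.nan, np.nan],
])


def rn(x):
    # Floats are formatted without decimals; strings raise ValueError and pass through.
    try:
        return '{:0.0f}'.format(x)
    except (ValueError, TypeError):
        return x


# Transpose, use the old first row as the index, relabel it, and transpose back.
out = df.T.set_index(0).rename(index=rn).T

print(out.columns.tolist())
```

Note that the year labels produced this way are strings such as '1960', not integers; wrap the formatted value in int(...) inside rn if integer labels are required.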
Upvotes: 1