Reputation: 87
I'm trying to use the Country column as the index of the DataFrame below by calling DataFrame.set_index('Country'), but the index is not updated. I'm currently working on Python 3.7. Why doesn't this code set the index?
import pandas as pd
import numpy as np
df1 = pd.DataFrame({'Country':['Nigeria','Bangladesh','China'],
'population':[89765,98744,654520],
'Birth_Rate':[23.54,34.43,20.3],
'Update_Date':['2016-01-18','2016-02-15','2016-02-03']},
columns = ['Country','population','Birth_Rate','Update_Date'])
df2 = pd.DataFrame({'Country':['India','Sri Lanka','Dubai'],
'population':[98343,2453,57432],
'Birth_Rate':[33.54,44.44,23.3],
'Update_Date':['2016-01-18','2016-02-15','2016-02-03']},
columns =['Country','population','Birth_Rate','Update_Date'])
df3 = df2.append(df1)
df3.set_index('Country')
print(df3)
The actual output still has the default integer index:
      Country  population  Birth_Rate  Update_Date
0       India       98343       33.54   2016-01-18
1   Sri Lanka        2453       44.44   2016-02-15
2       Dubai       57432       23.30   2016-02-03
0     Nigeria       89765       23.54   2016-01-18
1  Bangladesh       98744       34.43   2016-02-15
2       China      654520       20.30   2016-02-03
but I'm expecting Country as the index:
            population  Birth_Rate  Update_Date
Country
India            98343       33.54   2016-01-18
Sri Lanka         2453       44.44   2016-02-15
Dubai            57432       23.30   2016-02-03
Nigeria          89765       23.54   2016-01-18
Bangladesh       98744       34.43   2016-02-15
China           654520       20.30   2016-02-03
Upvotes: 1
Views: 4261
Reputation: 16772
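set_index sets the DataFrame index (row labels) from one or more existing columns. By default (inplace=False) it returns a new DataFrame and leaves the original unchanged, which is why df3 looks the same when the result of df3.set_index('Country') is discarded. Either assign the result back (df3 = df3.set_index('Country')) or pass inplace=True.
The full signature is DataFrame.set_index(keys, drop=True, append=False, inplace=False, verify_integrity=False)
where the parameters are: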
keys : column label or list of column labels / arrays
drop : boolean, default True
    Delete columns to be used as the new index
append : boolean, default False
    Whether to append columns to existing index
inplace : boolean, default False
    Modify the DataFrame in place (do not create a new object)
verify_integrity : boolean, default False
    Check the new index for duplicates. Otherwise defer the check until necessary. Setting to False will improve the performance of this method.
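To make the drop, append and verify_integrity parameters above more concrete, here is a minimal sketch. The small frame and the names kept and multi are made up for illustration and are not part of the question's data.

```python
import pandas as pd

# Hypothetical frame with a duplicated country, used only for illustration.
df = pd.DataFrame({'Country': ['India', 'China', 'India'],
                   'population': [98343, 654520, 2453]})

# drop=False keeps 'Country' as a regular column as well as the index.
kept = df.set_index('Country', drop=False)
print(list(kept.columns))    # ['Country', 'population']

# append=True adds 'Country' as an extra level on top of the existing
# integer index, giving a MultiIndex instead of replacing the old index.
multi = df.set_index('Country', append=True)
print(multi.index.nlevels)   # 2

# verify_integrity=True raises ValueError if the new index has duplicates.
try:
    df.set_index('Country', verify_integrity=True)
except ValueError as exc:
    print('duplicate index rejected:', exc)
```

Note that append=True is probably not what you want if Country should be the only index level.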
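import pandas as pd
df1 = pd.DataFrame({'Country':['Nigeria','Bangladesh','China'],
                    'population':[89765,98744,654520],
                    'Birth_Rate':[23.54,34.43,20.3],
                    'Update_Date':['2016-01-18','2016-02-15','2016-02-03']},
                   columns = ['Country','population','Birth_Rate','Update_Date'])
df2 = pd.DataFrame({'Country':['India','Sri Lanka','Dubai'],
                    'population':[98343,2453,57432],
                    'Birth_Rate':[33.54,44.44,23.3],
                    'Update_Date':['2016-01-18','2016-02-15','2016-02-03']},
                   columns =['Country','population','Birth_Rate','Update_Date'])
# pd.concat works on all pandas versions; DataFrame.append was removed in pandas 2.0
df3 = pd.concat([df2, df1])
# inplace=True modifies df3 itself. append is left at its default (False),
# because append=True would keep the old integer index as an extra level.
df3.set_index('Country', inplace = True, drop = True)
print(df3)
OUTPUT:
            population  Birth_Rate  Update_Date
Country
India            98343       33.54   2016-01-18
Sri Lanka         2453       44.44   2016-02-15
Dubai            57432       23.30   2016-02-03
Nigeria          89765       23.54   2016-01-18
Bangladesh       98744       34.43   2016-02-15
China           654520       20.30   2016-02-03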
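For comparison, here is a small sketch of the two ways to make the new index stick: reassigning the returned DataFrame, or passing inplace=True. The two-row frame and the names by_country and result are made up for illustration.

```python
import pandas as pd

# Hypothetical two-row frame, used only for illustration.
df = pd.DataFrame({'Country': ['India', 'China'],
                   'population': [98343, 654520]})

# Calling set_index and discarding the result leaves df untouched.
df.set_index('Country')
print(df.index.name)           # None

# Option 1: bind the returned DataFrame to a name.
by_country = df.set_index('Country')
print(by_country.index.name)   # Country

# Option 2: modify df in place; the call itself then returns None.
result = df.set_index('Country', inplace=True)
print(result, df.index.name)   # None Country
```

Reassignment is usually the easier style to chain with other calls, while inplace=True avoids introducing a second name.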
Upvotes: 2