ishan

Reputation: 87

Problem setting the index of a pandas DataFrame

I'm trying to use the country names as the index of the DataFrame below by calling DataFrame.set_index(index_name), but the index is not updated. I'm working with Python 3.7. Why doesn't this code set the index?

import pandas as pd
import numpy as np

df1 = pd.DataFrame({'Country':['Nigeria','Bangladesh','China'],
                    'population':[89765,98744,654520],
                    'Birth_Rate':[23.54,34.43,20.3],
                    'Update_Date':['2016-01-18','2016-02-15','2016-02-03']},
                    columns = ['Country','population','Birth_Rate','Update_Date'])                                                        

df2 = pd.DataFrame({'Country':['India','Sri Lanka','Dubai'],
                    'population':[98343,2453,57432],
                    'Birth_Rate':[33.54,44.44,23.3],
                    'Update_Date':['2016-01-18','2016-02-15','2016-02-03']},
                    columns =['Country','population','Birth_Rate','Update_Date']) 

df3 = df2.append(df1)
df3.set_index('Country')
print(df3)

This is the output I actually get:

      Country   population  Birth_Rate Update_Date
0       India       98343       33.54  2016-01-18
1   Sri Lanka        2453       44.44  2016-02-15
2       Dubai       57432       23.30  2016-02-03
0     Nigeria       89765       23.54  2016-01-18
1  Bangladesh       98744       34.43  2016-02-15
2       China      654520       20.30  2016-02-03

but I'm expecting this:

            population  Birth_Rate   Update_Date
Country
India       98343       33.54      2016-01-18
Sri Lanka   2453        44.44      2016-02-15
Dubai       57432       23.30      2016-02-03
Nigeria     89765       23.54      2016-01-18
Bangladesh  98744       34.43      2016-02-15
China       654520      20.30      2016-02-03

Upvotes: 1

Views: 4261

Answers (1)

DirtyBit

Reputation: 16772

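To set the DataFrame index (row labels) from one or more existing columns, use DataFrame.set_index(keys, drop=True, append=False, inplace=False, verify_integrity=False).

By default (inplace=False), set_index returns a new DataFrame and leaves the original untouched. Your code discards that return value, which is why df3 still prints with its old index. Either assign the result back (df3 = df3.set_index('Country')) or pass inplace=True.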

The parameters are:

keys : column label or list of column labels / arrays

drop : boolean, default True

Delete columns to be used as the new index

append : boolean, default False

Whether to append columns to existing index

inplace : boolean, default False

Modify the DataFrame in place (do not create a new object)

verify_integrity : boolean, default False

Check the new index for duplicates. Otherwise defer the check until necessary. Setting to False will improve the performance of this method
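
To make the parameter descriptions above a bit more concrete, here is a minimal sketch of what each one does. The frames and the variable names (indexed, kept, multi, dupes, duplicate_error) are made up for illustration and are not the data from the question.

```python
import pandas as pd

# Small illustrative frame (made-up subset, not the full data from the question).
df = pd.DataFrame({'Country': ['India', 'China'],
                   'population': [98343, 654520]})

# Default call: returns a NEW frame; `df` itself is left unchanged.
indexed = df.set_index('Country')
print(list(indexed.index))   # ['India', 'China']
print(list(df.columns))      # ['Country', 'population']

# drop=False keeps 'Country' as a regular column in addition to the index.
kept = df.set_index('Country', drop=False)

# append=True adds 'Country' as an extra level next to the existing index,
# so the result has a two-level MultiIndex.
multi = df.set_index('Country', append=True)
print(multi.index.nlevels)   # 2

# verify_integrity=True raises ValueError when the new index has duplicates.
dupes = pd.DataFrame({'Country': ['India', 'India'], 'population': [1, 2]})
try:
    dupes.set_index('Country', verify_integrity=True)
    duplicate_error = False
except ValueError:
    duplicate_error = True
print(duplicate_error)       # True
```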

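import pandas as pd

df1 = pd.DataFrame({'Country':['Nigeria','Bangladesh','China'],
                    'population':[89765,98744,654520],
                    'Birth_Rate':[23.54,34.43,20.3],
                    'Update_Date':['2016-01-18','2016-02-15','2016-02-03']},
                    columns = ['Country','population','Birth_Rate','Update_Date'])

df2 = pd.DataFrame({'Country':['India','Sri Lanka','Dubai'],
                    'population':[98343,2453,57432],
                    'Birth_Rate':[33.54,44.44,23.3],
                    'Update_Date':['2016-01-18','2016-02-15','2016-02-03']},
                    columns =['Country','population','Birth_Rate','Update_Date'])

# Stack df1 below df2. pd.concat is used instead of df2.append(df1)
# because DataFrame.append was removed in pandas 2.0.
df3 = pd.concat([df2, df1])

# inplace=True modifies df3 itself instead of returning a new DataFrame.
# append=True keeps the existing integer index as the outer level, so the
# result has a two-level index; omit it to make Country the only level.
df3.set_index('Country', inplace = True,
                            append = True, drop = True)
print(df3)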
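
If you would rather end up with Country as the only index level, which is what the question seems to be after, one possible variant is to assign the result of set_index back to the variable and leave append at its default. This is only a sketch: the data is trimmed to two columns for brevity, and pd.concat is used on the assumption that you may be on a newer pandas where DataFrame.append is no longer available.

```python
import pandas as pd

# Trimmed-down version of the frames from the question (two columns only).
df1 = pd.DataFrame({'Country': ['Nigeria', 'Bangladesh', 'China'],
                    'population': [89765, 98744, 654520]})
df2 = pd.DataFrame({'Country': ['India', 'Sri Lanka', 'Dubai'],
                    'population': [98343, 2453, 57432]})

# Stack df1 below df2 (works on both old and new pandas versions).
df3 = pd.concat([df2, df1])

# set_index returns a new DataFrame, so assign it back to keep the change.
# With the default append=False, the old integer index is replaced.
df3 = df3.set_index('Country')
print(df3)
```

Assigning the result back does the same job as inplace=True here; which one to use is mostly a matter of taste.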


Upvotes: 2
