Luis Valencia

Reputation: 33998

Pandas dataframe set index not working correctly

I have 3 datasets which I need to join together by Country name:

merged_df = pd.merge(Energy, GDP, on="Country")
merged_df2 = pd.merge(merged_df, ScimEn, on="Country")
merged_df2.set_index('Country')

The assignment says that I have to:

  1. Select only specific columns
  2. Sort by rank
  3. Only take the first 15 rows based on rank.

so I did this:

df3 = merged_df2[['Country', 'Rank', 'Documents', 'Citable documents', 'Citations', 'Self-citations', 'Citations per document', 'H index', 'Energy Supply', 'Energy Supply per Capita', '% Renewable', '2006', '2007', '2008', '2009', '2010', '2011', '2012', '2013', '2014', '2015']]
df3.set_index('Country')
df4 = df3[['Country', 'Rank', 'Documents', 'Citable documents', 'Citations', 'Self-citations', 'Citations per document', 'H index', 'Energy Supply', 'Energy Supply per Capita', '% Renewable', '2006', '2007', '2008', '2009', '2010', '2011', '2012', '2013', '2014', '2015']]

df4 = df4.sort_values(by=['Rank'], ascending=True)
df4.set_index('Country')

print(df4.index)

and it prints:

Int64Index([3, 14, 9, 13, 11, 2, 5, 6, 4, 10, 8, 12, 7, 0, 1], dtype='int64')

but it should print 1,1,2,3,4..15

what am I missing?

Upvotes: 0

Views: 192

Answers (2)

ZacLanghorne

Reputation: 118

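set_index returns a new DataFrame and leaves the original unchanged by default. To modify the DataFrame itself, set the inplace argument to True: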

df3.set_index('Country',inplace = True)
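
One thing to watch with inplace=True: the call then returns None, so its result should not be assigned back. Here is a minimal sketch of the difference, using a made-up three-row frame (the values are placeholders, not the asker's data):

```python
import pandas as pd

# Hypothetical toy data standing in for the asker's merged frame.
df = pd.DataFrame({"Country": ["China", "Japan", "India"], "Rank": [1, 3, 8]})

# Without inplace, set_index returns a new frame and df is left untouched.
unchanged = df.set_index("Country")
index_before = list(df.index)   # still the default integer index

# With inplace=True, df itself is modified and the call returns None.
result = df.set_index("Country", inplace=True)
index_after = list(df.index)    # now the country names

print(index_before, result, index_after)
```

So writing df = df.set_index('Country', inplace=True) would replace df with None.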

Upvotes: 1

jezrael

Reputation: 862641

set_index returns a new DataFrame rather than changing the existing one, so you need to assign the result back:

df4 = df4.set_index('Country')

Or:

df4.set_index('Country', inplace=True)
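
For the whole assignment (select columns, sort by rank, keep the top rows, index by country), the steps can also be chained so that every result is assigned exactly once. This is only a sketch: the frame below is a made-up stand-in for merged_df2 with placeholder values and fewer columns, and it keeps the top 2 rows where the real task would use head(15).

```python
import pandas as pd

# Hypothetical stand-in for merged_df2; values are placeholders.
merged = pd.DataFrame({
    "Country": ["Japan", "China", "India", "Brazil"],
    "Rank": [3, 1, 8, 15],
    "Documents": [300, 1200, 150, 80],
})

top = (
    merged[["Country", "Rank", "Documents"]]  # 1. select the columns
    .sort_values(by="Rank")                   # 2. sort by rank, ascending
    .head(2)                                  # 3. keep the first rows by rank
    .set_index("Country")                     # index by country, result kept
)

print(top.index.tolist())  # ['China', 'Japan']
```

Because each method returns a new DataFrame, nothing is lost as long as the final result is bound to a name.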

Upvotes: 2
