olbinado11

Reputation: 161

df.set_index() not working as I expected

Setting the Index

As shown above, I have set the index to 'index'. I expected to be able to use the 'index' column for dropping rows, and to keep 'Barangay' as an ordinary feature column rather than as the index of my data frame.

Using Index to Drop records

As shown above, rows are still dropped using the 'Barangay' column as the index. I tried dropping by index [0, 1], but that raises an error.

Upvotes: 1

Views: 1638

Answers (1)

jezrael

Reputation: 862441

You need to assign the result back, because set_index() returns a new DataFrame and leaves the original unchanged:

city_prop = city_prop.set_index('index')

Or modify the DataFrame in place:

city_prop.set_index('index', inplace=True)
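
To illustrate the difference, here is a minimal sketch. The small city_prop frame below is a hypothetical stand-in with made-up values, not the asker's real data; it only assumes that an 'index' column and a 'Barangay' column exist.

```python
import pandas as pd

# Hypothetical stand-in for city_prop; values are invented for illustration.
city_prop = pd.DataFrame({
    "index": [0, 1, 2],
    "Barangay": [1, 2, 3],
    "Poverty rate": [18, 54, 63],
})

# Calling set_index without assigning the result leaves the frame unchanged.
city_prop.set_index("index")
columns_before = list(city_prop.columns)  # 'index' is still an ordinary column

# Assigning back makes the 'index' column the row index.
city_prop = city_prop.set_index("index")

# Rows can now be dropped by their 'index' labels; 'Barangay' stays a column.
trimmed = city_prop.drop([0, 1])
print(trimmed)
```

After the assignment, drop([0, 1]) looks the labels up in the 'index' values, so 'Barangay' is free to be used as a feature.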

EDIT:

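import pandas as pd

# the file stores one variable per row (name, unit, values...), so read it
# with the first two columns as the index and transpose
df = pd.read_csv('CityProperEskwenilaExtraIndicators.csv',
                 skiprows=1,
                 header=None,
                 sep=';',
                 index_col=[0, 1]).T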

print(df.head())
0 Barangay    Longitude     Latitude        Poverty rate Terrain type  \
1        #    See annex    See annex Per 100 inhabitants    See annex   
2        1  27,67231183   66,3112793                  18    Difficult   
3        2  65,15620167  53,32027629                  54    Difficult   
4        3  34,94438385   89,7970517                  63    Difficult   
5        4  10,97542641  84,26323733                  42       Normal   
6        5  26,05436012  61,30689679                  70    Difficult   

0 Roads needing repair  Access to WASH Access to clean water  \
1   kilometers of road % of population       % of population   
2          55,40469584            50,2                  71,2   
3          14,08228761            51,8                  88,9   
4          33,20044684              77                  97,4   
5          1,695918463            74,7                  52,1   
6          85,08259271            70,1                  99,3   

0 Violent incidents     Homicides  
1     rate per 100K rate per 100K  
2              7,72   6,833797715  
3               8,3   5,513650409  
4              3,72   2,931838433  
5              6,26   5,883509349  
6              6,55   5,348430398  

# replace decimal commas with dots in every string cell
df = df.replace(',', '.', regex=True)
# remove the second column level (the units row)
df.columns = df.columns.droplevel(1)
# columns to convert to float: everything except these two
excluded = ['Terrain type', 'Poverty rate']
cols = df.columns.difference(excluded)
# convert to floats
df[cols] = df[cols].astype(float)
# convert to integer
df['Poverty rate'] = df['Poverty rate'].astype(int)
print(df.head())
0  Barangay  Longitude   Latitude  Poverty rate Terrain type  \
2       1.0  27.672312  66.311279            18    Difficult   
3       2.0  65.156202  53.320276            54    Difficult   
4       3.0  34.944384  89.797052            63    Difficult   
5       4.0  10.975426  84.263237            42       Normal   
6       5.0  26.054360  61.306897            70    Difficult   

0  Roads needing repair  Access to WASH  Access to clean water  \
2             55.404696            50.2                   71.2   
3             14.082288            51.8                   88.9   
4             33.200447            77.0                   97.4   
5              1.695918            74.7                   52.1   
6             85.082593            70.1                   99.3   

0  Violent incidents  Homicides  
2               7.72   6.833798  
3               8.30   5.513650  
4               3.72   2.931838  
5               6.26   5.883509  
6               6.55   5.348430

print(df.dtypes)
0
Barangay                 float64
Longitude                float64
Latitude                 float64
Poverty rate               int32
Terrain type              object
Roads needing repair     float64
Access to WASH           float64
Access to clean water    float64
Violent incidents        float64
Homicides                float64
dtype: object
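
The same steps can be tried without the original file. The sketch below is a cut-down, self-contained version: the inline text is a hypothetical miniature of the CSV (one variable per row, then its unit, then the values), and its first line is a made-up placeholder for whatever line skiprows=1 discards in the real file.

```python
import io
import pandas as pd

# Hypothetical miniature of the semicolon-separated file; values are truncated
# copies of the ones printed above, and the first line is a placeholder.
raw = (
    "placeholder line skipped by skiprows\n"
    "Barangay;#;1;2;3\n"
    "Longitude;See annex;27,67;65,15;34,94\n"
    "Poverty rate;Per 100 inhabitants;18;54;63\n"
    "Terrain type;See annex;Difficult;Difficult;Normal\n"
)

# Read name and unit as a two-level index, then transpose so they become
# two-level column labels.
df = pd.read_csv(io.StringIO(raw), skiprows=1, header=None,
                 sep=";", index_col=[0, 1]).T

# Decimal commas to dots, then drop the units level.
df = df.replace(",", ".", regex=True)
df.columns = df.columns.droplevel(1)

# Convert everything except the text and integer columns to float.
excluded = ["Terrain type", "Poverty rate"]
cols = df.columns.difference(excluded)
df[cols] = df[cols].astype(float)
df["Poverty rate"] = df["Poverty rate"].astype(int)

print(df.dtypes)
```

Every cell is read as text at first because each CSV column mixes numbers and words, which is why the comma replacement has to happen before the conversion to float.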

Upvotes: 2
