Reputation: 189
The column below in my DataFrame needs to be converted to int:
dsAttendEnroll.District.head()
0 DISTRICT 01
1 DISTRICT 02
2 DISTRICT 03
3 DISTRICT 04
4 DISTRICT 05
Name: District, dtype: object
Using astype raises the error below. How can this be done?
dsAttendEnroll.District = dsAttendEnroll.District.map(lambda x: x[-2:]).astype(int)
ValueError: invalid literal for long() with base 10: 'LS'
Upvotes: 2
Views: 2269
Reputation: 863431
You can use str.split, select the second element of each resulting list with str[1], and then pass the result to to_numeric with the parameter errors='coerce', which converts non-numeric values to NaN:
print (df)
District
0 DISTRICT 01
1 DISTRICT 02
2 DISTRICT 03
3 DISTRICT 04
4 DISTRICT 05
5 DISTRICT LS
print (df.District.str.split().str[1])
0 01
1 02
2 03
3 04
4 05
5 LS
Name: District, dtype: object
print (pd.to_numeric(df.District.str.split().str[1], errors='coerce'))
0 1.0
1 2.0
2 3.0
3 4.0
4 5.0
5 NaN
Name: District, dtype: float64
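For anyone who wants to run the split approach end to end, here is a minimal self-contained sketch. The frame is rebuilt by hand from the printed sample above, and the names codes and numbers are only illustrative:

```python
import pandas as pd

# Sample data rebuilt from the output shown above; 'DISTRICT LS' is the
# row that cannot be parsed as a number.
df = pd.DataFrame({
    "District": [
        "DISTRICT 01", "DISTRICT 02", "DISTRICT 03",
        "DISTRICT 04", "DISTRICT 05", "DISTRICT LS",
    ]
})

# Split each string on whitespace and keep the second token.
codes = df["District"].str.split().str[1]

# errors='coerce' turns unparseable tokens such as 'LS' into NaN,
# which is why the result is float64 rather than an integer dtype.
numbers = pd.to_numeric(codes, errors="coerce")
print(numbers)
```

Because NaN is a float, the column cannot stay a plain int as long as the unparseable row is kept.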
Another solution is to slice the last 2 characters:
print (df.District.str[-2:])
0 01
1 02
2 03
3 04
4 05
5 LS
Name: District, dtype: object
print (pd.to_numeric(df.District.str[-2:], errors='coerce'))
0 1.0
1 2.0
2 3.0
3 4.0
4 5.0
5 NaN
Name: District, dtype: float64
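Both results above are float64, while the question asks for int. A possible follow-up, sketched under the assumption that your pandas version supports the nullable "Int64" dtype (0.24 or later), is either to cast to that dtype or to drop the unparseable rows first. The variable names are illustrative and the sample frame is rebuilt by hand:

```python
import pandas as pd

# Same sample data as printed above, rebuilt by hand.
df = pd.DataFrame({
    "District": [
        "DISTRICT 01", "DISTRICT 02", "DISTRICT 03",
        "DISTRICT 04", "DISTRICT 05", "DISTRICT LS",
    ]
})

numbers = pd.to_numeric(df["District"].str[-2:], errors="coerce")

# Option 1: keep every row and use the nullable integer dtype
# (assumes pandas 0.24+); the 'LS' row becomes a missing value.
as_nullable_int = numbers.astype("Int64")

# Option 2: drop the rows that could not be parsed, then cast to a
# plain NumPy integer dtype.
as_plain_int = numbers.dropna().astype(int)

print(as_nullable_int)
print(as_plain_int)
```

Which option fits depends on whether rows like 'DISTRICT LS' should remain in the data.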
Upvotes: 3