Reputation: 123
I have a column with the following data:
df['Exp'] = ['10+ years', '8 years', '6 years', '7 years', '5 years','1 year', '< 1 year', '4 years', '3 years', '2 years', '9 years']
I need to convert this column to int format.
How can I do it?
Thanks !
Upvotes: 0
Views: 821
Reputation: 11
This should do the trick:
df.Exp.str.extract(r'(\d{1,})').astype(int)
For clarity: \d matches a single digit, and {1,} requires at least one of them, so the pattern captures the first run of digits in each string.
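To see what the pattern picks up, here is a small sketch using Python's built-in re module on a few of the strings from the question. It only illustrates the regex itself (not pandas), and it also checks that {1,} behaves the same as the shorter + quantifier:

```python
import re

# A few values taken from the question's column.
samples = ['10+ years', '< 1 year', '1 year']

# \d matches one digit; {1,} means "one or more", which is the same as +.
first_numbers = [re.search(r'\d{1,}', s).group() for s in samples]
same_with_plus = [re.search(r'\d+', s).group() for s in samples]

print(first_numbers)                    # ['10', '1', '1']
print(first_numbers == same_with_plus)  # True
```

Note that the captured values are still strings at this point, which is why the astype(int) step is needed afterwards.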
EDIT: (Sorry, I didn't read the question properly.) To convert the column in place, you could do:
df['Exp'] = df.Exp.str.extract(r'(\d{1,})', expand=False).astype(int)
Assuming you want rows that contain no digits (which come back as NaN) filled with -1, you could do:
df['Exp'] = df.Exp.str.extract(r'(\d{1,})', expand=False).fillna(-1).astype(int)
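As a rough sketch of the no-match case, the snippet below uses a made-up three-row Series in which the last value ('unknown') is hypothetical and not from the question. It goes through pd.to_numeric before filling, which is one alternative route rather than the only way to do it:

```python
import pandas as pd

# Hypothetical sample: the last entry contains no digits at all.
s = pd.Series(['10+ years', '< 1 year', 'unknown'])

# expand=False returns a Series; rows without a match become missing values.
digits = s.str.extract(r'(\d+)', expand=False)

# Convert the digit strings to numbers, fill the missing row, then cast to int.
years = pd.to_numeric(digits).fillna(-1).astype(int)

print(years.tolist())  # [10, 1, -1]
```

Without the fillna step, the final astype(int) would fail on the row that has no digits.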
Upvotes: 1
Reputation: 16172
import pandas as pd
df = pd.DataFrame({'Exp': ['10+ years', '8 years', '6 years', '7 years', '5 years','1 year', '< 1 year', '4 years', '3 years', '2 years', '9 years']})
df['Exp'] = df['Exp'].replace(r'\D', '', regex=True).astype(int)
Output:
    Exp
0    10
1     8
2     6
3     7
4     5
5     1
6     1
7     4
8     3
9     2
10    9
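For what it's worth, here is a small sketch of what stripping every non-digit does to individual strings, using the standard re module. The helper name strip_non_digits and the last two inputs are made up for illustration and do not appear in the question's data:

```python
import re

def strip_non_digits(text):
    """Remove every character that is not a digit (hypothetical helper)."""
    return re.sub(r'\D', '', text)

print(strip_non_digits('10+ years'))     # '10'
print(strip_non_digits('< 1 year'))      # '1'

# Hypothetical inputs that are not in the question's column:
print(strip_non_digits('2 to 3 years'))  # '23' (digits get concatenated)
print(strip_non_digits('unknown'))       # ''   (int('') would raise ValueError)
```

So this approach works for the data shown, but it assumes every value contains exactly one number. Also note that '< 1 year' and '1 year' both end up as 1.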
Upvotes: 1