Reputation: 794
I have a column like this:
Age
15-20 years old
20-25 years old
I want this as output:
Age_Min Age_Max
15 20
20 25
I tried str.strip(), but with no success so far.
I also tried d[['Age_Min','Age_Max']] = d['Age'].str.split('-', expand=True)
and the result is almost there, except that Age_Max still contains the trailing " years old". Is there a way to keep only the integers and drop the rest of the string?
Any tips?
Upvotes: 0
Views: 47
Reputation: 34056
Use Series.str.split with expand=True, then keep only the first whitespace-separated token of Age_Max:
In [858]: out = df['Age'].str.split('-', expand=True).rename(columns={0:'Age_Min', 1: 'Age_Max'})
In [860]: out['Age_Max'] = out['Age_Max'].str.split().str[0]
In [861]: out
Out[861]:
Age_Min Age_Max
0 15 20
1 20 25
Or use a regex with Series.str.extract to pull out the "15-20" part first, then split it:
In [870]: out = df['Age'].str.extract(r"(\d*-?\d+)")[0].str.split('-', expand=True).rename(columns={0:'Age_Min', 1: 'Age_Max'})
In [871]: out
Out[871]:
Age_Min Age_Max
0 15 20
1 20 25
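Note that both approaches above leave the columns as strings. Since the question asks for integers, here is a minimal sketch of one possible variant, assuming every row follows the "<min>-<max> years old" pattern: named capture groups in Series.str.extract become the column names directly, and astype(int) converts the result. The sample frame is rebuilt here only so the snippet runs on its own.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"Age": ["15-20 years old", "20-25 years old"]})

# Each named group becomes a column of the extracted frame.
# astype(int) assumes every row matched; a non-matching row would
# produce NaN and make the integer conversion raise.
out = df["Age"].str.extract(r"(?P<Age_Min>\d+)-(?P<Age_Max>\d+)").astype(int)
print(out)
```

If some rows might not match the pattern, pd.to_numeric(..., errors='coerce') on each column is a safer conversion than astype(int), at the cost of getting floats with NaN.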
Upvotes: 2