merchmallow

Reputation: 794

Stripping ints from a string in pandas column

I have a column like this:

      Age
15-20 years old
20-25 years old

I want this as output:

Age_Min   Age_Max
  15         20
  20         25

I tried str.strip() without success. I also tried

d[['Age_Min','Age_Max']]=d['Age'].str.split('-',expand=True)

and the result is almost there, but Age_Max still contains the trailing text (e.g. "20 years old"). Is there a way to keep only the integers and drop the rest of the string?

Any tips?

Upvotes: 0

Views: 47

Answers (1)

Mayank Porwal

Reputation: 34056

Use Series.str.split with expand=True, then keep only the first whitespace-separated token of Age_Max:

In [858]: out = df['Age'].str.split('-', expand=True).rename(columns={0:'Age_Min', 1: 'Age_Max'})
    
In [860]: out['Age_Max'] = out['Age_Max'].str.split().str[0]

In [861]: out
Out[861]: 
  Age_Min Age_Max
0      15      20
1      20      25
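Note that both columns above still hold strings, while the question asks for integers. A minimal sketch of the same split followed by a conversion, assuming every row follows the "N-M years old" pattern; the sample frame is made up to mirror the question:

```python
import pandas as pd

# Hypothetical sample data mirroring the column in the question.
df = pd.DataFrame({"Age": ["15-20 years old", "20-25 years old"]})

# Split on the hyphen into two columns and name them.
out = df["Age"].str.split("-", expand=True).rename(columns={0: "Age_Min", 1: "Age_Max"})

# Age_Max is still "20 years old": keep only the first whitespace-separated token.
out["Age_Max"] = out["Age_Max"].str.split().str[0]

# Convert both columns from strings to integers.
# This raises if any row does not match the assumed pattern.
out = out.astype(int)
```

If some rows may not match the pattern, pd.to_numeric(..., errors="coerce") on each column is a more forgiving alternative, at the cost of getting floats with NaN.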

Or use a regex with Series.str.extract (note the raw string, which avoids invalid-escape warnings for \d):

In [870]: out = df['Age'].str.extract(r"(\d*\-?\d+)")[0].str.split('-', expand=True).rename(columns={0:'Age_Min', 1: 'Age_Max'})

In [871]: out
Out[871]: 
  Age_Min Age_Max
0      15      20
1      20      25
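As a variation on the regex approach, named capture groups let Series.str.extract produce both columns in one step, and astype(int) then yields real integers. This is only a sketch: the sample frame is invented to match the question, and it assumes every row contains a "N-M" range.

```python
import pandas as pd

# Hypothetical sample data mirroring the column in the question.
df = pd.DataFrame({"Age": ["15-20 years old", "20-25 years old"]})

# Named groups become the column names of the extracted DataFrame.
ages = df["Age"].str.extract(r"(?P<Age_Min>\d+)-(?P<Age_Max>\d+)")

# Rows that do not match would give NaN and make astype(int) raise,
# so this conversion relies on the assumption that every row matches.
ages = ages.astype(int)

# Attach the new columns to the original frame.
df = df.join(ages)
```

Here the pattern captures the two numbers directly, so no second split is needed.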

Upvotes: 2
