duckman

Reputation: 747

Create column from a substring of another column

I have a Pandas dataframe object. I want to create new columns from substrings of an existing column. My data looks like this:

   Date      variable                want1      want2    want3
0  02-01-08  Australia - Sydney - A  Australia  Sydney   A
1  03-01-08  Australia - Sydney - A  Australia  Sydney   A
2  04-01-08  Australia - Sydney - A  Australia  Sydney   A
3  05-01-08  Canada - Toronto - B    Canada     Toronto  B
4  06-01-08  Canada - Toronto - B    Canada     Toronto  B

Here variable is the existing column, and want1 through want3 are the columns I need to create.

Upvotes: 2

Views: 4838

Answers (2)

piRSquared

Reputation: 294258

Use pd.Series.str.extract with named capture groups; each group name becomes a column name:

pat = '(?P<want1>.*) - (?P<want2>.*) - (?P<want3>.*)'
df.join(df.variable.str.extract(pat, expand=True))

       Date                variable      want1    want2 want3
0  02-01-08  Australia - Sydney - A  Australia   Sydney     A
1  03-01-08  Australia - Sydney - A  Australia   Sydney     A
2  04-01-08  Australia - Sydney - A  Australia   Sydney     A
3  05-01-08    Canada - Toronto - B     Canada  Toronto     B
4  06-01-08    Canada - Toronto - B     Canada  Toronto     B
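
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The sample frame is rebuilt from the question's data, and the third row ("no separator here") is a hypothetical value added only to show that rows which do not match the pattern come back as NaN rather than raising an error.

```python
import pandas as pd

# Sample data rebuilt from the question; the last row is a hypothetical
# non-matching value added for illustration.
df = pd.DataFrame({
    "Date": ["02-01-08", "05-01-08", "06-01-08"],
    "variable": [
        "Australia - Sydney - A",
        "Canada - Toronto - B",
        "no separator here",
    ],
})

# Named groups become the column names of the extracted frame.
pat = r"(?P<want1>.*) - (?P<want2>.*) - (?P<want3>.*)"
parts = df["variable"].str.extract(pat, expand=True)

# join aligns on the index, so each extracted row lines up with its source row.
result = df.join(parts)
print(result)
```

Note that df.join returns a new frame and leaves df unchanged, so assign the result if you want to keep it.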

Upvotes: 1

jpp

Reputation: 164653

You can use pd.Series.str.split for this:

df[['want1', 'want2', 'want3']] = df['variable'].str.split(' - ', expand=True)
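
One caveat: the assignment above expects the split to produce exactly three columns. As a hedged sketch, passing n=2 caps the number of splits at two, so a value with an extra " - " keeps the remainder in the last column instead of producing a fourth one. The "B - extra" row below is a hypothetical value, not from the question.

```python
import pandas as pd

# Sample data based on the question; the second row's trailing " - extra"
# is a hypothetical value added to show the effect of n=2.
df = pd.DataFrame({
    "Date": ["02-01-08", "05-01-08"],
    "variable": ["Australia - Sydney - A", "Canada - Toronto - B - extra"],
})

# n=2 limits the split to two separators, giving at most three pieces per row.
parts = df["variable"].str.split(" - ", n=2, expand=True)
parts.columns = ["want1", "want2", "want3"]

df = df.join(parts)
print(df)
```

This assumes at least one row contains two separators; if no row does, the split yields fewer than three columns and the column assignment fails.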

Upvotes: 5
