Reputation: 747
I have a Pandas dataframe object. I want to create new columns from substrings of an existing column. My data looks like this:
       Date                variable      want1    want2  want3
0  02-01-08  Australia - Sydney - A  Australia   Sydney      A
1  03-01-08  Australia - Sydney - A  Australia   Sydney      A
2  04-01-08  Australia - Sydney - A  Australia   Sydney      A
3  05-01-08    Canada - Toronto - B     Canada  Toronto      B
4  06-01-08    Canada - Toronto - B     Canada  Toronto      B
where want1 to want3 are the columns I need to create.
Upvotes: 2
Views: 4838
Reputation: 294258
Use pd.Series.str.extract with named capture groups:
pat = '(?P<want1>.*) - (?P<want2>.*) - (?P<want3>.*)'
df.join(df.variable.str.extract(pat, expand=True))
       Date                variable      want1    want2  want3
0  02-01-08  Australia - Sydney - A  Australia   Sydney      A
1  03-01-08  Australia - Sydney - A  Australia   Sydney      A
2  04-01-08  Australia - Sydney - A  Australia   Sydney      A
3  05-01-08    Canada - Toronto - B     Canada  Toronto      B
4  06-01-08    Canada - Toronto - B     Canada  Toronto      B
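For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the question's sample (the construction is my assumption about how the data is stored, and `result` is just an illustrative name). The named groups in the pattern become the new column names, and rows that do not match the pattern would get NaN in all three columns.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to be plain strings).
df = pd.DataFrame({
    "Date": ["02-01-08", "03-01-08", "04-01-08", "05-01-08", "06-01-08"],
    "variable": ["Australia - Sydney - A"] * 3 + ["Canada - Toronto - B"] * 2,
})

# Each named group (?P<name>...) becomes a column called `name`.
pat = r"(?P<want1>.*) - (?P<want2>.*) - (?P<want3>.*)"

# extract() returns a new DataFrame; join() attaches it by index.
result = df.join(df["variable"].str.extract(pat, expand=True))
print(result)
```

Note that df.join returns a new object, so assign it back (df = df.join(...)) if you want to keep the columns.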
Upvotes: 1
Reputation: 164653
You can use pd.Series.str.split for this:
df[['want1', 'want2', 'want3']] = df['variable'].str.split(' - ', expand=True)
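A runnable sketch of the same idea, with the sample data rebuilt by hand from the question (an assumption on my part). I have also added n=2, which is not in the one-liner above: it caps the split at two separators, so the result never has more than three columns even if the last field happens to contain ' - '.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to be plain strings).
df = pd.DataFrame({
    "Date": ["02-01-08", "03-01-08", "04-01-08", "05-01-08", "06-01-08"],
    "variable": ["Australia - Sydney - A"] * 3 + ["Canada - Toronto - B"] * 2,
})

# expand=True returns a DataFrame with one column per piece;
# n=2 limits the split to at most two cuts (three pieces).
parts = df["variable"].str.split(" - ", n=2, expand=True)

# Assign the three pieces to the new columns in place.
df[["want1", "want2", "want3"]] = parts
print(df)
```

Unlike the join approach, this modifies df directly.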
Upvotes: 5