Reputation: 1439
Let's say I have data of the records of two sports teams.
d = {'Team': [1, 2], 'Record': ['5-0', '3-2']}
df = pd.DataFrame(data=d)
df
Team Record
0 1 5-0
1 2 3-2
I want to split and append the record into new columns for each team like so:
d = {'Team': [1, 2], 'Record': ['5-0', '3-2'], 'Wins': [5, 3], 'Losses' : [0, 2]}
df = pd.DataFrame(data=d)
df
Team Record Wins Losses
0 1 5-0 5 0
1 2 3-2 3 2
The data in the Records column is an object, due to the hyphen.
(df.dtypes)
Team int64
Record object
dtype: object
How would I go about doing this? Would it be some sort of regex, and then a list comprehension to go through each row of the Record column? I would like to do this for around 400 teams in a dataframe. Thanks in advance for the help.
Upvotes: 1
Views: 73
Reputation: 59549
If the data are well behaved enough you can use Series.str.extract
with named capture groups (Number)-(Number)
. Then concat back.
pd.concat([df, df['Record'].str.extract(r'(?P<Wins>\d)-(?P<Losses>\d)')], axis=1)
# Team Record Wins Losses
#0 1 5-0 5 0
#1 2 3-2 3 2
Upvotes: 1
Reputation: 28699
use pandas string methods, specifically, str split and str get, and assign the extracts to new columns :
df = (df
.assign(Wins= df.Record.str.split('-').str.get(0),
Losses = df.Record.str.split('-').str.get(-1)
)
)
df
Team Record Wins Losses
0 1 5-0 5 0
1 2 3-2 3 2
Upvotes: 1