bismo
bismo

Reputation: 1439

How to Split the Contents of a Single Pandas Dataframe Column into Multiple New Columns

Let's say I have data of the records of two sports teams.

d = {'Team': [1, 2], 'Record': ['5-0', '3-2']}
df = pd.DataFrame(data=d)
df

    Team    Record
0     1      5-0
1     2      3-2

I want to split and append the record into new columns for each team like so:

d = {'Team': [1, 2], 'Record': ['5-0', '3-2'], 'Wins': [5, 3], 'Losses' : [0, 2]}
df = pd.DataFrame(data=d)
df


  Team  Record  Wins    Losses
0   1    5-0     5         0
1   2    3-2     3         2

The data in the Records column is an object, due to the hyphen.

(df.dtypes)

Team       int64
Record    object
dtype: object

How would I go about doing this? Would it be some sort of regex, and then a list comprehension to go through each row of the Record column? I would like to do this for around 400 teams in a dataframe. Thanks in advance for the help.

Upvotes: 1

Views: 73

Answers (2)

ALollz
ALollz

Reputation: 59549

If the data are well behaved enough you can use Series.str.extract with named capture groups (Number)-(Number). Then concat back.

pd.concat([df, df['Record'].str.extract(r'(?P<Wins>\d)-(?P<Losses>\d)')], axis=1)
#   Team Record Wins Losses
#0     1    5-0    5      0
#1     2    3-2    3      2

Upvotes: 1

sammywemmy
sammywemmy

Reputation: 28699

use pandas string methods, specifically, str split and str get, and assign the extracts to new columns :

    df = (df
         .assign(Wins= df.Record.str.split('-').str.get(0),
                 Losses = df.Record.str.split('-').str.get(-1)
            )
          )

df


  Team  Record  Wins    Losses
0   1    5-0     5         0
1   2    3-2     3         2

Upvotes: 1

Related Questions