MotaBtw

Reputation: 13

Create new columns derived from existing columns in pandas

I need to split the values of one column into three columns.

df['Campaign name']
0     US_FEMALE_20to30
1        US_MIX_35to45
2     US_FEMALE_20to30
3        US_MIX_35to45
4       US_MALE_30to35
5        US_MIX_35to45

In the end, it should look like this:

   region   gender   age
0   US      FEMALE   20to30
1   US      MIX      35to45
2   US      FEMALE   20to30
3   US      MIX      35to45
4   US      MALE     30to35
5   US      MIX      35to45

Thanks a lot.

Upvotes: 0

Views: 185

Answers (2)

mozway

Reputation: 260640

pandas.Series.str.extract with named groups is also a nice alternative:

df['Campaign name'].str.extract('(?P<region>.*)_(?P<gender>.*)_(?P<age>.*)')

output:

  region  gender     age
0     US  FEMALE  20to30
1     US     MIX  35to45
2     US  FEMALE  20to30
3     US     MIX  35to45
4     US    MALE  30to35
5     US     MIX  35to45
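
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data from the question and joins the extracted columns back onto the frame. The `[^_]+` character class is my own substitution for `.*`, on the assumption that none of the three fields contains an underscore, and `pattern` and `parts` are illustrative names only.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Campaign name": [
        "US_FEMALE_20to30", "US_MIX_35to45", "US_FEMALE_20to30",
        "US_MIX_35to45", "US_MALE_30to35", "US_MIX_35to45",
    ]
})

# Named groups become the column names of the extracted frame.
# [^_]+ matches one or more non-underscore characters
# (assumption: a field never contains "_").
pattern = r"^(?P<region>[^_]+)_(?P<gender>[^_]+)_(?P<age>[^_]+)$"
parts = df["Campaign name"].str.extract(pattern)

# Attach the new columns to the original frame by index;
# rows that do not match the pattern would get NaN.
df = df.join(parts)
print(df)
```

Anchoring the pattern with `^` and `$` means a value with too few or too many underscores yields NaN in all three columns rather than a partial match, which may or may not be what you want.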

Upvotes: 0

Sonia Samipillai

Reputation: 620

Use the str.split function.

In str.split, you:

  1. Specify the delimiter in quotes.
  2. Use the 'n' parameter to limit how many splits are made.
  3. Use the 'expand' parameter to expand the pieces into separate columns.

Then create those columns in df as shown below:

# new data frame with the split values as columns 0, 1 and 2
# (the question's column is 'Campaign name'; use whichever name your df has)
new = df["Campaign_name"].str.split("_", n=2, expand=True)

# making separate columns from the new data frame
df["region"] = new[0]
df["gender"] = new[1]
df["age"] = new[2]

Output using df.head()

    Campaign_name     region    gender   age
0   US_FEMALE_20to30    US      FEMALE   20to30
1   US_MIX_35to45       US      MIX      35to45
2   US_FEMALE_20to30    US      FEMALE   20to30
3   US_MIX_35to45       US      MIX      35to45
4   US_MALE_30to35      US      MALE     30to35
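
The three assignments can also be collapsed into one. The sketch below rebuilds the sample from the question (so the column is "Campaign name", with a space) and assigns all three columns in a single statement. It assumes every value contains exactly two underscores; with a different number of pieces the assignment would not line up.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Campaign name": [
        "US_FEMALE_20to30", "US_MIX_35to45", "US_FEMALE_20to30",
        "US_MIX_35to45", "US_MALE_30to35", "US_MIX_35to45",
    ]
})

# expand=True returns a DataFrame with one column per piece;
# n=2 stops after two splits, so there are at most three pieces.
df[["region", "gender", "age"]] = df["Campaign name"].str.split("_", n=2, expand=True)
print(df.head())
```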

Upvotes: 1
