rajnish chauhan

Reputation: 73

Parsing string in Pandas

I am working with a DataFrame in which one column has values like this:

field
marketable_email_status_m10
email_availability_status_m11
ending_ar_60_to_89_dpd_m11
email_availability_status_m1

I want to split each string into two columns, as below:

field                          text1                      text2
marketable_email_status_m10    marketable_email_status    m10
email_availability_status_m11  email_availability_status  m11
ending_ar_60_to_89_dpd_m11     ending_ar_60_to_89_dpd     m11
email_availability_status_m1   email_availability_status  m1

I have been able to produce the third column (text2), but I am not sure how to get the second (text1).

Upvotes: 3

Views: 65

Answers (2)

jezrael

Reputation: 862611

Use Series.str.rsplit with n=1 to split on the last _ only:

df[['text1','text2']] = df['field'].str.rsplit('_', n=1, expand=True)
print(df)
                           field                      text1 text2
0    marketable_email_status_m10    marketable_email_status   m10
1  email_availability_status_m11  email_availability_status   m11
2     ending_ar_60_to_89_dpd_m11     ending_ar_60_to_89_dpd   m11
3   email_availability_status_m1  email_availability_status    m1
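
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame construction is an assumption rebuilt from the sample values in the question; only the rsplit line comes from the answer itself.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to be plain strings).
df = pd.DataFrame({
    "field": [
        "marketable_email_status_m10",
        "email_availability_status_m11",
        "ending_ar_60_to_89_dpd_m11",
        "email_availability_status_m1",
    ]
})

# rsplit scans from the right; n=1 limits it to a single split,
# so only the last "_" separates text1 from text2.
df[["text1", "text2"]] = df["field"].str.rsplit("_", n=1, expand=True)
print(df)
```

Note that expand=True is what turns the split pieces into separate columns; without it each row would hold a list.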

Upvotes: 2

RavinderSingh13

Reputation: 133508

With the extract function, please try the following.

df[["text1","text2"]] = df['field'].str.extract(r'^(.*)_(.*)$')

Explanation:

  • Apply the Series.str.extract function (through the .str accessor) to the DataFrame's field column.
  • The regex uses 2 capturing groups, which produce the 2 new columns in the DataFrame, named text1 and text2.
  • The first capturing group is greedy, so it takes everything up to the last _, and the 2nd one takes the rest of the value (as per the OP's requirement).
  • The values of the capturing groups are saved into the columns named text1 and text2.

Output will be as follows:

    field                           text1                       text2
0   marketable_email_status_m10     marketable_email_status     m10
1   email_availability_status_m11   email_availability_status   m11
2   ending_ar_60_to_89_dpd_m11      ending_ar_60_to_89_dpd      m11
3   email_availability_status_m1    email_availability_status   m1
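
As a rough sketch of how the same regex behaves, the variant below uses named groups so that extract labels the columns itself. The named groups, the pd.concat step and the "no-underscore" value are illustrative assumptions, not part of the answer above; that extra value is only there to show what happens when the pattern does not match.

```python
import pandas as pd

# Second value is hypothetical: it contains no "_" so the regex cannot match.
s = pd.Series(["ending_ar_60_to_89_dpd_m11", "no-underscore"], name="field")

# The first group is greedy, so it extends to the last "_" in each value.
# Named groups become the column names of the returned DataFrame.
parts = s.str.extract(r"^(?P<text1>.*)_(?P<text2>.*)$")

# Put the original column next to the extracted ones.
out = pd.concat([s, parts], axis=1)
print(out)
```

Rows that do not match the pattern end up with missing values in both text1 and text2, which may be easier to spot than a partially filled row.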

Upvotes: 3
