Split a dataframe column with regex

Question

I'm trying to split a column that has a specific delimiter like: '|'.

My data looks like this, I have ONLY ONE COLUMN named "ID" that contains those strings that I want to split based on delimiter " | "

ID accountsummary            | Name: Report Suite Totals 

ID activity                  | Name: Activity

I've tried with 2 different approaches:

dataframe_elements_int[['ID', 'Name']] = \
    dataframe_elements_int['ID'].str.rsplit('|', expand=True, n=1)

Which gives me the following error: ValueError: Columns must be same length as key

dataframe_final[['Id','Name']] = \
    dataframe_elements_int['ID'].str.extract('(\w*)\|(\w*)', expand=True)

Which gives me the following error: ValueError: pattern contains no capture groups

Sreekiran A R · Accepted Answer

You can try

df=dataframe_elements_int
#split the column
df['new_ID'], df['Name'] = df['ID'].str.split('|').str
#filtering the Name and ID 
df['Name']=df['Name'].str.extract(r'((?<=Name:).*$)', expand=True)
df['new_ID']=df['new_ID'].str.extract(r'((?<=ID).*$)',expand=True)

Split a dataframe column with regex

Answers (2)

Related Questions