Kevin Nash

Reputation: 1561

Pandas - Rename columns by removing text before delimiter

Below are the column names of my DataFrame:

['user_id', 'Week 36~Sep-05 - Sep-11',
 'Week 35~Aug-29 - Sep-04', 'Week 34~Aug-22 - Aug-28']

I would like to remove the text before the delimiter (~), where it is present in the column label, and get the column names below:

['user_id', 'Sep-05 - Sep-11', 'Aug-29 - Sep-04', 'Aug-22 - Aug-28']

I tried the following, but it failed:

[col.split('~')[1] for col in df.columns]

Error: IndexError: list index out of range

Upvotes: 1

Views: 68

Answers (2)

sayan dasgupta

Reputation: 1082

This happens because user_id contains no ~, so splitting it on ~ returns a single-element list and there is nothing at index 1. Try this:

df.columns = [col.split('~')[1] if col.startswith('Week') else col for col in df.columns]
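
A slightly more general sketch, assuming you want to strip the prefix whenever the delimiter is present rather than relying on the label starting with 'Week'. The empty frame below is made up for illustration and only reuses the column labels from the question:

```python
import pandas as pd

# Hypothetical empty frame carrying the column labels from the question
df = pd.DataFrame(columns=['user_id', 'Week 36~Sep-05 - Sep-11',
                           'Week 35~Aug-29 - Sep-04', 'Week 34~Aug-22 - Aug-28'])

# Split once on '~' only when the delimiter is present; leave other labels alone
df.columns = [col.split('~', 1)[1] if '~' in col else col for col in df.columns]

print(list(df.columns))
# ['user_id', 'Sep-05 - Sep-11', 'Aug-29 - Sep-04', 'Aug-22 - Aug-28']
```

Checking for the delimiter itself means the code still works if a column without the 'Week' prefix happens to contain ~.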

Upvotes: 0

mozway

Reputation: 261015

I would use str.replace here:

df.columns = df.columns.str.replace(r'.*~', '', regex=True)

output:

Index(['user_id', 'Sep-05 - Sep-11', 'Aug-29 - Sep-04', 'Aug-22 - Aug-28'], dtype='object')

input:

Index(['user_id', 'Week 36~Sep-05 - Sep-11', 'Week 35~Aug-29 - Sep-04',
       'Week 34~Aug-22 - Aug-28'],
      dtype='object')
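
One thing to be aware of: the pattern `.*~` is greedy, so if a label ever contained more than one ~, everything up to the last one would be removed. A small sketch of the difference, using made-up labels that are not from the question:

```python
import pandas as pd

# Hypothetical labels: one with two delimiters, one with none
cols = pd.Index(['a~b~c', 'user_id'])

# Greedy: removes everything up to and including the LAST '~'
greedy = cols.str.replace(r'.*~', '', regex=True)

# Non-greedy and anchored at the start: removes only up to the FIRST '~'
lazy = cols.str.replace(r'^.*?~', '', regex=True)

print(list(greedy))  # ['c', 'user_id']
print(list(lazy))    # ['b~c', 'user_id']
```

For the labels in the question there is at most one ~ per column, so both patterns give the same result.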

Your approach would work with -1 indexing, since the last element of the split always exists, even when there is no ~ in the label:

df.columns = [col.split('~')[-1] for col in df.columns]
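
If you would rather not assign to df.columns directly, the same split can probably be passed to rename as a function. This is a minimal sketch that assumes all column labels are strings; the empty frame is hypothetical and only carries the labels from the question:

```python
import pandas as pd

# Hypothetical empty frame carrying the column labels from the question
df = pd.DataFrame(columns=['user_id', 'Week 36~Sep-05 - Sep-11',
                           'Week 35~Aug-29 - Sep-04', 'Week 34~Aug-22 - Aug-28'])

# rename applies the function to every column label and returns a new frame;
# the original df keeps its labels
renamed = df.rename(columns=lambda col: col.split('~')[-1])

print(list(renamed.columns))
# ['user_id', 'Sep-05 - Sep-11', 'Aug-29 - Sep-04', 'Aug-22 - Aug-28']
```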

Upvotes: 2
