Newbie123

Reputation: 21

Splitting the column values based on a delimiter (Pandas)

I have a pandas DataFrame with a column named AA_IDs. In a few rows, the values in this column contain the delimiter "-#". I need to determine three things:

  1. The position of the delimiter
  2. The string before the delimiter
  3. The string after the delimiter

E.g. AFB001 9183Daily-#789876A

The expected answer would be: before the delimiter, AFB001 9183Daily; after the delimiter, 789876A.

Upvotes: 0

Views: 1424

Answers (2)

Akshay Sehgal

Reputation: 19322

Just use the apply function with split:

df['AA_IDs'].apply(lambda x: x.split('-#'))

This should give you a Series with a list for each row, e.g. ['AFB001 9183Daily', '789876A']. Rows that do not contain the delimiter come back as a one-element list.

This is likely to be faster than a regex-based approach, and it is easier to read.
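To cover all three parts of the question (position, text before, text after), here is a minimal sketch using pandas' vectorized string methods instead of apply. The sample DataFrame is an assumption for illustration; only the first value comes from the question, and the second row is a made-up value with no delimiter.

```python
import pandas as pd

# Hypothetical sample data: the first value is from the question,
# the second is invented to show what happens without a delimiter.
df = pd.DataFrame({"AA_IDs": ["AFB001 9183Daily-#789876A", "NoDelimiterHere"]})

# 1. Position of the delimiter (0-based index; -1 where it is absent).
df["pos"] = df["AA_IDs"].str.find("-#")

# 2 and 3. Split on the first occurrence of the delimiter.
# str.partition returns three columns: before, the delimiter itself, after.
parts = df["AA_IDs"].str.partition("-#")
df["before"] = parts[0]
df["after"] = parts[2]

print(df)
```

For rows without the delimiter, "before" holds the whole string and "after" is an empty string, so you may want to filter on pos >= 0 first, depending on what you need.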

Upvotes: 2

kwehmeyer

Reputation: 63

Let's say the DataFrame is called df and the column with the text is A. You can use:

import re  # Optional here: str.extract accepts the pattern as a plain string

pattern = r'<your regex>'

df['one'] = df.A.str.extract(pattern)

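This creates a new column containing the extracted text. You just need to write a regex that captures what you want from your string(s); note that the pattern must contain at least one capture group, because str.extract returns the text matched by the group(s). I highly recommend regex101 for building and testing your regex.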
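As a rough illustration of what such a pattern could look like for the "-#" delimiter from the question, here is a sketch. The pattern, the group names before and after, and the sample data are all assumptions for the example, not something the question specifies.

```python
import pandas as pd

# Hypothetical sample data; the second row has no delimiter.
df = pd.DataFrame({"A": ["AFB001 9183Daily-#789876A", "NoDelimiterHere"]})

# Illustrative pattern (an assumption): a non-greedy group up to the
# first "-#", then a second group with everything after it.
pattern = r"^(?P<before>.*?)-#(?P<after>.*)$"

# With named groups, str.extract returns a DataFrame whose columns
# are named after the groups; rows that do not match get NaN.
extracted = df["A"].str.extract(pattern)
df = df.join(extracted)

print(df)
```

Named groups are used here only so the resulting columns get readable names; unnamed groups would work too and produce columns 0 and 1.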

Hope this helps!

Upvotes: 0
