Reputation: 13
I am looking to do a Regex conditional search.
If a carriage return (\r) is followed by an uppercase letter, I want to remove the carriage return and put a space (' ') in its place; if it is followed by anything else, I just want to remove it. Is there a way to do that using regex in Python?
Sample Input:
BCP-\rEngin\reerin\rg\rSyste\rms\rSupp\rort
Output:
BCP- Engineering Systems Support
The data is in a pandas DataFrame. I am currently using the df.replace() function to replace every "\r" with a space (" "), but I would like the replacement to be conditional.
Below is my code:
df_replace = df.replace(to_replace=r"\r", value = " ", regex=True)
Upvotes: 0
Views: 213
Reputation: 2945
I am not familiar with Python, but the regex you will need is as follows (perhaps someone with Python experience can edit to customize this code):
This matches every \r that is not followed by an uppercase letter ((?!...) is a negative lookahead), so replace it with an empty string:
\\r(?![A-Z])
This matches every \r that is not followed by a lowercase letter, so replace it with a space:
\\r(?![a-z])
EDIT
Okay, here's one solution in Python I was able to put together for you:
import re
myString = "BCP-\rEngin\reerin\rg\rSyste\rms\rSupp\rort"
myString = re.sub("\\r(?![A-Z])", "", myString)
myString = myString.replace("\r", " ") # This can be simple string replace
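The same result can also be sketched with a positive lookahead, which states the condition directly ("followed by an uppercase letter") instead of negating it. This is a minimal sketch, not a tuned solution; the function name join_wrapped is made up for illustration.

```python
import re

def join_wrapped(text: str) -> str:
    # Hypothetical helper: a carriage return before an uppercase letter
    # becomes a space, any other carriage return is dropped.
    # (?=[A-Z]) is a positive lookahead: it checks the next character
    # without consuming it, so the letter itself is left untouched.
    text = re.sub(r"\r(?=[A-Z])", " ", text)
    # Whatever carriage returns remain were not followed by an uppercase letter.
    return text.replace("\r", "")

sample = "BCP-\rEngin\reerin\rg\rSyste\rms\rSupp\rort"
print(join_wrapped(sample))  # BCP- Engineering Systems Support
```

Doing the "space" substitution first and the plain removal second means each step describes exactly one of the two cases from the question.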
Upvotes: 2
Reputation: 13
I was able to get a solution for this:
df_replace2 = df.replace(to_replace = r"(\r)(?![A-Z])", value = "", regex=True)
df_replace3 = df_replace2.replace(to_replace = r"(\r)(?![a-z])", value = " ", regex=True)
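For anyone who wants to try the two-step replacement on a small frame, here is a minimal sketch. It assumes pandas is installed; the column name "team" and the second row are made up for illustration, and the patterns use positive lookahead, which should give the same result as the two calls above on the sample input.

```python
import pandas as pd

# Hypothetical one-column frame built from the sample input in the question.
df = pd.DataFrame({"team": ["BCP-\rEngin\reerin\rg\rSyste\rms\rSupp\rort", "Ops"]})

# Step 1: a carriage return directly before an uppercase letter becomes a space.
step1 = df.replace(to_replace=r"\r(?=[A-Z])", value=" ", regex=True)
# Step 2: every remaining carriage return is dropped.
cleaned = step1.replace(to_replace=r"\r", value="", regex=True)

print(cleaned["team"].tolist())
```

With regex=True, df.replace applies the pattern as a substitution inside each string cell, so cells without a carriage return are left as they are.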
Thanks @Brigadeiro for guiding me to the solution.
Upvotes: 0