Dreams

Reputation: 6122

regex - Substitute a string till you encounter second capital letter

I have a string that may contain multiple substrings of the form "WORLD/XyzRights" or "WORLD/abcNext": that is, "WORLD", then "/", then a word, followed immediately by another word that starts with a capital letter. I want to replace these substrings with "Rights" and "Next" respectively.

In other words, the expected output removes everything from "WORLD/" up to, but not including, the next capital letter. The letter immediately after the "/" may itself be a capital, and it should be removed as well. So the two cases above become "Rights" and "Next".

I tried this:

re.sub("""WORLD\/[A-Za-z]+(.*?)[^A-Z]""", " ", completeText, flags=re.S)

But this also removes "Rights" and "Next", and keeps only the rest of the string.

Upvotes: 1

Views: 259

Answers (3)

Sameer Mahajan

Reputation: 598

Assuming there is always at least one letter between the '/' and the word you want to keep, you can try the following regular expression:

WORLD/[a-zA-Z][^A-Z]*

It matches the part to be removed, and it works for both of your examples.
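
As a quick check, here is a minimal sketch that applies this pattern with re.sub and an empty replacement. The variable names are only illustrative. Note that [^A-Z]* also consumes spaces and punctuation, so it assumes the word to keep follows directly after the lowercase letters.

```python
import re

# "WORLD/", one letter of either case, then everything up to
# (but not including) the next capital letter.
pattern = re.compile(r"WORLD/[a-zA-Z][^A-Z]*")

samples = ["WORLD/XyzRights", "WORLD/abcNext"]

# Replace the matched prefix with an empty string.
results = [pattern.sub("", s) for s in samples]
print(results)  # ['Rights', 'Next']
```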

Upvotes: 1

Tim Biegeleisen

Reputation: 522064

I would use the following pattern for replacement:

WORLD/.*([A-Z].*)

Then replace the match with the captured group \1. The pattern greedily matches and consumes everything after the slash up to the last capital letter, which is the start of the word we want to keep. It then captures that final word so it can be used in the replacement.

re.sub("""WORLD/.*([A-Z].*)""", r"\1", "WORLD/XyzRights", flags=re.S)
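
One thing worth checking is how the greedy .* behaves when the text contains several occurrences, as the question mentions. The sketch below uses a made-up two-occurrence string (an assumption, not from the question). The lazy variant at the end is also an assumption added for comparison, not part of the pattern above.

```python
import re

pattern = r"WORLD/.*([A-Z].*)"

# Single occurrence: the greedy .* backs off to the last capital letter.
single = re.sub(pattern, r"\1", "WORLD/XyzRights", flags=re.S)
print(single)  # Rights

# Hypothetical input with two occurrences (not from the question).
text = "WORLD/XyzRights and WORLD/abcNext"

# The greedy .* runs to the last capital letter in the whole string,
# so everything before "Next" is consumed by one match.
greedy = re.sub(pattern, r"\1", text, flags=re.S)
print(greedy)  # Next

# Assumed variant: a lazy quantifier stops at the first capital letter
# after at least one character, so each occurrence is handled separately.
lazy = re.sub(r"WORLD/.+?([A-Z])", r"\1", text)
print(lazy)  # Rights and Next
```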

Upvotes: 1

Avinash Raj

Reputation: 174786

Just add an optional pattern to match the first capital letter, which may appear right after the /.

>>> import re
>>> s = ["WORLD/XyzRights", "WORLD/abcNext"]
>>> [re.sub(r'WORLD/[A-Z]?[a-z]+([A-Z])', r'\1', i) for i in s]
['Rights', 'Next']
>>> 
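
Since the question mentions multiple occurrences inside one string, here is a small sketch of the same pattern applied to a longer text. The sample sentence is made up for illustration. Because the pattern only allows letters between the / and the captured capital, each occurrence is replaced on its own.

```python
import re

# Hypothetical text containing two occurrences (not from the question).
text = "Copyright WORLD/XyzRights reserved; see WORLD/abcNext for details."

# Optional capital after "/", then lowercase letters, then capture the
# capital letter that starts the word to keep and put it back via \1.
cleaned = re.sub(r"WORLD/[A-Z]?[a-z]+([A-Z])", r"\1", text)
print(cleaned)  # Copyright Rights reserved; see Next for details.
```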

Upvotes: 2
