blah

Reputation: 664

Regex in Python to remove all uppercase characters before a colon

I have a text from which I would like to remove every run of consecutive uppercase characters that is followed by a colon. So far I have only figured out how to remove everything up to the colon itself, which produces the current output shown below.

Input Text

text = 'ABC: This is a text. CDEFG: This is a second text. HIJK: This is a third text'

Desired output:

 'This is a text. This is a second text. This is a third text'

Current code & output:

re.sub(r'^.+[:]', '', text)

# current output (the space after the last colon is left behind)
' This is a third text'

Can this be done with a one-line regex, or do I need to iterate over the characters, check each one with isupper(), and then apply a regex?

Upvotes: 0

Views: 63

Answers (1)

The fourth bird

Reputation: 163457

You can use

\b[A-Z]+:\s*
  • \b A word boundary to prevent a partial match
  • [A-Z]+: Match 1+ uppercase chars A-Z and a :
  • \s* Match optional whitespace chars
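
To see what the word boundary contributes, here is a minimal sketch. The sample strings are made up for illustration and do not come from the question; the second call drops \b only to show the difference.

```python
import re

# The pattern from this answer, compiled once for reuse.
pattern = re.compile(r'\b[A-Z]+:\s*')

# Hypothetical sample strings, not taken from the question.
label_only = pattern.sub('', 'ABC: text')
inside_word = pattern.sub('', 'fooABC: text')
no_boundary = re.sub(r'[A-Z]+:\s*', '', 'fooABC: text')

print(repr(label_only))   # the label and the following space are removed
print(repr(inside_word))  # unchanged: \b prevents a match that starts mid-word
print(repr(no_boundary))  # without \b the uppercase tail of the word is removed
```

Note that [A-Z] covers only ASCII uppercase letters, so labels containing digits, underscores, or accented characters would need a wider character class.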


import re

text = 'ABC: This is a text. CDEFG: This is a second text. HIJK: This is a third text'
print(re.sub(r'\b[A-Z]+:\s*', '', text))

Output

This is a text. This is a second text. This is a third text
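
As for why the attempt in the question keeps only the last sentence, the following sketch compares it with a lazy variant and with the pattern above. The variable names are only for illustration.

```python
import re

text = 'ABC: This is a text. CDEFG: This is a second text. HIJK: This is a third text'

# Greedy: .+ consumes the whole string and backtracks to the LAST colon,
# so everything before the final sentence is removed in a single match.
greedy = re.sub(r'^.+[:]', '', text)

# Lazy: .+? stops at the FIRST colon, but ^ only matches at the start of
# the string, so just one label is removed.
lazy = re.sub(r'^.+?:', '', text)

# No anchor: every uppercase label followed by a colon is matched separately.
fixed = re.sub(r'\b[A-Z]+:\s*', '', text)

print(repr(greedy))
print(repr(lazy))
print(repr(fixed))
```

Dropping the ^ anchor and restricting the match to [A-Z]+ is what lets re.sub remove each label on its own instead of one long span.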

Upvotes: 1
