Regex in Python to remove all uppercase characters before a colon

Question

I have a text from which I would like to remove every run of consecutive uppercase characters up to and including a colon. I have only figured out how to remove all characters up to the colon itself, which results in the current output shown below.

Input Text

text = 'ABC: This is a text. CDEFG: This is a second text. HIJK: This is a third text'

Desired output:

 'This is a text. This is a second text. This is a third text'

Current code & output:

re.sub(r'^.+[:]', '', text)

# current output (note the leading space left after the last colon)
' This is a third text'

Can this be done with a one-liner regex, or do I need to iterate over every character, check it with isupper(), and then apply a regex?

The fourth bird · Accepted Answer

You can use

\b[A-Z]+:\s*

\b A word boundary to prevent a partial match
[A-Z]+: Match 1+ uppercase chars A-Z and a :
\s* Match optional whitespace chars


import re

text = 'ABC: This is a text. CDEFG: This is a second text. HIJK: This is a third text'
print(re.sub(r'\b[A-Z]+:\s*', '', text))

Output

This is a text. This is a second text. This is a third text
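
For comparison, here is a small sketch of why the attempt in the question keeps only the last sentence: `.+` is greedy, so `^.+[:]` consumes everything up to the last colon in the string. It also illustrates what `\b` contributes. The string `xABC: test` is a made-up sample used only for illustration, not part of the original question.

```python
import re

text = 'ABC: This is a text. CDEFG: This is a second text. HIJK: This is a third text'

# Greedy .+ runs to the LAST colon, so everything before it is removed.
greedy = re.sub(r'^.+[:]', '', text)

# The targeted pattern removes only each uppercase label, its colon,
# and any whitespace that follows.
cleaned = re.sub(r'\b[A-Z]+:\s*', '', text)

# Hypothetical sample: without \b, the "ABC: " tail of "xABC: " is stripped
# even though it is only part of a longer word.
sample = 'xABC: test'
with_boundary = re.sub(r'\b[A-Z]+:\s*', '', sample)
without_boundary = re.sub(r'[A-Z]+:\s*', '', sample)

print(repr(greedy))
print(repr(cleaned))
print(repr(with_boundary))
print(repr(without_boundary))
```

If the labels can contain digits or non-ASCII uppercase letters, the character class `[A-Z]` would need to be widened to match them.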
