Marley

Reputation: 147

Regex: remove string that is not prefixed with a specific character

Using a regular expression, is there a way to remove a substring only when it is not preceded by a specific character?

For example (and more specifically), in the string below I'd like to remove only the line breaks that don't immediately follow a colon:

Initial string: "key:\\n value\\n here\\n"

Desired output string: "key:\\n value here"

I've tried re.sub(r"[^:]\\n", "", "key:\\n value\\n here\\n"), but this does not give the desired result. It also removes the character before each line break and returns "key:\\n valu her".

Any assistance would be appreciated.

Upvotes: 0

Views: 127

Answers (1)

orlp

Reputation: 117741

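What you want is called a negative lookbehind assertion. In Python's re module it is written (?<!...), where ... is the pattern that must not appear immediately before the current position. Your [^:] attempt fails because a character class consumes the character it matches, so that character is deleted along with the line break; a lookbehind only checks the preceding character and leaves it in place.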

>>> import re
>>> s = "key:\\n value\\n here\\n"
>>> re.sub(r"(?<!:)\\n", "", s)
'key:\\n value here'
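
To make the difference concrete, here is a small side-by-side sketch of the consuming character class and the lookbehind. It takes the sample string exactly as written in the question, where "\\n" in Python source is a backslash followed by the letter n. The last case is an assumption on my part: if your data contains real line breaks instead, the same idea should work with a single-backslash pattern. The variable names are only for illustration.

```python
import re

# The sample string as written in the question. In Python source, "\\n" is a
# backslash followed by the letter n, not a real line break.
s = "key:\\n value\\n here\\n"

# Character class: [^:] consumes the character before the match target,
# so that character is deleted as well.
consuming = re.sub(r"[^:]\\n", "", s)

# Negative lookbehind: checks the previous character without consuming it.
lookbehind = re.sub(r"(?<!:)\\n", "", s)

# Assumption: the data holds real line breaks instead of backslash + n.
# The pattern then uses a single backslash.
real = "key:\n value\n here\n"
real_result = re.sub(r"(?<!:)\n", "", real)

print(repr(consuming))    # 'key:\\n valu her'
print(repr(lookbehind))   # 'key:\\n value here'
print(repr(real_result))  # 'key:\n value here'
```

Note that a lookbehind in re must have a fixed width, which a single character like : satisfies.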

Upvotes: 2
