aldorado
aldorado

Reputation: 4844

How to match carriage return with no linefeed in python

I am trying to use regular expressions in python to find carriage returns \r in a string which have no following linefeed \n, thus being a likely error.

However, my regex allways matches, and I do not know why:

>>> import re
>>> cr = re.compile("\r[^\n]", re.X)
>>> rr= rege.search("bla")
>>> rr
<_sre.SRE_Match object at 0x0000000003793AC0>

How would the correct syntax look like?

Upvotes: 2

Views: 2778

Answers (2)

user2357112
user2357112

Reputation: 280887

You're using verbose mode (re.X) and a non-raw string. That means your regex has a literal carriage return and line feed in it. As these are whitespace characters and you're in verbose mode, the carriage return outside a character class is completely ignored. Your regex is effectively r'[^\n]', matching any non-line feed character.

Use a raw string. As others suggested, a negative lookahead would also be better for a "not followed by" assertion:

r'\r(?!\n)'

Upvotes: 1

Jan
Jan

Reputation: 43169

You can use a negative lookahead:

\r(?!\n)

In Python:

import re
rx = r'\r(?!\n)'
for match in re.finditer(rx, string):
    # do sth. here

Upvotes: 1

Related Questions