Reputation: 4844
I am trying to use regular expressions in python to find carriage returns \r in a string which have no following linefeed \n, thus being a likely error.
However, my regex allways matches, and I do not know why:
>>> import re
>>> cr = re.compile("\r[^\n]", re.X)
>>> rr= rege.search("bla")
>>> rr
<_sre.SRE_Match object at 0x0000000003793AC0>
How would the correct syntax look like?
Upvotes: 2
Views: 2778
Reputation: 280887
You're using verbose mode (re.X
) and a non-raw string. That means your regex has a literal carriage return and line feed in it. As these are whitespace characters and you're in verbose mode, the carriage return outside a character class is completely ignored. Your regex is effectively r'[^\n]'
, matching any non-line feed character.
Use a raw string. As others suggested, a negative lookahead would also be better for a "not followed by" assertion:
r'\r(?!\n)'
Upvotes: 1
Reputation: 43169
You can use a negative lookahead:
\r(?!\n)
In Python
:
import re
rx = r'\r(?!\n)'
for match in re.finditer(rx, string):
# do sth. here
Upvotes: 1