Exorcismus

Reputation: 2482

Regex works in Notepad++ but not in Python

I am trying the regex (WVDC)((?:.*\r\n){1}) in Notepad++ and it works, but when I use the same pattern in Python it does not match.

The text is:

Above 85°C the rated (DC/AC) voltage must be derated at per 1.5%/2.5%°C
WVDC: 400 Volts DC
SVDC: 600 Volts DC

Python code:

re.search(r'(WVDC)((?:.*\r\n){1})',txt)

Upvotes: 1

Views: 496

Answers (2)

Mark Tolonen

Reputation: 177600

You haven't shown a reproducible example, but opening a file in text mode in Python converts \r\n to \n by default, so the string you search no longer contains any \r. Notepad++ keeps the exact line endings.

Removing \r (or making it optional) from the regex should fix the problem in Python. You could also open the file in binary mode, but processing text in text mode is recommended.
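To make the difference concrete, here is a minimal sketch, assuming the data comes from a file with Windows line endings. The sample bytes and the helper name read_both_ways are made up for illustration. It reads the same file once in default text mode and once with newline="" (which turns off newline translation), then tries the original pattern and a variant with an optional \r.

```python
import os
import re
import tempfile

# Hypothetical sample with Windows (\r\n) line endings.
SAMPLE = b"WVDC: 400 Volts DC\r\nSVDC: 600 Volts DC\r\n"


def read_both_ways(raw: bytes):
    """Write raw bytes to a temp file, then read it back in two text modes."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(raw)
        path = tmp.name
    try:
        # Default text mode: \r\n is translated to \n on read.
        with open(path, "r", encoding="utf-8") as f:
            translated = f.read()
        # newline="" disables translation, so \r\n is preserved.
        with open(path, "r", encoding="utf-8", newline="") as f:
            untranslated = f.read()
    finally:
        os.remove(path)
    return translated, untranslated


translated, untranslated = read_both_ways(SAMPLE)

strict = r'(WVDC)((?:.*\r\n){1})'      # pattern from the question
tolerant = r'(WVDC)((?:.*\r?\n){1})'   # same pattern with \r made optional

print(re.search(strict, translated))      # no \r left in the text, so no match
print(re.search(strict, untranslated))    # \r\n preserved, so it matches
print(re.search(tolerant, translated))    # matches with either line ending
```

If the text does not come from a file at all (for example, a string literal in the script), the same reasoning applies: check what line endings the string really contains with repr(txt) before blaming the regex.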

Upvotes: 2

Tim Biegeleisen

Reputation: 521073

The following script works for me in Python:

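import re

# Named "text" rather than "input" to avoid shadowing the built-in input().
text = """Above 85°C the rated (DC/AC) voltage must be derated at per 1.5%/2.5%°C
WVDC: 400 Volts DC
SVDC: 600 Volts DC"""

# \r? makes the carriage return optional, so both \n and \r\n endings match.
result = re.findall(r'(WVDC).*\r?\n', text)
print(result)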

['WVDC']

Note that the only substantial change I made to the regex pattern was to make the carriage return \r optional. It seems that multiline string literals in Python, and perhaps whatever your source uses, contain only newlines and no carriage returns. In any case, using \r?\n to match line endings is generally a good idea, because it covers both Unix and Windows line endings at the same time.
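One caveat when capturing the rest of the line with \r?\n: in Python, . matches \r (it only excludes \n by default), so a greedy .* swallows the carriage return before the optional \r gets a chance to match. The sketch below uses two made-up sample strings and a hypothetical helper rest_of_line to show the effect, and a character class that keeps \r out of the capture.

```python
import re

# Hypothetical samples: same content, Unix vs Windows line endings.
unix_text = "WVDC: 400 Volts DC\nSVDC: 600 Volts DC\n"
windows_text = "WVDC: 400 Volts DC\r\nSVDC: 600 Volts DC\r\n"

# "." matches \r, so the greedy group can absorb the carriage return.
dot_pattern = re.compile(r'(WVDC)(.*)\r?\n')
# [^\r\n]* stops before either line-ending character.
strict_pattern = re.compile(r'(WVDC)([^\r\n]*)\r?\n')


def rest_of_line(pattern, text):
    """Return the second capture group, or None if there is no match."""
    match = pattern.search(text)
    return match.group(2) if match else None


print(repr(rest_of_line(dot_pattern, unix_text)))       # ': 400 Volts DC'
print(repr(rest_of_line(dot_pattern, windows_text)))    # ': 400 Volts DC\r'
print(repr(rest_of_line(strict_pattern, windows_text))) # ': 400 Volts DC'
```

Both patterns match in either case; the difference only matters if the captured text is used afterwards, where a stray trailing \r can cause confusing comparisons.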

Upvotes: 4
