user2879704

Python regex, pattern-match multiple backslash characters

I have a Python raw string that contains five backslash characters followed by a double quote. I am trying to match it with Python's re module.

The output should be the matched text, plus the two characters before and after it.

import re
command = r'abc\\\\\"abc'
search_string = '.{2}\\\\\\\\\\".{2}'
pattern = re.compile(search_string)
ts_name = pattern.findall(command)
print(ts_name)

The output is:

['\\\\\\\\"ab']

I expected:

['bc\\\\\"ab']

Anomalies:

1) The two extra characters at the front (bc) are missing.

2) It prints eight backslashes even though the input string contains only five.

Upvotes: 3

Answers (2)

anubhava

Reputation: 784968

You can shorten your regex and use the search function to get your output:

command = r'abc\\\\\"abc'
search_string = r'.{2}(?:\\){5}".{2}'
print(re.compile(search_string).search(command).group())

Output:

bc\\\\\"ab

Your regex should also use the r prefix, so that Python does not process the backslashes before re sees them.
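
To see where the two anomalies in the question come from, here is a small sketch (an illustration, not part of the original answer). It compares the question's non-raw pattern with the raw one; the names non_raw, raw and hits are made up for the example. It also shows that printing a list displays each element's repr, which doubles every backslash, so a match with four real backslashes appears to have eight.

```python
import re

command = r'abc\\\\\"abc'           # a, b, c, five backslashes, a quote, a, b, c

non_raw = '.{2}\\\\\\\\\\".{2}'     # ten backslashes in source collapse to five
raw = r'.{2}(?:\\){5}".{2}'         # raw: re receives the text exactly as typed

# The regex engine reads the five remaining backslashes as \\ \\ \" :
# two literal backslashes followed by a quote.
print(non_raw.count('\\'))          # 5

hits = re.findall(non_raw, command)
print(hits)                         # list display uses repr, doubling backslashes
print(hits[0])                      # the text that was actually matched
print(hits[0].count('\\'))          # 4

print(re.search(raw, command).group())
```

With the non-raw pattern, .{2} consumes two of the five backslashes, which is why the leading bc never appears in the result.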

Upvotes: 3

Lawrence Benson

Reputation: 1406

Just add a capturing group around the part you want in the pattern:

search_string = r'a(bc\\{5}"ab)c'

and access it on the match object returned by re.search(search_string, command):

match.group(1)
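
As a minimal end-to-end sketch of the capturing-group idea, assuming the group is meant to go in the regex rather than in the input string, the pattern below wraps the wanted part in parentheses. The pattern and the names search_string and match are chosen for illustration only.

```python
import re

command = r'abc\\\\\"abc'

# Parentheses mark the part to extract. The pattern still has to match the
# surrounding 'a' and 'c', but group(1) returns only the text inside the group.
search_string = r'a(bc\\{5}"ab)c'

match = re.search(search_string, command)
if match:
    print(match.group(1))   # bc, five backslashes, a quote, ab
```

Checking the match object before calling group(1) avoids an AttributeError when the pattern does not match.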

Upvotes: 2
