Reputation: 439
Consider the following input:
"aaa"|"bbb"|"123"|"!"\\"|"2010-01-04T00:00:01"
I am trying to write a regex that will capture and replace the double quote character with tilde if...
|
AND
In PHP I am able to get the regex pictured below working...
Due to constraints in Python's re module, the same regex fails with the following error:
re.error: look-behind requires fixed-width pattern
My Python code is as follows:
import re
orig_line = r'"aaa"|"bbb"|"123"|"!"\\"|"2010-01-04T00:00:01"'
new_line = re.sub(pattern='(?<!\||^)\"(?!\||$)',repl='~',string=orig_line)
How can I adjust this regex so it works in Python?
Similar questions exist on SO, but I couldn't find any that address the start/end of line requirement.
Upvotes: 2
Views: 752
Reputation: 36380
I would approach it the following way: since you are interested in a " that is not at the start of the string, you can express that as requiring one non-newline character before it, i.e. using a positive lookbehind, (?<=.):
import re
orig_line = r'"aaa"|"bbb"|"123"|"!"\\"|"2010-01-04T00:00:01"'
new_line = re.sub(pattern=r'(?<=.)(?<!\|)"(?!\||$)', repl='~', string=orig_line)
print(new_line)
output:
"aaa"|"bbb"|"123"|"!~\\"|"2010-01-04T00:00:01"
If you are not limited to the Python standard library, I suggest trying the third-party regex module, which does support variable-length lookbehinds, for example:
import regex as re
text = "a1aa2aaa3aaaa4"
print(re.findall('(?<=a{3,})[0-9]', text))
output:
['3', '4']
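Applied to the input from the question, that might look like the sketch below. It assumes the regex package is installed (pip install regex) and accepts the original lookbehind unchanged; the fallback branch is my addition and simply reuses the standard-library pattern from earlier in this answer, so the snippet still runs without the package.

```python
import re

try:
    import regex  # third-party package, not part of the standard library
except ImportError:
    regex = None

orig_line = r'"aaa"|"bbb"|"123"|"!"\\"|"2010-01-04T00:00:01"'

if regex is not None:
    # regex allows the variable-width lookbehind (?<!\||^) as written
    new_line = regex.sub(r'(?<!\||^)"(?!\||$)', '~', orig_line)
else:
    # Fallback for plain re: fixed-width lookbehinds only
    new_line = re.sub(r'(?<=.)(?<!\|)"(?!\||$)', '~', orig_line)

print(new_line)
```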
Upvotes: 1
Reputation: 626748
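You can use
(?<=[^|])"(?=[^|])
The (?<=[^|]) positive lookbehind matches a location that is immediately preceded by any char other than |, and since it requires a char to be there, it cannot match at the start of the string.
Likewise, the (?=[^|]) positive lookahead requires a following char other than |, so it cannot match at the end of the string.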
See the Python demo:
import re
orig_line = '"aaa"|"bbb"|"123"|"!"\\"|"2010-01-04T00:00:01"'
new_line = re.sub(r'(?<=[^|])"(?=[^|])', '~', orig_line)
print(new_line) # => "aaa"|"bbb"|"123"|"!~\"|"2010-01-04T00:00:01"
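One caveat if the lines come straight from a file: [^|] also matches a newline, so a closing quote followed by "\n" would count as an interior quote. A minimal sketch of a per-line helper that strips the newline first; the names INNER_QUOTE and escape_inner_quotes and the sample lines are made up for illustration.

```python
import re

# Interior quote: a non-pipe character on both sides.
INNER_QUOTE = re.compile(r'(?<=[^|])"(?=[^|])')

def escape_inner_quotes(line: str) -> str:
    """Replace quotes that are neither next to a pipe nor at the line edges."""
    # Drop the trailing newline first; otherwise [^|] matches "\n" and the
    # closing quote of the last field would be replaced too.
    body = line.rstrip("\n")
    return INNER_QUOTE.sub("~", body)

lines = ['"aaa"|"b"b"\n', '"x"|"!"\\"\n']
for line in lines:
    print(escape_inner_quotes(line))
# "aaa"|"b~b"
# "x"|"!~\"
```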
Upvotes: 1