Reputation: 27
I am trying to create a Python regex to capture a file name, but only if the text "external=true" appears within the square brackets after the alleged file name.
I believe I am nearly there, but am missing a specific use-case. Essentially, I want to capture the text between qrcode: and the first [, but only if the text external=true appears between the two square brackets.
I have created the regex qrcode:([^:].*?)\[.*?external=true.*?\], which does not work for the second line below: it incorrectly returns vcard3.txt and does not return vcard4.txt.
qrcode:vcard1.txt[external=true] qrcode:vcard2.txt[xdim=2,ydim=2]
qrcode:vcard3.txt[xdim=2,ydim=2] qrcode:vcard4.txt[xdim=2,ydim=2,external=true]
qrcode:vcard5.txt[xdim=2,ydim=2,external=true,foreground=red,background=white]
qrcode:https://www.github.com[foreground=blue]
https://regex101.com/r/bh3IMb/3
Upvotes: 0
Views: 58
Reputation: 3989
As an alternative you can use
qrcode:([\w\.]+)(?=\[[\w\=,]*external=true[^\]]*)
See the regex demo.
Python demo:
import re
regex = re.compile(r"qrcode:([\w\.]+)(?=\[[\w\=,]*external=true[^\]]*)")
sample = """
qrcode:vcard1.txt[external=true] qrcode:vcard2.txt[xdim=2,ydim=2]
qrcode:vcard3.txt[xdim=2,ydim=2] qrcode:vcard4.txt[xdim=2,ydim=2,external=true]
qrcode:vcard5.txt[xdim=2,ydim=2,external=true,foreground=red,background=white]
qrcode:https://www.github.com[foreground=blue]
"""
print(regex.findall(sample))
Output:
['vcard1.txt', 'vcard4.txt', 'vcard5.txt']
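For context on why the attempt in the question misfires: a lazy .*? is only reluctant, not bounded, so after the [ of vcard3.txt it can still run past the first ] and find external=true inside the brackets of vcard4.txt. A minimal sketch of that effect follows, assuming file names never contain colons, brackets, or whitespace. The names original and bounded are made up for illustration, and the bounded pattern is one possible fix rather than the only one.

```python
import re

sample = """
qrcode:vcard1.txt[external=true] qrcode:vcard2.txt[xdim=2,ydim=2]
qrcode:vcard3.txt[xdim=2,ydim=2] qrcode:vcard4.txt[xdim=2,ydim=2,external=true]
qrcode:vcard5.txt[xdim=2,ydim=2,external=true,foreground=red,background=white]
qrcode:https://www.github.com[foreground=blue]
"""

# Pattern from the question: the lazy .*? after \[ may cross the first ']'.
original = re.compile(r"qrcode:([^:].*?)\[.*?external=true.*?\]")

# Hypothetical variant: negated character classes keep the name and the
# option list inside a single name[...] pair.
bounded = re.compile(r"qrcode:([^:\[\]\s]+)\[[^\]]*external=true[^\]]*\]")

original_hits = original.findall(sample)
bounded_hits = bounded.findall(sample)

print(original_hits)  # ['vcard1.txt', 'vcard3.txt', 'vcard5.txt']
print(bounded_hits)   # ['vcard1.txt', 'vcard4.txt', 'vcard5.txt']
```

Unlike the look-ahead version, the bounded pattern consumes the whole bracketed part, which should not matter for findall here because each match starts at its own qrcode: prefix.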
Upvotes: 1
Reputation: 1158
Use a positive look-behind (for qrcode:) and a positive look-ahead (for [ followed by external=true before the closing ]), with lazy matching to capture the smallest such group.
Regex101 explanation: https://regex101.com/r/bOezIm/1
A complete Python example:
import re
pattern = r"(?<=qrcode:)[^:]*?(?=\[[^\]]*?external=true)"
string = """
qrcode:vcard1.txt[external=true] qrcode:vcard2.txt[xdim=2,ydim=2]
qrcode:vcard3.txt[xdim=2,ydim=2] qrcode:vcard4.txt[xdim=2,ydim=2,external=true]
qrcode:vcard5.txt[xdim=2,ydim=2,external=true,foreground=red,background=white]
qrcode:https://www.github.com[foreground=blue]
"""
print(re.findall(pattern, string))
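One caveat with this look-around pattern: [^:]*? excludes colons but not brackets or spaces, so on input where no colon separates two bracketed items it can stretch across the first pair. The sketch below shows this on a made-up edge case (the edge_case string and the tight pattern are assumptions for illustration, not part of the question's data) and compares it with a tighter character class.

```python
import re

sample = """
qrcode:vcard1.txt[external=true] qrcode:vcard2.txt[xdim=2,ydim=2]
qrcode:vcard3.txt[xdim=2,ydim=2] qrcode:vcard4.txt[xdim=2,ydim=2,external=true]
qrcode:vcard5.txt[xdim=2,ydim=2,external=true,foreground=red,background=white]
qrcode:https://www.github.com[foreground=blue]
"""

# Pattern from this answer.
lookaround = r"(?<=qrcode:)[^:]*?(?=\[[^\]]*?external=true)"

# Hypothetical tighter variant: the name may not contain brackets or spaces.
tight = r"(?<=qrcode:)[^:\[\]\s]+(?=\[[^\]]*external=true)"

# On the sample from the question both patterns agree.
sample_hits = re.findall(lookaround, sample)
tight_sample_hits = re.findall(tight, sample)
print(sample_hits)  # ['vcard1.txt', 'vcard4.txt', 'vcard5.txt']

# Made-up input: the second bracket pair has no qrcode: prefix.
edge_case = "qrcode:a.txt[xdim=2] note[external=true]"
loose_hits = re.findall(lookaround, edge_case)
tight_hits = re.findall(tight, edge_case)
print(loose_hits)  # ['a.txt[xdim=2] note']
print(tight_hits)  # []
```

If the real data always has a qrcode: prefix before every bracketed item, the original pattern behaves the same as the tighter one.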
Upvotes: 1