Reputation: 2903
I can't seem to get a regex to work for the following examples. Basically, I would like to parse four groups from strings such as these:
test.this
test[extra].this
test[extra].this{data}
test.this{data}
I would like to get the answer as such, for the examples above respectively:
val1='test', val2=None, val3='this', val4=None
val1='test', val2='extra', val3='this', val4=None
val1='test', val2='extra', val3='this', val4='data'
val1='test', val2=None, val3='this', val4='data'
I tried this but it's not working:
import re
tests = ["test.this",
         "test[extra].this",
         "test[extra].this{data}",
         "test.this{data}",]
for test in tests:
    m = re.match(r'^([^\[\.]+)(?:\[([^\]]+)])(?:\.([^{]+){)([^}]+)?$', test)
    if m:
        print(test, '->', m[1], m[2], m[3], m[4])
Upvotes: 0
Views: 118
Reputation: 442
import re
tests = ["test.this",
         "test[extra].this",
         "test[extra].this{data}",
         "test.this{data}"]
pat = re.compile(r'(\w+)([\[])?(\w+)?([\]])?\.(\w+){?(\w+)?')
for test in tests:
    x = pat.search(test)
    print(x.group(1), x.group(3), x.group(5), x.group(6))
(\w+) -> captures test
([\[])? -> captures [
(\w+)? -> captures extra
([\]])? -> captures ]
(\w+) -> captures this
(\w+)? -> captures data
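To make the breakdown above concrete, here is a small sketch that prints all six groups for each test string. It shows why the loop above reads groups 1, 3, 5 and 6: groups 2 and 4 hold only the brackets. The helper name show_groups is made up for this illustration.

```python
import re

# Same pattern as in the answer above, unchanged.
pat = re.compile(r'(\w+)([\[])?(\w+)?([\]])?\.(\w+){?(\w+)?')

def show_groups(text):
    """Return all six captured groups; show_groups is a hypothetical helper name."""
    m = pat.search(text)
    return m.groups() if m else None

for test in ["test.this", "test[extra].this", "test[extra].this{data}", "test.this{data}"]:
    # Groups 2 and 4 are the literal '[' and ']' (or None when absent).
    print(test, '->', show_groups(test))
```

Optional groups that did not take part in the match come back as None, which is what gives val2=None and val4=None in the desired output.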
Upvotes: 1
Reputation: 19641
If only the second and fourth groups are optional, you may use:
^([^\[\.]+)(?:\[([^\]]+)])?\.([^{\r\n]+)(?:{([^}\r\n]+)})?$
Note that \r and \n were added to the negated character classes of the third and fourth groups to avoid matching past the end of the line. If you're only using single-line strings, that won't be necessary.
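As a quick check, the pattern can be dropped into the loop from the question. This is only a sketch: the names PATTERN and parse are chosen for the example and are not part of the original code.

```python
import re

# Pattern from this answer: the bracketed and braced parts are optional.
PATTERN = re.compile(r'^([^\[\.]+)(?:\[([^\]]+)])?\.([^{\r\n]+)(?:{([^}\r\n]+)})?$')

def parse(text):
    """Return (val1, val2, val3, val4), or None when the text does not match."""
    m = PATTERN.match(text)
    return m.groups() if m else None

tests = ["test.this",
         "test[extra].this",
         "test[extra].this{data}",
         "test.this{data}"]
for test in tests:
    print(test, '->', parse(test))
```

Because the pattern is anchored with ^ and $, malformed input such as an unclosed bracket gives no match at all, so parse returns None instead of a partial result.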
Upvotes: 2