user1179317

Reputation: 2903

Use regex to parse string into groups

I can't seem to get a regex to work for the following example. Basically, I would like to parse four groups from strings such as these:

test.this
test[extra].this
test[extra].this{data}
test.this{data}

For the examples above, I would like to get the following results, respectively:

val1='test', val2=None, val3='this', val4=None
val1='test', val2='extra', val3='this', val4=None
val1='test', val2='extra', val3='this', val4='data'
val1='test', val2=None, val3='this', val4='data'

I tried this but it's not working:

import re

tests = ["test.this",
         "test[extra].this",
         "test[extra].this{data}",
         "test.this{data}",]

for test in tests:
    m = re.match(r'^([^\[\.]+)(?:\[([^\]]+)])(?:\.([^{]+){)([^}]+)?$', test)
    if m:
        print(test, '->', m[1], m[2], m[3], m[4])

Upvotes: 0

Views: 118

Answers (2)

Karan Raj

Reputation: 442

import re

tests = ["test.this",
         "test[extra].this",
         "test[extra].this{data}",
         "test.this{data}"]

# Groups 2 and 4 capture only the literal brackets, so the values of
# interest are groups 1, 3, 5 and 6.
pat = re.compile(r'(\w+)([\[])?(\w+)?([\]])?\.(\w+){?(\w+)?')
for test in tests:
    x = pat.search(test)
    print(x.group(1), x.group(3), x.group(5), x.group(6))

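(\w+) -> group 1, captures test

([\[])? -> group 2, optionally captures [

(\w+)? -> group 3, optionally captures extra

([\]])? -> group 4, optionally captures ]

\.(\w+) -> group 5, captures this after the literal dot

{?(\w+)? -> group 6, optionally captures data after an optional {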
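
To make the group numbering concrete, here is a small sketch that prints every group the pattern above produces. The second pattern, `compact`, and the helper names `all_groups` and `four_groups` are my own assumptions and not part of this answer: it wraps the brackets and braces in non-capturing groups so that only four groups remain, and uses `fullmatch` so the closing `]` and `}` are required.

```python
import re

# Pattern copied from the answer above: six capturing groups.
pat = re.compile(r'(\w+)([\[])?(\w+)?([\]])?\.(\w+){?(\w+)?')

def all_groups(text):
    """Return all six groups of the answer's pattern, or None on no match."""
    m = pat.search(text)
    return m.groups() if m else None

# Hypothetical variant (an assumption, not from the answer): the literal
# brackets and braces live in non-capturing groups, leaving four groups.
compact = re.compile(r'(\w+)(?:\[(\w+)\])?\.(\w+)(?:\{(\w+)\})?')

def four_groups(text):
    """Return (val1, val2, val3, val4), or None if the whole string does not match."""
    m = compact.fullmatch(text)
    return m.groups() if m else None

print(all_groups("test[extra].this{data}"))
# ('test', '[', 'extra', ']', 'this', 'data')
print(four_groups("test[extra].this{data}"))
# ('test', 'extra', 'this', 'data')
```

Note that both patterns use \w+, so they only accept letters, digits and underscores in each part.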

Upvotes: 1

41686d6564

Reputation: 19641

If only the second and fourth groups are optional, you may use:

^([^\[\.]+)(?:\[([^\]]+)])?\.([^{\r\n]+)(?:{([^}\r\n]+)})?$


Note that \r and \n were added in the negated character classes of the third and fourth groups to avoid going beyond the end of the line. If you're only using single-line strings, that won't be necessary.
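
For reference, here is a minimal Python sketch that applies the pattern above to the four strings from the question. The helper name `parse` is only an illustration. The original attempt fails on all four inputs because its bracket group and the `{` are mandatory and the closing `}` is never consumed before `$`.

```python
import re

# Pattern from this answer: groups 2 and 4 are optional.
PATTERN = re.compile(
    r'^([^\[\.]+)(?:\[([^\]]+)])?\.([^{\r\n]+)(?:{([^}\r\n]+)})?$'
)

def parse(text):
    """Return (val1, val2, val3, val4), or None if text does not match."""
    m = PATTERN.match(text)
    return m.groups() if m else None

tests = ["test.this",
         "test[extra].this",
         "test[extra].this{data}",
         "test.this{data}"]

for test in tests:
    print(test, '->', parse(test))
# test.this -> ('test', None, 'this', None)
# test[extra].this -> ('test', 'extra', 'this', None)
# test[extra].this{data} -> ('test', 'extra', 'this', 'data')
# test.this{data} -> ('test', None, 'this', 'data')
```

Optional groups that do not take part in the match come back as None, which matches the output requested in the question.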

Upvotes: 2
