Reputation: 3
Here is a valid property tree expression (it can be recursive):
rootProperty:(prop1, prop2, subProp1:(prop1,subSubProp1:(prop1,prop2,etc),prop3), prop3, etc)
So in effect a property can have many properties and sub-properties. From this expression I would like to capture the following:
I tried a few approaches but couldn't get the repetitions working recursively, hence seeking help.
Thanks Kannan
Upvotes: 0
Views: 151
Reputation: 2716
This is not a regular language due to recursion (balanced parens), so a regular expression might not be what you need. But assuming you know what you are doing:
([^:(), ]+)(?::\(((?R)?(?:, ?(?R))*)\))?
First we capture the name of the property: one or more characters that are not ':', '(', ')', ',' or a space.
([^:(), ]+)
A property may or may not have a subtree, so the next part is the optional subtree:
(?: <--- do not capture
: <--- literal ':'
\( <--- literal '('
... <--- some stuff inside
\) <--- literal ')'
)? <--- it is optional
The stuff inside captures a list of properties:
( <--- do capture
(?R) <--- recursively match a property
(?: <--- do not capture
, ? <--- comma followed by optional space
(?R) <--- recursively match another property
)* <--- any number of comma separated properties
) <--- end capture
For your example input:
Input:
rootProperty:(prop1, prop2, subProp1:(prop1,subSubProp1:(prop1,prop2,etc),prop3), prop3, etc)
Match 1:
rootProperty:(prop1, prop2, subProp1:(prop1,subSubProp1:(prop1,prop2,etc),prop3), prop3, etc)
Group 1:
rootProperty
Group 2:
prop1, prop2, subProp1:(prop1,subSubProp1:(prop1,prop2,etc),prop3), prop3, etc
You can then apply the same pattern to the second group of each match to capture the properties of that subtree. There should be a way to get the backtracking information so you don't need to do this, but I don't know how.
Input:
prop1, prop2, subProp1:(prop1,subSubProp1:(prop1,prop2,etc),prop3), prop3, etc
Match 1:
prop1
Match 2:
prop2
Match 3:
subProp1:(prop1,subSubProp1:(prop1,prop2,etc),prop3)
Group 1:
subProp1
Group 2:
prop1,subSubProp1:(prop1,prop2,etc),prop3
Match 4:
prop3
Match 5:
etc
Then,
Input:
prop1,subSubProp1:(prop1,prop2,etc),prop3
Match 1:
prop1
Match 2:
subSubProp1:(prop1,prop2,etc)
Group 1:
subSubProp1
Group 2:
prop1,prop2,etc
Match 3:
prop3
And finally:
Input:
prop1,prop2,etc
Match 1:
prop1
Match 2:
prop2
Match 3:
etc
https://regex101.com/r/WAXrFd/2
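Since the input is not a regular language, a small hand-written parser is a possible alternative to the recursive pattern. The following is only a sketch in Python (the question does not name a language, so that choice is an assumption), and `split_top_level` and `parse_tree` are hypothetical helper names. It tracks parenthesis depth to split on top-level commas, which mirrors the level-by-level matches listed above, then recurses into each subtree.

```python
def split_top_level(text):
    """Split one level of a property list into (name, inner) pairs.

    `inner` is the text between the parentheses of a subtree, or None
    when the property has no subtree.
    """
    if not text.strip():
        return []  # an empty subtree such as "a:()" has no properties

    # Cut the text at commas that are not nested inside parentheses.
    items, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced ')' at index %d" % i)
        elif ch == "," and depth == 0:
            items.append(text[start:i])
            start = i + 1
    if depth != 0:
        raise ValueError("unbalanced '('")
    items.append(text[start:])

    # Separate each item into its name and its optional ":( ... )" subtree.
    result = []
    for item in items:
        name, sep, rest = item.strip().partition(":")
        name = name.strip()
        if not name:
            raise ValueError("empty property name")
        if not sep:
            result.append((name, None))
            continue
        rest = rest.strip()
        if not (rest.startswith("(") and rest.endswith(")")):
            raise ValueError("expected '(...)' after ':' in %r" % item)
        result.append((name, rest[1:-1]))
    return result


def parse_tree(text):
    """Return a nested list of (name, children); leaves get an empty list."""
    return [
        (name, parse_tree(inner) if inner is not None else [])
        for name, inner in split_top_level(text)
    ]


if __name__ == "__main__":
    expr = ("rootProperty:(prop1, prop2, "
            "subProp1:(prop1,subSubProp1:(prop1,prop2,etc),prop3), prop3, etc)")
    for name, inner in split_top_level(expr):
        print(name, "->", inner)
    print(parse_tree(expr))
```

Calling `split_top_level` on the text of Group 2 plays the role of re-running the pattern on that group, while `parse_tree` does the whole descent in one call. The sketch does not restrict which characters may appear in a name, so it is more permissive than the pattern above.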
Upvotes: 1