Reputation: 2866
I have text like this:
This is {name1:value1}{name2:{name3:even dipper {name4:valu4} dipper} some inner text} text
I want to parse it into data like this:
Name: name1
Value: value1
Name: name2
Value: {name3:even dipper {name4:valu4} dipper} some inner text
I would then recursively process each value to parse out the nested fields. Can you recommend a regular expression to do this?
Upvotes: 0
Views: 2145
Reputation: 33908
In C# you can use balancing groups to count and balance the brackets:
{ (?'name' \w+ ) : # start of tag
(?'value' # named capture
(?> # don't backtrack
(?:
[^{}]+ # not brackets
| (?'open' { ) # count opening bracket
| (?'close-open' } ) # subtract closing bracket (matches only if open count > 0)
)*
)
(?(open)(?!)) # make sure open is not > 0
)
} # end of tag
using System;
using System.Text.RegularExpressions;

string re = @"(?x) # enable eXtended mode (comments/spaces ignored)
{ (?'name' \w+ ) : # start of tag
(?'value' # named capture
(?> # don't backtrack
(?:
[^{}]+ # not brackets
| (?'open' { ) # count opening bracket
| (?'close-open' } ) # subtract closing bracket (matches only if open count > 0)
)*
)
(?(open)(?!)) # make sure open is not > 0
)
} # end of tag
";
string str = @"This is {name1:value1}{name2:{name3:even dipper {name4:valu4} dipper} some inner text} text";
foreach (Match m in Regex.Matches(str, re))
{
Console.WriteLine("name: {0}, value: {1}", m.Groups["name"], m.Groups["value"]);
}
Output:
name: name1, value: value1
name: name2, value: {name3:even dipper {name4:valu4} dipper} some inner text
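The question mentions recursively processing each value to pull out the nested fields. A minimal sketch of that step, reusing the balancing-group pattern from this answer (written on one line with the braces escaped). The names Field, TagParser, Parse and Print are hypothetical helpers introduced for illustration, not part of the original code:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical container for one parsed tag and the tags nested in its value.
public sealed class Field
{
    public string Name = "";
    public string Value = "";
    public List<Field> Children = new List<Field>();
}

public static class TagParser
{
    // Same balancing-group pattern as above, without extended mode.
    static readonly Regex Tag = new Regex(
        @"\{(?'name'\w+):(?'value'(?>(?:[^{}]+|(?'open'\{)|(?'close-open'\}))*)(?(open)(?!)))\}");

    // Finds every top-level tag in text, then parses each value the same way.
    public static List<Field> Parse(string text)
    {
        var fields = new List<Field>();
        foreach (Match m in Tag.Matches(text))
        {
            string value = m.Groups["value"].Value;
            fields.Add(new Field
            {
                Name = m.Groups["name"].Value,
                Value = value,
                Children = Parse(value) // recursion ends when a value contains no tags
            });
        }
        return fields;
    }

    // Prints the tree, indenting two spaces per nesting level.
    public static void Print(List<Field> fields, int depth = 0)
    {
        foreach (Field f in fields)
        {
            Console.WriteLine(new string(' ', depth * 2) + f.Name + " = " + f.Value);
            Print(f.Children, depth + 1);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        string str = @"This is {name1:value1}{name2:{name3:even dipper {name4:valu4} dipper} some inner text} text";
        TagParser.Print(TagParser.Parse(str));
    }
}
```

For the sample string this should list name1 and name2 at the top level, name3 indented under name2, and name4 indented under name3. Text outside any tag (such as "This is" or "some inner text") is ignored by this sketch; it would need extra handling if that text matters.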
Upvotes: 3
Reputation: 33908
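In Perl, PHP, or any other PCRE-based engine this is simpler, because a pattern can recurse into itself with (?R). With the x flag enabled (the pattern below contains whitespace and comments), you can use an expression like: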
{(\w+): # start of tag
((?:
[^{}]+ # not a tag
| (?R) # a tag (recurse to match the whole regex)
)*)
} # end of tag
Upvotes: 2