Regular expression for parsing

Question

I'm looking for a regex to transform something like

{test}hello world{/test} and {again}i'm coming back{/again} into hello world i'm coming back.

I tried {[^}]+} but with this regex I can't get only the content inside the test and again tags. Is there a way to complete this regex?

Martin Ender · Accepted Answer

Doing this properly is generally beyond the capabilities of regular expressions. However, if you can guarantee that those tags will never be nested and your input will never contain curly brackets that do not signify tags, then this regex could do the matching:

\{([^}]+)}(.*?)\{/\1}

Explanation:

\{        # a literal {
(         # capture the tag name
[^}]+)    # everything until the end of the tag (you already had this)
}         # a literal }
(         # capture the tag's value
.*?)      # any characters, but as few as possible to complete the match
          # note that the ? makes the repetition ungreedy, which is important if
          # you have the same tag twice or more in a string
\{/       # a literal { followed by a literal /
\1        # use the tag's name again (capture no. 1)
}         # a literal }

So this uses a backreference \1 to make sure that the closing tag contains the same word as the opening tag. Then you will find the tag's name in capture 1 and the tag's value/content in capture 2. From here you can do with these whatever you want (for instance, put the values back together).
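
As a minimal sketch of that last step, here is how the pattern could be used in Python. The choice of language is an assumption, since the question does not name one, and the helper names strip_tags and list_tags are hypothetical. re.sub replaces each tag pair with its content (capture 2), and re.findall lists the name/content pairs:

```python
import re

# Pattern from the answer: group 1 = tag name, group 2 = tag content.
TAG_PATTERN = re.compile(r"\{([^}]+)}(.*?)\{/\1}")

def strip_tags(text):
    """Replace every {name}...{/name} pair with just its content."""
    return TAG_PATTERN.sub(r"\2", text)

def list_tags(text):
    """Return (name, content) tuples for every tag pair found."""
    return TAG_PATTERN.findall(text)

sample = "{test}hello world{/test} and {again}i'm coming back{/again}"
print(strip_tags(sample))  # hello world and i'm coming back
print(list_tags(sample))   # [('test', 'hello world'), ('again', "i'm coming back")]
```

Text outside the tags (here the word "and") is left untouched; if only the tag contents are wanted, joining the second element of each tuple from list_tags would do that instead.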

Note that you should use the SINGLELINE or DOTALL option if the content of your tags can span multiple lines, because without it the . does not match line breaks.
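
To illustrate the effect of that option, here is a small sketch in Python, where it is called re.DOTALL. The sample string and the tag name note are made up for the example:

```python
import re

pattern = r"\{([^}]+)}(.*?)\{/\1}"
multiline = "{note}first line\nsecond line{/note}"

# Without DOTALL, '.' stops at the newline, so the tag pair is not matched.
print(re.findall(pattern, multiline))             # []

# With DOTALL, '.' also matches newlines and the whole content is captured.
print(re.findall(pattern, multiline, re.DOTALL))  # [('note', 'first line\nsecond line')]
```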
