epitka

Reputation: 17627

Regex expression challenge

Can somebody write a regular expression that will:

  1. find a chunk that starts with [% and ends with %]
  2. within that chunk replace all xml special characters with:
    &quot; &apos; &lt; &gt; &amp;
  3. leave everything between <%= %> or <%# %> as is, except make sure there is a space after <%# or <%= and before %>. For example, <%=Integer.MaxValue%> should become <%= Integer.MaxValue %>

source:

[% 'test' <mtd:ddl id="asdf" runat="server"/> & <%= Integer.MaxValue%> %]

result:

&apos;test&apos; &lt;mtd:ddl id=&quot;asdf&quot; runat=&quot;server&quot;/&gt; &amp; <%= Integer.MaxValue %>

Upvotes: 1

Views: 858

Answers (3)

Ian Ringrose

Reputation: 51897

I think the code will be clearer without RegEx. I would write a separate method (and unit test) for each line of your spec, then chain them together.

See also "When not to use Regex in C# (or Java, C++ etc)"
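As a rough illustration of that idea, here is a minimal sketch with no regular expressions and one small method per spec line. The class and method names (ChunkEncoder, Process, EncodeChunk, NormalizeExpression) are made up for this example, and it assumes every [% has a matching %] and every <%= or <%# has a matching %>. It drops the [% %] delimiters, as in the question's expected result.

```csharp
using System;
using System.Security;
using System.Text;

public static class ChunkEncoder
{
    // Spec line 1: find each [% ... %] chunk and encode its body.
    public static string Process(string input)
    {
        var sb = new StringBuilder();
        int pos = 0;
        while (true)
        {
            int open = input.IndexOf("[%", pos, StringComparison.Ordinal);
            int close = open < 0 ? -1 : input.IndexOf("%]", open + 2, StringComparison.Ordinal);
            if (close < 0) break;
            sb.Append(input, pos, open - pos); // text outside chunks is copied unchanged
            sb.Append(EncodeChunk(input.Substring(open + 2, close - open - 2)));
            pos = close + 2;
        }
        return sb.Append(input, pos, input.Length - pos).ToString();
    }

    // Spec lines 2 and 3: escape plain text, normalise <%= %> and <%# %> blocks.
    static string EncodeChunk(string body)
    {
        var sb = new StringBuilder();
        int pos = 0;
        while (pos < body.Length)
        {
            int open = NextExpression(body, pos);
            int close = open < 0 ? -1 : body.IndexOf("%>", open + 3, StringComparison.Ordinal);
            if (close < 0) break;
            sb.Append(SecurityElement.Escape(body.Substring(pos, open - pos)));
            sb.Append(NormalizeExpression(body.Substring(open, 3),
                                          body.Substring(open + 3, close - open - 3)));
            pos = close + 2;
        }
        // Whatever follows the last expression block is plain text too.
        return sb.Append(SecurityElement.Escape(body.Substring(pos))).ToString();
    }

    // Index of the next "<%=" or "<%#" at or after start, or -1 if there is none.
    static int NextExpression(string s, int start)
    {
        int a = s.IndexOf("<%=", start, StringComparison.Ordinal);
        int b = s.IndexOf("<%#", start, StringComparison.Ordinal);
        if (a < 0) return b;
        if (b < 0) return a;
        return Math.Min(a, b);
    }

    // Spec line 3: exactly one space after the opening tag and before %>.
    static string NormalizeExpression(string tag, string expression)
    {
        return tag + " " + expression.Trim() + " %>";
    }
}
```

Each method can then be unit tested on its own, for example by calling ChunkEncoder.Process on the sample input from the question.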

Upvotes: 0

Ahmad Mageed

Reputation: 96477

I used two regular expressions: the first matches the general form, the second deals with the inner plumbing.

For the XML encoding I used an obscure little method from System.Security: SecurityElement.Escape. I fully qualified it in the code below for emphasis. Another option is the HttpUtility.HtmlEncode method, but that may require a reference to System.Web depending on where you're using this.

string[] inputs = { @"[% 'test' <mtd:ddl id=""asdf"" runat=""server""/> & <%= Integer.MaxValue %> %]",
    @"[% 'test' <mtd:ddl id=""asdf"" runat=""server""/> & <%=Integer.MaxValue %> %]",
    @"[% 'test' <mtd:ddl id=""asdf"" runat=""server""/> & <%# Integer.MaxValue%> %]",
    @"[% 'test' <mtd:ddl id=""asdf"" runat=""server""/> & <%#Integer.MaxValue%> %]",
};
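// Requires: using System; using System.Text.RegularExpressions;
// Outer pattern: one [% ... %] chunk, with the delimiters captured separately.
string pattern = @"(?<open>\[%)(?<content>.*?)(?<close>%])";
// Inner pattern: the plain text before a <%= or <%# block, then the block itself.
string expressionPattern = @"(?<content>.*?)(?<tag><%(?:[=#]))\s*(?<expression>.*?)\s*%>";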

foreach (string input in inputs)
{
    string result = Regex.Replace(input, pattern, m =>
        m.Groups["open"].Value +
        Regex.Replace(m.Groups["content"].Value, expressionPattern,
            expressionMatch =>
            System.Security.SecurityElement.Escape(expressionMatch.Groups["content"].Value) +
            expressionMatch.Groups["tag"].Value + " " +
            expressionMatch.Groups["expression"].Value +
            " %>"
        ) +
        m.Groups["close"].Value
    );

    Console.WriteLine("Before: {0}", input);
    Console.WriteLine("After: {0}", result);
}

Results:

Before: [% 'test' <mtd:ddl id="asdf" runat="server"/> & <%= Integer.MaxValue %> %]
After: [% &apos;test&apos; &lt;mtd:ddl id=&quot;asdf&quot; runat=&quot;server&quot;/&gt; &amp; <%= Integer.MaxValue %> %]
Before: [% 'test' <mtd:ddl id="asdf" runat="server"/> & <%=Integer.MaxValue %> %]
After: [% &apos;test&apos; &lt;mtd:ddl id=&quot;asdf&quot; runat=&quot;server&quot;/&gt; &amp; <%= Integer.MaxValue %> %]
Before: [% 'test' <mtd:ddl id="asdf" runat="server"/> & <%# Integer.MaxValue%> %]
After: [% &apos;test&apos; &lt;mtd:ddl id=&quot;asdf&quot; runat=&quot;server&quot;/&gt; &amp; <%# Integer.MaxValue %> %]
Before: [% 'test' <mtd:ddl id="asdf" runat="server"/> & <%#Integer.MaxValue%> %]
After: [% &apos;test&apos; &lt;mtd:ddl id=&quot;asdf&quot; runat=&quot;server&quot;/&gt; &amp; <%# Integer.MaxValue %> %]

EDIT: If you don't need to preserve the opening [% and closing %] in the final result, change the pattern to:

string pattern = @"\[%(?<content>.*?)%]";

Then be sure to remove references to m.Groups["open"].Value and m.Groups["close"].Value.
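One thing to watch for: the inner pattern above only escapes text that is followed by a <%= or <%# block, so text after the last block, or a chunk with no block at all, comes through unescaped. A possible variation, sketched below, is to let the inner pattern match either an expression block or a run of plain text. The names ChunkRegexEncoder, Chunk, Piece and Encode are invented for this sketch, RegexOptions.Singleline is added on the assumption that chunks may span several lines, and every <%= or <%# is assumed to have a closing %>.

```csharp
using System;
using System.Security;
using System.Text.RegularExpressions;

public static class ChunkRegexEncoder
{
    // Outer pattern: one [% ... %] chunk; Singleline lets '.' cross line breaks.
    static readonly Regex Chunk =
        new Regex(@"\[%(?<content>.*?)%\]", RegexOptions.Singleline);

    // Inner pattern: either an expression block, or a run of characters
    // that does not start an expression block.
    static readonly Regex Piece = new Regex(
        @"(?<tag><%[=#])\s*(?<expression>.*?)\s*%>|(?<text>(?:(?!<%[=#]).)+)",
        RegexOptions.Singleline);

    public static string Encode(string input)
    {
        return Chunk.Replace(input, m =>
            "[%" +
            Piece.Replace(m.Groups["content"].Value, p =>
                p.Groups["tag"].Success
                    // Expression block: keep the code, normalise the spacing.
                    ? p.Groups["tag"].Value + " " + p.Groups["expression"].Value + " %>"
                    // Plain text: XML-escape it.
                    : SecurityElement.Escape(p.Groups["text"].Value)) +
            "%]");
    }
}
```

As in the original code, the [% and %] delimiters are kept. Drop the two string literals around Piece.Replace if you don't want them in the result.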

Upvotes: 2

ebattulga

Reputation: 10981

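// Requires: using System.Text; using System.Text.RegularExpressions;
private void button1_Click(object sender, EventArgs e)
{
    // Lazy .*? so that two chunks on the same line are matched separately.
    Regex reg = new Regex(@"\[%(?<b1>.*?)%\]");
    richTextBox1.Text = reg.Replace(textBox1.Text, new MatchEvaluator(f1));
}

static string f1(Match m)
{
    StringBuilder sb = new StringBuilder();
    // Split yields the plain-text pieces; Matches yields the <% ... %> blocks between them.
    string[] a = Regex.Split(m.Groups["b1"].Value, "<%[^%>]*%>");
    MatchCollection col = Regex.Matches(m.Groups["b1"].Value, "<%[^%>]*%>");
    for (int i = 0; i < a.Length; i++)
    {
        // Escape & first so the entities added afterwards are not encoded twice.
        sb.Append(a[i].Replace("&", "&amp;").Replace("'", "&apos;").Replace("\"", "&quot;").Replace("<", "&lt;").Replace(">", "&gt;"));
        if (i < col.Count)
            sb.Append(col[i].Value); // code blocks are copied through untouched
    }
    return sb.ToString();
}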

Test1:

[% 'test' <mtd:ddl id="asdf" runat="server"/> & <%= Integer.MaxValue%> fdas<% hi%> 321%]

result:

 &apos;test&apos; &lt;mtd:ddl id=&quot;asdf&quot; runat=&quot;server&quot;/&gt; &amp; <%= Integer.MaxValue%> fdas<% hi%> 321

Upvotes: 1
