Reputation: 6531
I have the following text contained within a field in my DB:
[quote:5a7b87febe="mr smith"]This is some text.
This is more text on another line.[/quote:5a7b87febe]
I am trying to construct a regular expression that will convert any instance like the above into:
<div><h4>Posted by mr smith</h4>This is some text.
This is more text on another line.</div>
The pattern I have put together so far appears to work for instances where there is no line break in the enclosed text, but in the above example where there is text on another line, the pattern is not matched.
The C# code I have so far is:
var exp = new Regex(@"(\[quote)(:\w+=\"")(.*?)(\""\])(.*?)(\[\/quote)(:\w+\])");
str = exp.Replace(str, "<div><h4>Posted by $3</h4>$5</div>");
I am rubbish at Regular Expressions so am unsure how to handle 'any' characters that appear between the opening and closing 'quote' tags.
Ideally, I would also like the expression to handle nested instances of the above example if possible.
One other thing worth mentioning is that the series of characters that follow the 'quote:' tags are unique every time, and the name within quotes will also vary.
Upvotes: 3
Views: 317
Reputation: 30618
You need a backreference so that the closing tag must carry the same unique ID as the opening tag, and RegexOptions.Singleline so that '.' also matches line breaks. Something like this should work for you:
var regex = new Regex(@"\[(quote:[a-z0-9]+)(=""([^""]+)?"")?\](.*)\[/\1\]", RegexOptions.Singleline);
str = regex.Replace(str, "<div><h4>Posted by $3</h4>$4</div>");
This solution has been tested with your input, but not with nested quotes. This will be a bit trickier.
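For reference, here is a minimal self-contained sketch of the single-pass replacement run against the sample from the question. The class name QuoteDemo and the method Convert are hypothetical names chosen for illustration; the pattern and replacement string are the ones shown above. Note that the .NET enum member is spelled RegexOptions.Singleline.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper class, used only for this illustration.
public static class QuoteDemo
{
    // Group 1 captures "quote:<id>"; \1 forces the closing tag to repeat it.
    // RegexOptions.Singleline makes '.' match '\n', so multi-line bodies match.
    private static readonly Regex QuoteRegex = new Regex(
        @"\[(quote:[a-z0-9]+)(=""([^""]+)?"")?\](.*)\[/\1\]",
        RegexOptions.Singleline);

    // Replaces each matched quote block with the <div> markup (one pass only).
    public static string Convert(string input)
    {
        return QuoteRegex.Replace(input, "<div><h4>Posted by $3</h4>$4</div>");
    }

    public static void Main()
    {
        string input =
            "[quote:5a7b87febe=\"mr smith\"]This is some text.\n" +
            "This is more text on another line.[/quote:5a7b87febe]";

        Console.WriteLine(Convert(input));
    }
}
```

Without RegexOptions.Singleline the same pattern would fail on this input, because '.' stops at the line break between the two sentences.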
EDIT: After checking this solution with nested quotes, it does work. You just need to call it repeatedly until no more replacements are made. The first time it will match the outer quote and leave the inner quote intact inside the replacement. Sample code for doing this (untested):
// Repeatedly apply the replacement until the string stops changing
string last;
do
{
    last = str;
    str = regex.Replace(str, "<div><h4>Posted by $3</h4>$4</div>");
} while (last != str);
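The loop above can be sketched as a complete program with a nested quote as input. The names NestedQuoteDemo and ConvertAll, and the IDs aaa111 and bbb222, are made up for this example; it assumes, as stated in the question, that every quote has its own unique ID.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper class, used only for this illustration.
public static class NestedQuoteDemo
{
    private static readonly Regex QuoteRegex = new Regex(
        @"\[(quote:[a-z0-9]+)(=""([^""]+)?"")?\](.*)\[/\1\]",
        RegexOptions.Singleline);

    // Re-applies the replacement until a pass leaves the text unchanged.
    // Each pass converts the outermost remaining quotes; inner quotes are
    // carried through in $4 and picked up by the next pass.
    public static string ConvertAll(string input)
    {
        string current = input;
        string last;
        do
        {
            last = current;
            current = QuoteRegex.Replace(current, "<div><h4>Posted by $3</h4>$4</div>");
        } while (last != current);
        return current;
    }

    public static void Main()
    {
        string input =
            "[quote:aaa111=\"outer\"]Before " +
            "[quote:bbb222=\"inner\"]Nested text[/quote:bbb222]" +
            " after[/quote:aaa111]";

        Console.WriteLine(ConvertAll(input));
    }
}
```

For this input the first pass should convert the outer quote, the second pass the inner one, and the third pass changes nothing, which ends the loop.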
Upvotes: 5