Reputation: 2556
(\[(c|C)=)(#?([a-fA-F0-9]{1,2}){3})\](.*)\[/(c|C)\]
I want this expression to match text like: "This is [c=FFFFFF]white text[/c] and [C=#000]black text[/C]."
It does match a single BB-code tag on its own, but if several follow each other (as in the example), it produces one match spanning both BB-code sequences (from [c=FFFFFF]wh... to ...ck text[/C]).
Why is this happening? Also, how do I make the dot (.) include newlines in C#?
Upvotes: 1
Views: 433
Reputation: 20769
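The .* in your pattern is greedy, so it runs to the last [/c] in the input. You need a lazy quantifier so that each match stops at the first closing tag.
Try this:
\[c=(#?.*?)\](.*?)\[/c\] or
\[c=(#?\w*?)\](\w*?)\[/c\]
Note that the second form only matches bodies made of word characters, so it will not match text containing spaces such as "white text".
You should also set the options on your Regex object to ignore case (RegexOptions.IgnoreCase).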
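To make the greedy/lazy difference concrete, here is a minimal sketch. The class name ColorTagDemo, its constants, and the Bodies helper are made up for illustration, and the color group is narrowed to hex digits as an assumption; only the Regex calls come from .NET.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class ColorTagDemo
{
    // Greedy body: (.*) runs to the LAST [/c] it can reach on the line.
    public const string Greedy = @"\[c=(#?[a-f0-9]+)\](.*)\[/c\]";

    // Lazy body: (.*?) stops at the FIRST [/c] it can reach.
    public const string Lazy = @"\[c=(#?[a-f0-9]+)\](.*?)\[/c\]";

    // Returns the tag bodies (capture group 2) found by the given pattern.
    // IgnoreCase lets [c=...] and [C=...] both match.
    public static string[] Bodies(string pattern, string input)
    {
        return Regex.Matches(input, pattern, RegexOptions.IgnoreCase)
                    .Cast<Match>()
                    .Select(m => m.Groups[2].Value)
                    .ToArray();
    }
}
```

With the example sentence from the question, Bodies(ColorTagDemo.Greedy, input) should return a single body that swallows everything between the first opening tag and the last closing tag, while Bodies(ColorTagDemo.Lazy, input) should return "white text" and "black text" separately.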
Upvotes: 0
Reputation: 17152
If you don't care about nested tags, you can do this:
(\[[cC]=)(#?([a-fA-F0-9]{3}){1,2})\](.*?)\[/[cC]\]
// the ? in (.*?) makes the match lazy
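As a rough sketch of how that pattern could be used, the snippet below pulls out each color and its text. The class ColorTagGroups and its Parse method are hypothetical names; the group numbers follow from counting the opening parentheses in the pattern (group 2 is the color, group 4 is the text). Note also that ([a-fA-F0-9]{3}){1,2} accepts exactly 3 or 6 hex digits, whereas the ([a-fA-F0-9]{1,2}){3} in the question also accepts 4 or 5.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ColorTagGroups
{
    // Pattern from the answer above, unchanged.
    public const string Pattern =
        @"(\[[cC]=)(#?([a-fA-F0-9]{3}){1,2})\](.*?)\[/[cC]\]";

    // Returns (color, text) pairs for every non-nested tag in the input.
    public static List<KeyValuePair<string, string>> Parse(string input)
    {
        var result = new List<KeyValuePair<string, string>>();
        foreach (Match m in Regex.Matches(input, Pattern))
        {
            // Group 2 holds the color (with optional #), group 4 the body.
            result.Add(new KeyValuePair<string, string>(
                m.Groups[2].Value, m.Groups[4].Value));
        }
        return result;
    }
}
```

For the sentence in the question, Parse should yield the pairs (FFFFFF, white text) and (#000, black text); a tag such as [c=FFFF]x[/c] should yield nothing, because four hex digits is neither 3 nor 6.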
If you want to handle nested tags with regex, check this article on CodeProject.
Upvotes: 3
Reputation: 2253
Regex is a quick and dirty way to do this, and the solution here is to use .*? rather than just .*. However, if you want a more robust solution, it is probably easier without regex. C# regexes happen to be able to match nested structures, but that doesn't mean it's actually easy. It would be better to use a lexer and parser and construct a DOM. Most likely the code will be easier to read and maintain.
Upvotes: 0
Reputation: 96
The dot matches newline characters if you set the option RegexOptions.Singleline (more on that here).
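A small sketch of the effect, assuming a tag body that spans two lines. The class MultilineBodyDemo and its FirstBody helper are illustrative names, not part of any library. Keep in mind that RegexOptions.Multiline is a different option: it changes how ^ and $ behave, not what the dot matches.

```csharp
using System.Text.RegularExpressions;

public static class MultilineBodyDemo
{
    public const string Pattern = @"\[c=(#?[a-f0-9]+)\](.*?)\[/c\]";

    // Returns the first tag body (group 2), or null when nothing matches.
    public static string FirstBody(string input, RegexOptions options)
    {
        Match m = Regex.Match(input, Pattern, options);
        return m.Success ? m.Groups[2].Value : null;
    }
}
```

For the input "[c=FFF]line one\nline two[/c]", FirstBody with only RegexOptions.IgnoreCase should find no match, because the dot stops at the newline. Passing RegexOptions.IgnoreCase | RegexOptions.Singleline should return both lines of the body.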
Upvotes: 2