Reputation: 32179
I have strings in the form: "[user:fred][priority:3]Lorem ipsum dolor sit amet." where each area enclosed in square brackets is a tag (in the format [key:value]). I need to be able to remove a specific tag, given its key, with the following extension method:
public static void RemoveTagWithKey(this string message, string tagKey) {
    if (message.ContainsTagWithKey(tagKey)) {
        var regex = new Regex(@"\[" + tagKey + @":[^\]]");
        message = regex.Replace(message, string.Empty);
    }
}
public static bool ContainsTagWithKey(this string message, string tagKey) {
    return message.Contains(string.Format("[{0}:", tagKey));
}
Only the tag with the specified key should be removed from the string. My regex doesn't work because it's daft. I need help to write it properly. Alternatively, an implementation without regex is welcome.
Upvotes: 2
Views: 176
Reputation: 13024
I know there are much more feature-rich tools out there, but I like the simplicity and cleanliness of Code Architects Regex Tester (aka YART: Yet Another Regex Tester). Shows groups and captures in a tree view, quite fast, very small, open source. It also generates code in C++, VB, and C# and can automatically escape or unescape regexes for these languages. I dump it in my VS tools folder (C:\Program Files\Microsoft Visual Studio 9.0\Common7\Tools) and set a menu item to it in the Tools menu with Tools > External Tools so I can fire it up quickly from inside VS.
Regexes can be really hard to write sometimes and I know it really helps to be able to test the regex and see the results as you go.
Another really popular (but not free) option is RegexBuddy.
Upvotes: 3
Reputation: 75222
I think this is the regex you're looking for:
string regex = @"\[" + tagKey + @":[^\]]+\]";
Also, you don't need to do a separate check to see if there are tags of that type. Just do a regex replace; if there are no matches, the original string is returned.
public static string RemoveTagWithKey(string message, string tagKey) {
    // Match "[tagKey:", then one or more characters that are not "]", then the closing "]".
    string regex = @"\[" + tagKey + @":[^\]]+\]";
    return Regex.Replace(message, regex, string.Empty);
}
You seem to be writing an extension method, but I wrote this as a static utility method to keep things simple.
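If you do want the extension-method form, something like the following should work. Treat it as a sketch rather than a drop-in: the class name TagExtensions is made up, and the Regex.Escape call is my own addition, on the assumption that a key could contain regex metacharacters such as '.' or '+'. Note that it returns the new string, because strings are immutable and assigning to the parameter inside the method has no effect on the caller's variable.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical container class; extension methods must live in a static class.
public static class TagExtensions
{
    // Returns a copy of message with every [tagKey:value] tag removed.
    // If no tag with that key is present, the original text comes back unchanged.
    public static string RemoveTagWithKey(this string message, string tagKey)
    {
        // Regex.Escape keeps characters like '.' in the key from being read as regex syntax.
        string pattern = @"\[" + Regex.Escape(tagKey) + @":[^\]]+\]";
        return Regex.Replace(message, pattern, string.Empty);
    }
}
```

You would call it as message = message.RemoveTagWithKey("priority"); which, for the sample string in the question, leaves "[user:fred]Lorem ipsum dolor sit amet.". Because the pattern uses +, a tag with an empty value such as [priority:] is left in place; change + to * if those should go too.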
Upvotes: 1
Reputation: 3386
If you want to do this without a Regex it isn't difficult. You're already searching for a specific tag key, so you can just search for "[" + tagKey + ":", then search from there for the closing "]", and remove everything between those offsets. Something like...
int posStart = message.IndexOf("[" + tagKey + ":");
if (posStart >= 0)
{
    int posEnd = message.IndexOf("]", posStart);
    if (posEnd > posStart)
    {
        // +1 so that the closing "]" is removed along with the rest of the tag
        message = message.Remove(posStart, posEnd - posStart + 1);
    }
}
Is that better than a Regex solution? Since you're only looking for a specific key I think it probably is, on the grounds of simplicity. I love Regexes but they're not always the clearest answer.
Edit: Another reason the IndexOf() solution could be seen as better is that it means there is only one rule for finding the start of the tag, whereas the original code uses a Contains() which searches for something like '[tag:' and then uses a regex which uses a slightly different expression to do the substitution / removal. In theory you could have text which matches one criterion but not the other.
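For what it's worth, the IndexOf() approach could be wrapped up as a method along these lines. This is only a sketch: the names TagStringExtensions and RemoveTagWithKeyByIndex are made up for illustration, and I'm assuming that only the first tag with the given key should be removed and that an unterminated tag should leave the message untouched.

```csharp
using System;

// Hypothetical container class for the extension method.
public static class TagStringExtensions
{
    // Removes the first [tagKey:value] tag found in message.
    // Returns message unchanged if the tag is missing or has no closing bracket.
    public static string RemoveTagWithKeyByIndex(this string message, string tagKey)
    {
        // The single rule that defines where a tag starts.
        string opening = "[" + tagKey + ":";
        int posStart = message.IndexOf(opening, StringComparison.Ordinal);
        if (posStart < 0)
        {
            return message; // no tag with this key
        }

        // Look for the closing bracket only after the opening marker.
        int posEnd = message.IndexOf(']', posStart + opening.Length);
        if (posEnd < 0)
        {
            return message; // unterminated tag: leave the text alone
        }

        // +1 so that the closing bracket itself is removed too.
        return message.Remove(posStart, posEnd - posStart + 1);
    }
}
```

Because the same opening marker is used both to decide whether the tag exists and to locate it, there is no way for the "contains" check and the removal to disagree.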
Upvotes: 1
Reputation: 310897
Try this instead:
new Regex(@"\[" + tagKey + @":[^\]]+\]");
I added + after the [^\]] character class, so that it matches one or more characters that are not a closing square bracket, and a trailing \] so that the closing bracket is removed along with the rest of the tag.
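To see why the quantifier matters, here is a small side-by-side sketch. The class and method names are invented for the comparison; the first method uses the pattern from the question as posted, the second uses a quantified class followed by a closing bracket.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, only here to compare the two patterns.
public static class PatternComparison
{
    // Pattern from the question: [^\]] matches exactly one character after the colon.
    public static string RemoveWithSingleCharClass(string message, string tagKey)
    {
        return Regex.Replace(message, @"\[" + tagKey + @":[^\]]", string.Empty);
    }

    // [^\]]+ consumes the whole value and \] consumes the closing bracket.
    public static string RemoveWholeTag(string message, string tagKey)
    {
        return Regex.Replace(message, @"\[" + tagKey + @":[^\]]+\]", string.Empty);
    }
}
```

With the sample string from the question and the key "user", the first method only strips "[user:f" and returns "red][priority:3]Lorem ipsum dolor sit amet.", while the second returns "[priority:3]Lorem ipsum dolor sit amet.".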
Upvotes: 1