Reputation: 4488
I'm trying to write a regex that will extract clean values from a delimited list. The catch is that the list could be delimited by different symbols or words. The captured values will be trimmed in the code, so spaces don't matter.
Input:
English (UK), French* , German and Polish & Russian; Portugese and Italian
Regex I have so far:
\A(?:(?<Value>[^,;&*]+)[,;&\s*]*)*\Z
The delimiters I'm expecting are , ; and &. I included the * because I want it excluded from the captured value.
Captured values:
English (UK), French, German and Polish, Russian, Portugese and Italian
Expected values:
English (UK), French, German, Polish, Russian, Portugese, Italian
The problem I have is that I can't get "and" to be treated as a delimiter.
Upvotes: 0
Views: 163
Reputation: 55609
This is what I came up with:
\A(?:(?<Value>(?:[^,;&*\s]|\s(?!and))+)(?:(?:and|[,;&\s*])*))*\Z
Explanation:
(?:...) is a non-capturing group: it doesn't change the match, it just doesn't store the result in a group.
(?!...) is a negative lookahead: it matches if the characters that follow don't match the given pattern.
Basically, this only matches whitespace as part of Value if "and" doesn't follow it, and it includes "and" in the separator.
This seems awfully complicated; you may want to replace " and " with a separator and use your current expression.
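For anyone wanting to try the pattern, here is a minimal sketch of reading the values back in C#. It assumes .NET's System.Text.RegularExpressions; the class and method names (ValueExtractor, Extract) are made up for illustration. The key point is that a repeated named group keeps all of its matches in Group.Captures, not just the last one.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class ValueExtractor
{
    // The pattern from the answer above, unchanged.
    private const string Pattern =
        @"\A(?:(?<Value>(?:[^,;&*\s]|\s(?!and))+)(?:(?:and|[,;&\s*])*))*\Z";

    // Returns every capture of the repeated "Value" group,
    // or an empty list if the input does not match at all.
    public static List<string> Extract(string input)
    {
        var values = new List<string>();
        Match match = Regex.Match(input, Pattern);
        if (!match.Success)
            return values;

        // .NET stores each repetition of the group in Group.Captures.
        foreach (Capture capture in match.Groups["Value"].Captures)
        {
            // A capture can keep a trailing space (e.g. "Polish " before "&"), so trim it.
            values.Add(capture.Value.Trim());
        }
        return values;
    }
}
```

Calling ValueExtractor.Extract on the input from the question should give the seven expected values. One caveat: the lookahead stops at any whitespace followed by the letters "and", so a value containing a word that merely starts with "and" would also be split; adding \b after "and" in both places is one possible way to tighten it.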
Upvotes: 1
Reputation:
Or just do this to your current result:
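// Replace " and " including its surrounding spaces, so words that contain "and" (e.g. "Finland") are left intact.
desiredResult = currentResult.Replace(" and ", ", ");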
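A variation on the same idea is to do the replacement on the input before matching, and then run the question's original pattern untouched. The sketch below assumes that "and" always appears as a separate word with whitespace on both sides; the names AndNormalizer and Extract are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class AndNormalizer
{
    // The question's original pattern, unchanged.
    private const string OriginalPattern = @"\A(?:(?<Value>[^,;&*]+)[,;&\s*]*)*\Z";

    public static List<string> Extract(string input)
    {
        // Turn the standalone word "and" (whitespace on both sides) into a comma.
        // Words that merely contain "and", such as "Finland", are not affected.
        string normalized = Regex.Replace(input, @"\s+and\s+", ", ");

        var values = new List<string>();
        Match match = Regex.Match(normalized, OriginalPattern);
        if (match.Success)
        {
            // Each repetition of the named group is kept in Group.Captures.
            foreach (Capture capture in match.Groups["Value"].Captures)
                values.Add(capture.Value.Trim());
        }
        return values;
    }
}
```

This keeps the regex simple at the cost of one extra pass over the string, which is unlikely to matter for short lists like this one.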
Upvotes: 0
Reputation: 4892
I think it is not necessary to use Regex here:
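using System;
string str = "English (UK), French* , German and Polish & Russian; Portugese and Italian";
// " and " (with its surrounding spaces) is a separator too, so words that contain "and" are not split.
string[] results = str.Split(new string[] { ",", ";", "&", "*", " and " }, StringSplitOptions.RemoveEmptyEntries);
foreach (string s in results)
    // Splitting on "*" leaves a whitespace-only entry after "French"; skip it and trim the rest.
    if (!string.IsNullOrWhiteSpace(s))
        Console.WriteLine(s.Trim());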
Upvotes: 1