Reputation: 1198
Alright, I'm warning you in advance, my understanding of Regular Expressions is extremely limited (I've tried my best to learn them over the years, but to be honest, I think they just frighten me.)
Let's say I have the following string:
string keyValues = "CustomerId=1||OrderId=12||UserId=a1dcd568-f129-419b-b51e-be2dbb67de0f";
This string represents key-value pairs, delimited by a user-defined string (in this case ||), e.g. key1=value1||key2=value2. I am trying to extract the keys out of this string and store them in an array. That array would look like this:
{"CustomerId", "OrderId", "UserId"}
The best option I can think of is to use regular expressions (if someone has a better solution, please share). Here's what I'm trying to do:
string delimiter = "||";
string[] keys = Regex.Split(keyValues, "=.*" + delimiter);
I may be wrong, but the way I understand it, that regular expression is supposed to find a string that starts with = and ends with delimiter, with any number of any characters in between. That would split the string at those positions, leaving me with the original keys. Instead, my keys array looks like this:
{"", "C", "u", "s", "t", "o", "m", "e", "r", "I", "d", "", "", ...}
As you can see, the =value|| part is stripped away. Can anyone tell me what I'm doing wrong?
EDIT
In my case, the delimiter || is a variable. I didn't mention this only because I thought I would be able to replace any references to || with delimiter. From the majority of the answers given, I now see that that is an important detail.
Upvotes: 0
Views: 1111
Reputation: 368894
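| has special meaning in a regular expression (patA|patB matches either patA or patB), so it has to be escaped. Unescaped, the pattern =.*|| is an alternation whose last two alternatives are empty. An empty alternative matches at every position, which is why the string is split between every character.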
Using a non-greedy match (.*?):
string delimiter = "||";
string[] keys = Regex.Split(keyValues, @"=.*?" + Regex.Escape(delimiter));
This will give you {"CustomerId", "OrderId", "UserId=a1dcd568-f129-419b-b51e-be2dbb67de0f"}. The last element keeps its value because no delimiter follows it.
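To make the difference between the greedy and the non-greedy quantifier concrete, here is a minimal sketch. The class and method names (DelimiterSplitDemo, SplitKeys) and the lazy flag are made up for illustration; only Regex.Split and Regex.Escape come from the framework.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DelimiterSplitDemo
{
    // Hypothetical helper: splits on "=<value><delimiter>" with the delimiter escaped.
    // 'lazy' selects between the non-greedy (.*?) and the greedy (.*) quantifier.
    public static string[] SplitKeys(string keyValues, string delimiter, bool lazy)
    {
        string quantifier = lazy ? ".*?" : ".*";

        // Regex.Escape turns "||" into @"\|\|" so that it is matched literally.
        string pattern = "=" + quantifier + Regex.Escape(delimiter);

        return Regex.Split(keyValues, pattern);
    }
}
```

With the sample string and lazy set to true, this should return the three elements shown above. With lazy set to false, the greedy .* runs from the first = to the last ||, so only "CustomerId" and the trailing "UserId=..." element remain.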
Regex.Matches with lookaround assertions (a lookbehind for the start of the string or the delimiter, and a lookahead for =) is more appropriate:
string delimiter = "||";
string keyValues = "CustomerId=1||OrderId=12||UserId=a1dcd568-f129-419b-b51e-be2dbb67de0f";
string pattern = @"(?<=^|" + Regex.Escape(delimiter) + @")\w+(?==)";
var keys = Regex.Matches(keyValues, pattern);
BTW, use verbatim string literals (@"verbatim string literal") when writing regular expressions.
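Regex.Matches returns a MatchCollection rather than a string[], so one more step is needed to get the array the question asks for. The following is a sketch only; KeyExtractor and ExtractKeys are illustrative names, not part of the original answer.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class KeyExtractor
{
    // Hypothetical helper; the name and signature are illustrative only.
    public static string[] ExtractKeys(string keyValues, string delimiter)
    {
        // A key is a run of word characters that follows the start of the
        // string or the (escaped) delimiter, and is followed by '='.
        string pattern = @"(?<=^|" + Regex.Escape(delimiter) + @")\w+(?==)";

        // MatchCollection is only a non-generic IEnumerable on older frameworks,
        // so Cast<Match>() is applied before projecting each match to its text.
        return Regex.Matches(keyValues, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }
}
```

Because the delimiter is passed in and escaped, the same helper should also work for other user-defined delimiters, as long as keys consist of word characters only.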
Upvotes: 3
Reputation: 700152
An alternative is to do this without a regular expression, as the string operations are pretty basic:
string[] keys =
    keyValues.Split(new string[] { "||" }, StringSplitOptions.None)
             .Select(s => s.Substring(0, s.IndexOf('=')))
             .ToArray();
Save regular expressions for the advanced string operations. :)
(When I tested the performance of this solution against a regular expression, it turned out to be about 40 times faster.)
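Since the question's edit says the delimiter is a variable, the same idea can be wrapped in a small helper. This is a sketch under assumptions: PlainSplitKeys and ExtractKeys are made-up names, and skipping segments that contain no = is a choice made here, not something the original answer specifies.

```csharp
using System;
using System.Linq;

public static class PlainSplitKeys
{
    // Hypothetical helper; the name and signature are illustrative only.
    public static string[] ExtractKeys(string keyValues, string delimiter)
    {
        return keyValues
            // Split on the literal delimiter string; no regex escaping is needed here.
            .Split(new[] { delimiter }, StringSplitOptions.RemoveEmptyEntries)
            // Assumption: segments without '=' are skipped instead of throwing,
            // because Substring(0, -1) would raise ArgumentOutOfRangeException.
            .Where(pair => pair.IndexOf('=') >= 0)
            // Keep only the text before the first '='.
            .Select(pair => pair.Substring(0, pair.IndexOf('=')))
            .ToArray();
    }
}
```

Calling PlainSplitKeys.ExtractKeys(keyValues, delimiter) with the sample string and "||" should give the three keys from the question.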
Upvotes: 1
Reputation:
Split on @"=[^|]*(?:\|\||$)"
If you need more assurance, use @"=[^=|]*(?:\|\||$)"
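Edited to consume the end of the string where no delimiter exists.
In C#, keep only the non-blank elements of the result, because the match at the end of the string leaves a trailing empty element.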
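A minimal sketch of that split followed by the filtering step could look like this. SplitWithEndAnchor and ExtractKeys are illustrative names, and the pattern is hard-wired to the || delimiter exactly as in the answer.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class SplitWithEndAnchor
{
    // Hypothetical helper; the name is illustrative only.
    public static string[] ExtractKeys(string keyValues)
    {
        // Each match consumes "=value" plus either the "||" delimiter or the end of input.
        string[] parts = Regex.Split(keyValues, @"=[^|]*(?:\|\||$)");

        // A match at the very end of the input leaves a trailing empty string; drop it.
        return parts.Where(p => p.Length > 0).ToArray();
    }
}
```

Regex.Split has no counterpart to StringSplitOptions.RemoveEmptyEntries, which is why the empty entries are filtered with LINQ here.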
Upvotes: 0
Reputation: 19423
If you just care for the keys, why not try to use a match instead of a split using:
@"[^=|]+(?==)"
If the key can't contain an equal sign = or a vertical bar |, then the above expression will match one or more characters that are not = or | and that are followed by an equal sign =, thus matching the keys.
In C#:
var input = "CustomerId=1||OrderId=12||UserId=a1dcd568-f129-419b-b51e-be2dbb67de0f";
var results = Regex.Matches(input, @"[^=|]+(?==)");
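To turn those matches into a string array, something along these lines should work. It is a sketch, and MatchKeysDemo and ExtractKeys are made-up names.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class MatchKeysDemo
{
    // Hypothetical helper; the name is illustrative only.
    public static string[] ExtractKeys(string input)
    {
        // Match runs of characters other than '=' and '|' that are followed by '='.
        return Regex.Matches(input, @"[^=|]+(?==)")
                    .Cast<Match>()          // MatchCollection -> IEnumerable<Match>
                    .Select(m => m.Value)   // keep only the matched text
                    .ToArray();
    }
}
```

One caveat worth checking against your data: the pattern only looks at what follows the candidate, so a value that itself contains = (for example a made-up input like "Token=abc==||Id=1") would also yield "abc" as a key.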
Upvotes: 2