Reputation: 2285
When parsing an FTX (free text) string, I need to split it using + as a delimiter, but only when it's not preceded by the escape character (say, ?). So the string nika ?+ marry = love+sandra ?+ alex = love should be parsed into two strings: nika + marry = love and sandra + alex = love. Using String.Split('+') is obviously not enough. Can I achieve it somehow?
One way, it seems to me, is to replace occurrences of ?+ with some unique character (or a sequence of characters), say, @#@, split using "+" as a delimiter and then replace @#@ back with +, but that's unreliable and wrong in any possible way I can think of.
? is used as an escape character only in combination with either : or +; in any other case it's viewed as a regular character.
Upvotes: 5
Views: 181
Reputation: 342
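using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main()
    {
        string s = "nika ?+ marry = love+sandra ?+ alex = love";
        // "\?{0}" matches the empty string, so this splits on every '+', escaped or not.
        string[] result = Regex.Split(s, "\\?{0}\\+", RegexOptions.Multiline);
        // Rejoin with '\n' as a temporary separator (assumes the input contains no newlines).
        s = String.Join("\n", result);
        // A '?' directly before a separator marks an escaped '+': restore the '+'.
        Regex rgx = new Regex("\\?\\n");
        s = rgx.Replace(s, "+");
        // Split on the separators that remain, i.e. the unescaped '+' positions.
        result = Regex.Split(s, "\\n", RegexOptions.Multiline);
        foreach (string match in result)
        {
            Console.WriteLine("'{0}'", match);
        }
    }
}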
Outputs
'nika + marry = love'
'sandra + alex = love'
See https://dotnetfiddle.net/HkcQUw
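For comparison, here is a minimal sketch of a single-pass scanner that avoids the temporary separator altogether. It is not part of the answer above: the class and method names are hypothetical, and it assumes that only ?+ needs to be unescaped at this stage, so ?: is left untouched for any later split on : and a lone ? is kept as a regular character, as the question describes.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class FtxSplitDemo
{
    // Hypothetical helper: splits on '+' unless it is preceded by '?',
    // and turns each "?+" into a literal '+' in the output.
    public static List<string> SplitOnUnescapedPlus(string input)
    {
        var parts = new List<string>();
        var current = new StringBuilder();
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (c == '?' && i + 1 < input.Length && input[i + 1] == '+')
            {
                // Escaped delimiter: keep the '+' and skip over it.
                current.Append('+');
                i++;
            }
            else if (c == '+')
            {
                // Unescaped delimiter: close the current part.
                parts.Add(current.ToString());
                current.Clear();
            }
            else
            {
                // Any other character, including a lone '?' or "?:", is copied as-is.
                current.Append(c);
            }
        }
        parts.Add(current.ToString());
        return parts;
    }

    public static void Main()
    {
        foreach (string part in SplitOnUnescapedPlus("nika ?+ marry = love+sandra ?+ alex = love"))
        {
            Console.WriteLine("'{0}'", part);
        }
        // Prints:
        // 'nika + marry = love'
        // 'sandra + alex = love'
    }
}
```

Because the input is read once from left to right, there is no placeholder that could collide with real data, which was the concern raised in the question.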
Upvotes: 1
Reputation: 111840
A horrible regular expression to split it:
string str = "nika ?+ marry = love??+sandra ???+ alex = love";
string[] splitted = Regex.Split(str, @"(?<=(?:^|[^?])(?:\?\?)*)\+");
It splits on a + (\+) that is preceded by the beginning of the string (^) or a non-? character ([^?]) plus an even number of ? ((?:\?\?)*). There is liberal use of (?:) (non-capturing groups) because Regex.Split does funny things with capturing groups: any captured text ends up in the result array.
Note that I'm not doing the unescape! So in the end ?+ remains ?+.
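If the unescape step is wanted too, it could be sketched roughly as follows. This is an illustration, not part of the answer: the class and method names are hypothetical, and it assumes that ?? stands for a literal ?, which is what the even-number-of-? rule in the pattern implies but differs from the question's statement that a lone ? is a regular character.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class UnescapeDemo
{
    // Hypothetical helper: splits with the pattern shown above, then unescapes each part.
    public static string[] SplitAndUnescape(string input)
    {
        string[] parts = Regex.Split(input, @"(?<=(?:^|[^?])(?:\?\?)*)\+");
        // Assumption: "?+", "?:" and "??" are the only escape sequences.
        // Each one is replaced by its second character; matches are taken left to right.
        return parts.Select(p => Regex.Replace(p, @"\?([+:?])", "$1")).ToArray();
    }

    public static void Main()
    {
        foreach (string part in SplitAndUnescape("nika ?+ marry = love??+sandra ???+ alex = love"))
        {
            Console.WriteLine("'{0}'", part);
        }
        // Prints:
        // 'nika + marry = love?'
        // 'sandra ?+ alex = love'
    }
}
```

The replacement consumes escape pairs from left to right, so ???+ becomes ?+ (an escaped ? followed by an escaped +) rather than being misread as ? followed by an unescaped delimiter.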
Upvotes: 3