Reputation: 181
I am trying to parse a string by splitting it on a set of delimiters while keeping the delimiters in the result.
For example, from the string
if(a>b) write(a);
I want to get
if, (, a, >, b, ), write, (, a, ), ;
Here is what I've tried:
string pattern = "(" + String.Join("|", delimiters.Select(d =>Regex.Escape(d)).ToList()) + ")";
List<string> result = Regex.Split(line, pattern).ToList();
It works, but it fails in some cases. If I had the string
if(a>0) write("it is positive");
I would not like to get "it, is, positive" as three separate tokens (because space is a delimiter), but the single token "it is positive". How can I do this?
Upvotes: 3
Views: 64
Reputation: 626709
A C-style string literal, including escaped quotes inside it, can be matched with a well-known regex:
"[^"\\]*(?:\\.[^"\\]*)*"
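As a quick check of what that pattern accepts, here is a minimal sketch that runs it against an input containing escaped quotes. The class name StringLiteralDemo, the method FirstLiteral, and the sample input are made up for illustration; only the regex itself comes from this answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StringLiteralDemo
{
    // The pattern from above as a C# verbatim string ("" stands for one quote).
    public const string Literal = @"""[^""\\]*(?:\\.[^""\\]*)*""";

    // Returns the first double-quoted literal in the input,
    // or an empty string when there is none.
    public static string FirstLiteral(string input)
    {
        return Regex.Match(input, Literal).Value;
    }

    public static void Main()
    {
        // The escaped quotes inside the literal do not end the match early.
        Console.WriteLine(FirstLiteral(@"write(""say \""hi\"" now"");"));
        // prints: "say \"hi\" now"
    }
}
```

The [^"\\]* parts consume ordinary characters, and each \\. consumes a backslash together with whatever character follows it, which is why \" is skipped over instead of closing the literal.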
To incorporate it into your code, you just need to add the regex to the list of delimiters, but you need to place it as the first alternative in the capturing group.
var delimiters = new List<string> { " ", "(", ")", ">", "<", ",", ";"};
var line = "if(a>b) write(\"My new result\")";
var escaped_delimiters = new List<string>();
escaped_delimiters.Add(@"""[^""\\]*(?:\\.[^""\\]*)*""");
escaped_delimiters.AddRange(delimiters.Select(d => Regex.Escape(d)).ToList());
var pattern = "(" + String.Join("|", escaped_delimiters) + ")";
var result = Regex.Split(line, pattern).Where(x => !String.IsNullOrWhiteSpace(x)).ToList();
The Where clause above drops the empty and whitespace-only elements that Regex.Split produces, for example between two adjacent delimiters.
The result will be
if, (, a, >, b, ), write, (, "My new result", )
Upvotes: 2
Reputation: 174696
I suggest matching instead of splitting, using the regex below.
@"""[^""]*""|\w+|[^\w\s]"
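A minimal sketch of the matching approach, assuming the quantifier applies to \w only, so that a run of word characters forms one token and every punctuation character is its own match. The names MatchTokenizer and Tokenize are hypothetical; Regex.Matches is the standard .NET call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class MatchTokenizer
{
    // Assumed token pattern: a quoted string, a run of word characters,
    // or one single non-word, non-whitespace character.
    private const string TokenPattern = @"""[^""]*""|\w+|[^\w\s]";

    // Collects every match in order; whitespace between tokens is skipped
    // because no alternative can match it.
    public static List<string> Tokenize(string line)
    {
        return Regex.Matches(line, TokenPattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToList();
    }

    public static void Main()
    {
        var tokens = Tokenize("if(a>0) write(\"it is positive\");");
        Console.WriteLine(string.Join(" | ", tokens));
        // prints: if | ( | a | > | 0 | ) | write | ( | "it is positive" | ) | ;
    }
}
```

Unlike the split-based approach, this does not take a delimiter list: any punctuation character becomes a token, and the quoted-string alternative here does not handle escaped quotes inside the string.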
Upvotes: 1