user1012732

Reputation: 181

How can I split a string depending on its content?

I am trying to parse a string by splitting it on a set of delimiters while keeping the delimiters themselves in the result.

For example, from the string if(a>b) write(a); I want to get if,(,a,>,b,),write,(,a,),;

Here is what I've tried:

string pattern = "(" + String.Join("|", delimiters.Select(d => Regex.Escape(d)).ToList()) + ")";
List<string> result = Regex.Split(line, pattern).ToList();

This works for simple input, but it fails when the line contains a string literal. For the string if(a>0) write("it is positive"); I do not want to get "it,is,positive" (split because space is a delimiter); I want "it is positive" kept as a single token. How can I do this?

Upvotes: 3

Views: 64

Answers (2)

Wiktor Stribiżew

Reputation: 626709

Matching C-style string literals (double-quoted, with backslash escapes) can be achieved with a well-known regex:

"[^"\\]*(?:\\.[^"\\]*)*"

See regex demo
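As a rough illustration of what that pattern matches on its own, here is a minimal sketch. The class name StringLiteralDemo and the method FindLiterals are hypothetical names chosen for this example; only the regex itself comes from the answer.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class StringLiteralDemo
{
    // The pattern above written as a C# verbatim string, so each " is doubled.
    // It reads: opening quote, a run of ordinary characters, then any number of
    // (backslash + any character, followed by ordinary characters), closing quote.
    private static readonly Regex Literal =
        new Regex(@"""[^""\\]*(?:\\.[^""\\]*)*""");

    // Returns every double-quoted literal found in the input, quotes included.
    public static List<string> FindLiterals(string input)
    {
        return Literal.Matches(input)
                      .Cast<Match>()
                      .Select(m => m.Value)
                      .ToList();
    }
}
```

Calling StringLiteralDemo.FindLiterals on a line that contains two literals, one of them with escaped quotes inside, should return the two literals whole, because an escaped quote is consumed by the backslash branch rather than ending the match.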

To incorporate it into your code, you just need to add the regex to the list of delimiters, but you need to place it as the first alternative in the capturing group.

var delimiters = new List<string> { " ", "(", ")", ">", "<", ",", ";"};
var line = "if(a>b) write(\"My new result\")";
var escaped_delimiters = new List<string>();
escaped_delimiters.Add(@"""[^""\\]*(?:\\.[^""\\]*)*""");
escaped_delimiters.AddRange(delimiters.Select(d => Regex.Escape(d)).ToList());
var pattern = "(" + String.Join("|", escaped_delimiters) + ")";
var result = Regex.Split(line, pattern).Where(x => !String.IsNullOrWhiteSpace(x)).ToList();

See IDEONE demo

To drop empty and whitespace-only elements from the result, filter it with Where:

List<string> result = Regex.Split(line, pattern).Where(x => !string.IsNullOrWhiteSpace(x)).ToList();

For the sample line above, the result will be:

if, (, a, >, b, ), write, (, "My new result", )

Upvotes: 2

Avinash Raj

Reputation: 174696

I suggest matching the tokens instead of splitting, using the regex below.

@"""[^""]*""|\w+|[^\w\s]"
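
A minimal sketch of the matching approach, under some assumptions: the class name Tokenizer and the method Tokenize are hypothetical, and the pattern used here is my own variant that yields one token per match (a quoted literal, a run of word characters, or a single punctuation character), with the escape-aware literal pattern swapped in for the quoted-string alternative.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper illustrating "match tokens" instead of "split on delimiters".
public static class Tokenizer
{
    // Alternatives are tried left to right at each position:
    //   1. a double-quoted literal, backslash escapes allowed
    //   2. a run of word characters (identifiers, numbers)
    //   3. any single character that is neither a word character nor whitespace
    // Whitespace between tokens matches none of these, so it is skipped.
    private static readonly Regex Token =
        new Regex(@"""[^""\\]*(?:\\.[^""\\]*)*""|\w+|[^\w\s]");

    public static List<string> Tokenize(string line)
    {
        return Token.Matches(line)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToList();
    }
}
```

With this sketch, Tokenizer.Tokenize applied to if(a>0) write("it is positive"); should give if, (, a, >, 0, ), write, (, "it is positive", ), ; with no need for an explicit delimiter list. The trade-off is that every punctuation character becomes its own token, so multi-character operators such as >= would need their own alternative placed before the single-character one.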

Upvotes: 1
