Reputation: 1945
I'm currently replacing a very old (and long) C# string parsing class that I think could be condensed into a single regex statement. Being a newbie to Regex, I'm having some issues getting it working correctly.
Description of the possible input strings:
The input string can have up to three words separated by spaces. It can stop there, or it can have an = followed by any number of additional words separated by commas. The words can also be enclosed in quotes. If a word is in quotes and contains a space, it should NOT be split at the space.
Examples of input and expected output elements in the string array:
Input1: this is test
Output1: {"this", "is", "test"}
Input2: this is test=param1,param2,param3
Output2: {"this", "is", "test", "param1", "param2", "param3"}
Input3: use file "c:\test file.txt"=param1 , param2,param3
Output3: {"use", "file", "c:\test file.txt", "param1", "param2", "param3"}
Input4: log off
Output4: {"log", "off"}
And the most complex one:
Input5: use object "c:\test file.txt"="C:\Users\layer.shp" | ( object = 10 ),param2
Output5: {"use", "object", "c:\test file.txt", "C:\Users\layer.shp | ( object = 10 )", "param2"}
So to break this down: split by spaces for up to the first three words (keeping quoted words whole), and if there is an =, ignore the = and split by commas instead.
Here's the closest regex I've got:
\w+|"[\w\s\:\\\.]*"+([^,]+)
This seems to split the string based on spaces, and by commas after the =. However, it seems to include the = for some reason if one of the first three words is surrounded by quotes. Also, I'm not sure how to split by space only up to the first three words in the string, and the rest by comma if there is an =.
It looks like part of my solution is to use quantifiers with {}, but I've been unable to set it up properly.
Upvotes: 1
Views: 125
Reputation: 34421
Without Regex. Regex should only be used when string methods cannot do the job:
using System;
using System.Collections.Generic;
using System.Linq;

string[] inputs = {
    "this is test",
    "this is test=param1,param2,param3",
    "use file \"c:\\test file.txt\"=param1 , param2,param3",
    "log off",
    "use object \"c:\\test file.txt\"=\"C:\\Users\\layer.shp\" | ( object = 10 ),param2"
};
foreach (string input in inputs)
{
    List<string> splitArray;
    if (!input.Contains("="))
    {
        // No '=': split the whole string on spaces.
        splitArray = input.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).ToList();
    }
    else
    {
        // Split the part before the first '=' on spaces.
        // Note: this does not keep a quoted word containing a space in one piece.
        int equalPosition = input.IndexOf("=");
        splitArray = input.Substring(0, equalPosition).Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).ToList();
        // Split the part after the first '=' on commas, trimming spaces around each parameter.
        string end = input.Substring(equalPosition + 1);
        splitArray.AddRange(end.Split(new char[] { ',' }).Select(x => x.Trim()));
    }
    // Wrap every element that is not already quoted in quotes for display.
    string output = string.Join(",", splitArray.Select(x => x.Contains("\"") ? x : "\"" + x + "\""));
    Console.WriteLine(output);
}
Console.ReadLine();
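If quoted words with spaces need to stay in one piece (as in Input3 and Input5), a plain Split on spaces is not enough. One possible way to handle that, still without Regex, is to walk the string once and track whether the current character is inside quotes. The sketch below is only an illustration: ParseCommand is a made-up helper name, and it assumes that quote characters are dropped from the output, that the first = outside quotes switches from space-splitting to comma-splitting, and that empty elements are skipped.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper (not from the original code): quote-aware tokenizer.
static List<string> ParseCommand(string input)
{
    var tokens = new List<string>();
    var current = new StringBuilder();
    bool inQuotes = false;     // true while between a pair of double quotes
    bool afterEquals = false;  // true once the first '=' outside quotes was seen

    // Store the token collected so far (if any) and start a new one.
    void Flush()
    {
        string token = current.ToString().Trim();
        if (token.Length > 0) tokens.Add(token);
        current.Clear();
    }

    foreach (char c in input)
    {
        if (c == '"')
        {
            inQuotes = !inQuotes;          // assumption: quotes are not kept in the output
        }
        else if (inQuotes)
        {
            current.Append(c);             // spaces, commas and '=' inside quotes are literal
        }
        else if (!afterEquals && c == '=')
        {
            Flush();                       // first '=' ends the space-separated part
            afterEquals = true;
        }
        else if (c == (afterEquals ? ',' : ' '))
        {
            Flush();                       // separator: space before '=', comma after it
        }
        else
        {
            current.Append(c);
        }
    }
    Flush();
    return tokens;
}

string demo = "use file \"c:\\test file.txt\"=param1 , param2,param3";
foreach (string token in ParseCommand(demo))
{
    Console.WriteLine(token);
}
```

This does not enforce the "up to three words" limit before the =, and it does not handle escaped quotes inside a quoted word; both would need extra checks if the real input can contain them.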
Upvotes: 2