Reputation: 13
Here is an example of the string in question:
[952,M] [782,M] [782] {2[373,M]} [1470] [352] [235] [234] {3[610]}{3[380]} [128] [127]
I have added the spaces, but they do not really help with the breakdown. What I want to do is take each "field" in square brackets and add it to a string list. The next issue, which I can handle, is that some fields also have a comma-separated portion that I can split after the fact. The real problem lies in the curly braces, for instance {2[373,M]}.
The number before the square brackets is how many times the bracketed field repeats.
For the life of me, I cannot figure out a way to consistently split the line into a string list.
Quasi code follows:
for (i = 0 to string.length)
{
    if string.substring(i,1) = "["
        int start1 = i
    elseif string.substring(i,1) = "]"
        int end1 = i
    elseif string.substring(i,1) = "{"
        int start2 = i
    elseif string.substring(i,1) = "}"
        int end2 = i
}
I thought about using the code idea above to substring out each "field" but the curly braces also contain the square brackets. Any ideas would be greatly appreciated.
Upvotes: 1
Views: 142
Reputation: 528
You can use a regex.
Edited: this handles both the commas and the repetition:
var regex3 = new Regex(@"(\B\[([a-zA-Z0-9\,]+)\])|(\{(\d+)\[([a-zA-Z0-9\,]+)\]\})");
var stringOne = "[952,M] [782,M] [782] {2[373,M]} [1470] [352] [235] [234] {3[610]}{3[380]} [128] [127]";
var matches = regex3.Matches(stringOne);
var listStrings = new List<string>();
foreach (Match match in matches)
{
    var repetitor = 1;
    string value = null;
    // Group 1 is empty when the match came from the {n[...]} alternative
    if (match.Groups[1].Value == string.Empty)
    {
        repetitor = int.Parse(match.Groups[4].Value);
        value = match.Groups[5].Value;
    }
    else
    {
        value = match.Groups[2].Value;
    }
    // Split the comma-separated parts and add them once per repetition
    var values = value.Split(',');
    for (var i = 0; i < repetitor; i++)
    {
        listStrings.AddRange(values);
    }
}
Upvotes: 0
Reputation: 3231
If I understand you correctly, you want to split the characters surrounded by brackets, and when they have curly brackets repeat the content inside the specified number of times.
You can extract all the information you need with a regex, including the count that tells you how many times to repeat a bracketed field:
var input = @"[952,M] [782,M] [782] {2[373,M]}
[1470] [352] [235] [234] {3[610]}{3[380]} [128] [127]";
// Note: "(:?" opens an ordinary capturing group (with an optional ':'),
// so it counts toward the group numbers used below.
var pattern = @"((:?\{(\d+)(.*?)\})|(:?\[.*?\]))";
MatchCollection matches = Regex.Matches(input, pattern);
var ls = new List<string>();
foreach (Match match in matches)
{
    // Check whether the item has curly brackets.
    // The capture groups differ depending on whether curly brackets were present.
    // If there are curly brackets, then the 4th capture group
    // holds the square brackets and their content.
    if (match.Groups[4].Success)
    {
        var value = match.Groups[4].Value;
        // The "count" of the items will
        // be in the third capture group
        var count = int.Parse(match.Groups[3].Value);
        for (int i = 0; i < count; i++)
        {
            ls.Add(value);
        }
    }
    else
    {
        // Otherwise we know that the square bracket input
        // is in the first capture group
        ls.Add(match.Groups[1].Value);
    }
}
Here is a working fiddle of the solution: https://dotnetfiddle.net/4rQsDj
Here is the output:
[952,M]
[782,M]
[782]
[373,M]
[373,M]
[1470]
[352]
[235]
[234]
[610]
[610]
[610]
[380]
[380]
[380]
[128]
[127]
If you don't want the brackets, you can get rid of them by changing the regex pattern to (:?(:?\{(\d+)\[(.*?)\]\})|(:?\[(.*?)\])) and match.Groups[1].Value to match.Groups[6].Value.
Here is the working solution without square brackets: https://dotnetfiddle.net/OQwStf
Upvotes: 1
Reputation: 22876
var s = "[952,M] [782,M] [782] {2[373,M]} [1470] [352] [235] [234] {3[610]}{3[380]} [128] [127]";
var s2 = Regex.Replace(s, @"\{(\d+)(\[[^]]+\])\}", m => string.Concat(
    Enumerable.Repeat(m.Groups[2].Value, int.Parse(m.Groups[1].Value))));
var a = s2.Split("[] ".ToArray(), StringSplitOptions.RemoveEmptyEntries);
// s2 = "[952,M] [782,M] [782] [373,M][373,M] [1470] [352] [235] [234] [610][610][610][380][380][380] [128] [127]"
// a = {"952,M","782,M","782","373,M","373,M","1470","352","235","234","610","610","610","380","380","380","128","127"}
Upvotes: 1
Reputation: 67193
While you might be able to get by on RegEx, it may come up short if your needs grow too complex. So the code below shows the general approach I would take to accomplish this. It's a little quick and dirty but meets your requirements.
In addition, I have a parsing helper class that would make this code easier to write and more robust.
string input = "[952,M] [782,M] [782] {2[373,M]} [1470] [352] [235] [234] {3[610]}{3[380]} [128] [127]";
int pos = 0;

void Main()
{
    while (pos < input.Length)
    {
        SkipWhitespace();
        if (pos < input.Length && input[pos] == '{')
            ParseBrace();
        else if (pos < input.Length && input[pos] == '[')
            ParseBracket();
        else
            pos++; // skip any unexpected character so the loop always advances
    }
}

void SkipWhitespace()
{
    while (pos < input.Length && char.IsWhiteSpace(input[pos]))
        pos++;
}

// Parses {n[...]} and emits the bracketed field n times
void ParseBrace()
{
    Debug.Assert(pos < input.Length && input[pos] == '{');
    int pos2 = input.IndexOf('[', pos + 1);
    if (pos2 < 0)
        pos2 = input.Length;
    // The repeat count sits between '{' and '['
    int count = int.Parse(input.Substring(pos + 1, pos2 - pos - 1));
    for (int i = 0; i < count; i++)
    {
        // Rewind to the '[' so the same field is parsed again
        pos = pos2;
        ParseBracket();
    }
    pos2 = input.IndexOf('}', pos2 + 1);
    if (pos2 < 0)
        pos2 = input.Length;
    pos = pos2 + 1;
}

// Parses [...] and prints the text between the brackets
void ParseBracket()
{
    Debug.Assert(pos < input.Length && input[pos] == '[');
    int pos2 = input.IndexOf(']', pos + 1);
    if (pos2 < 0)
        pos2 = input.Length;
    Console.WriteLine(input.Substring(pos + 1, pos2 - pos - 1));
    pos = pos2 + 1;
}
Sample output:
952,M
782,M
782
373,M
373,M
1470
352
235
234
610
610
610
380
380
380
128
127
Upvotes: 1
Reputation: 3290
The regex below will handle both situations:
(?:\{([^\[]+)){0,1}\[([^\]]+)\]\}{0,1}
For fields without the curly braces, the first capture group will be empty. For fields with curly braces, the first capture group will contain your number of repeats. In both cases, the second capture group will contain the actual data. The link below shows a demo of this working:
Note, however, that you will have to handle the repetition yourself in the code that makes use of the regex.
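As a rough sketch of what that repetition handling might look like in C#, the snippet below applies the pattern above and adds each field to a list once per repeat. The class name FieldExpander and the method name Expand are made up for illustration, and the sketch assumes the repeat count is always a plain integer:

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class FieldExpander
{
    // Group 1 = optional repeat count, group 2 = field contents.
    private static readonly Regex FieldPattern =
        new Regex(@"(?:\{([^\[]+)){0,1}\[([^\]]+)\]\}{0,1}");

    public static List<string> Expand(string input)
    {
        var fields = new List<string>();
        foreach (Match match in FieldPattern.Matches(input))
        {
            // Group 1 only participates when the field was written as {n[...]};
            // otherwise the field appears exactly once.
            int repeat = match.Groups[1].Success
                ? int.Parse(match.Groups[1].Value)
                : 1;

            for (int i = 0; i < repeat; i++)
                fields.Add(match.Groups[2].Value);
        }
        return fields;
    }
}
```

Calling FieldExpander.Expand on the string from the question should give one list entry per field, with "373,M" appearing twice and "610" and "380" three times each; the comma-separated parts can then be split afterwards with Split(',').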
Upvotes: 1