pango89

Reputation: 295

Create array of strings from a larger string after delimiting

I have an input string that looks like this: var input = "AB-PQ-EF=CD-IJ=XY-JK". Is there a way, using string.Split() and LINQ in C#, to get an array of strings like var output = ["AB-PQ", "PQ-EF", "EF=CD", "CD-IJ", "IJ=XY", "XY-JK"], where each element joins two adjacent tokens with the delimiter that sits between them? Currently I do this conversion manually by iterating over the input string.

Upvotes: 5

Views: 165

Answers (7)

user3213856

Reputation: 21

        string input = "AB-PQ-EF=CD-IJ=XY-JK";
        var result = new Regex(@"(?<![A-Z])(?=([A-Z]+[=-][A-Z]+))").Matches(input)
            .Cast<Match>().Select(m => m.Groups[1].Value).ToArray();
        foreach (var item in result)
        {
            Console.WriteLine(item);
        }

Upvotes: 0

qxg

Reputation: 7036

I have been learning Haskell recently, so here is a recursive solution.

static IEnumerable<string> SplitByPair(string input, char[] delimiter)
{
    var sep1 = input.IndexOfAny(delimiter);
    if (sep1 == -1)
    {
        yield break;
    }
    var sep2 = input.IndexOfAny(delimiter, sep1 + 1);
    if (sep2 == -1)
    {
        yield return input;
    }
    else
    {
        yield return input.Substring(0, sep2);
        foreach (var other in SplitByPair(input.Substring(sep1 + 1), delimiter))
        {
            yield return other;
        }
    }
}

The good points are:

  • It's lazy.
  • It's easy to extend to other conditions and other data types. However, that is a little harder in C#, because C# lacks Haskell's List.span and pattern matching.
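
To illustrate the laziness point, here is a minimal sketch that repeats the iterator above as a local function (so the snippet stands alone) and asks for only the first two pairs. The variable names pairs and firstTwo are just for illustration, and top-level statements assume C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Nothing is computed here; the iterator body runs only when enumerated.
var pairs = SplitByPair("AB-PQ-EF=CD-IJ=XY-JK", new[] { '-', '=' });

// Take(2) stops the enumeration after two results, so the rest of the
// string is never split.
var firstTwo = pairs.Take(2).ToArray();
Console.WriteLine(string.Join(", ", firstTwo)); // AB-PQ, PQ-EF

// Same recursive iterator as in the answer, repeated for self-containment.
static IEnumerable<string> SplitByPair(string input, char[] delimiter)
{
    var sep1 = input.IndexOfAny(delimiter);
    if (sep1 == -1)
    {
        yield break;
    }
    var sep2 = input.IndexOfAny(delimiter, sep1 + 1);
    if (sep2 == -1)
    {
        yield return input;
    }
    else
    {
        yield return input.Substring(0, sep2);
        foreach (var other in SplitByPair(input.Substring(sep1 + 1), delimiter))
        {
            yield return other;
        }
    }
}
```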

Upvotes: 0

atoms

Reputation: 3093

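Could you adapt something like this? You would just need to change the modulus factors. As written, it assumes two-character tokens and produces non-overlapping chunks (AB-PQ, EF=CD, IJ=XY, JK) rather than the overlapping pairs in the question.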

        List<string> lsOut = new List<string>() { };

        string sInput = "AB-PQ-EF=CD-IJ=XY-JK";
        string sTemp = "";


        for (int i = 0; i < sInput.Length; i++)
        {

            if ( (i + 1) % 6 == 0)
            {
                continue;
            }

            // add to temp
            sTemp += sInput[i];

            // multiple of 5, add all the temp to list
            if ( (i + 1 - lsOut.Count) % 5 == 0)
            {
                lsOut.Add(sTemp);
                sTemp = "";
            }

            if(sInput.Length == i + 1)
            {
                lsOut.Add(sTemp);
            }

        }

Upvotes: 0

Lucifer

Reputation: 1594

You can try the approach below. We split the string on the delimiter characters, then loop over the resulting elements and, for each one, take the substring that runs through the next element. For example, start at AB and take everything up to and including PQ.

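        string valentry = "AB-PQ-EF=CD-IJ=XY-JK";
        // Split on either delimiter to get the individual tokens.
        List<string> filt = Regex.Split(valentry, @"[-=]").ToList();

        var listEle = new List<string>();
        filt.ForEach(x =>
            {
                // Skip the last token; it has no following token to pair with.
                if (valentry.IndexOf(x) != valentry.Length - 2)
                {
                    // Assumes unique two-character tokens: token + delimiter + token = 5 characters.
                    string ele = valentry.Substring(valentry.IndexOf(x), 5);
                    if (!String.IsNullOrEmpty(ele))
                        listEle.Add(ele);
                }
            });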

Upvotes: 0

JonMac1374

Reputation: 466

For a solution using string.Split and LINQ, we just need to track the length of each part as we go so that the separator can be pulled from the original string, like so:

var input = "ABC-PQ-EF=CDED-IJ=XY-JKLM";

var split = input.Split('-', '=');

int offset = 0;

var result = split
    .Take(split.Length - 1)
    .Select((part, index) =>
    {
        offset += part.Length;
        return $"{part}{input[index + offset]}{split[index + 1]}";
    })
    .ToArray();
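
As a variation on the same idea (not part of the approach above, and swapping string.Split for Regex.Split), the delimiters can be kept in the split result so that no running offset is needed. This is a minimal sketch: it relies on the documented .NET behavior that text matched by a capturing group in a Regex.Split pattern is included in the returned array, and the name parts is just for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

var input = "ABC-PQ-EF=CDED-IJ=XY-JKLM";

// The capturing group makes Regex.Split keep each delimiter in the result:
// ["ABC", "-", "PQ", "-", "EF", "=", "CDED", ...]
var parts = Regex.Split(input, "([-=])");

// Tokens sit at even indexes and delimiters at odd indexes, so pair i is
// token 2i, the delimiter after it, and token 2i + 2.
var result = Enumerable.Range(0, parts.Length / 2)
    .Select(i => parts[2 * i] + parts[2 * i + 1] + parts[2 * i + 2])
    .ToArray();

Console.WriteLine(string.Join(", ", result));
// ABC-PQ, PQ-EF, EF=CDED, CDED-IJ, IJ=XY, XY-JKLM
```

The trade-off is a regex call in place of string.Split, in exchange for a Select lambda with no captured mutable state.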

Upvotes: 0

Tim Biegeleisen

Reputation: 520878

Here is a working script. If you had a single fixed delimiter, you would only need one call to Regex.Split. Your original string doesn't have that, but we can easily duplicate parts of the input so that the string becomes splittable.

string input = "ABC-PQ-EF=CD-IJ=XYZ-JK";
string s = Regex.Replace(input, @"((?<=[=-])[A-Z]+(?=[=-]))", "$1~$1");
Console.WriteLine(s);
// [A-Z]+ rather than [A-Z]{2}, so tokens of any length (such as XYZ) still split.
var items = Regex.Split(s, @"(?<=[A-Z]+[=-][A-Z]+)~");
foreach (var item in items)
{
    Console.WriteLine(item);
}

ABC-PQ~PQ-EF~EF=CD~CD-IJ~IJ=XYZ~XYZ-JK
ABC-PQ
PQ-EF
EF=CD
CD-IJ
IJ=XYZ
XYZ-JK

If you look closely at the very first line of the output above, you'll see the trick I used. I just connected the pairs you want via a different delimiter (ideally ~ does not appear anywhere else in your string). Then, we just have to split by that delimiter.

Upvotes: 0

Rawling

Reputation: 50104

Can you use a regex instead of split?

var input = "AB-PQ-EF=CD-IJ=XY-JK";
var pattern = new Regex(@"(?<![A-Z])(?=([A-Z]+[=-][A-Z]+))");
var output = pattern.Matches(input).Cast<Match>().Select(m => m.Groups[1].Value).ToArray();
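
For readers less familiar with lookarounds, here is the same pattern written out with RegexOptions.IgnorePatternWhitespace so that each part can carry a comment. The layout and comments are an illustrative reading of the pattern, and the name annotated is made up for this sketch.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

var input = "AB-PQ-EF=CD-IJ=XY-JK";

var annotated = new Regex(@"
    (?<![A-Z])              # match only where the previous character is not a letter
    (?=                     # look ahead without consuming any input
      ([A-Z]+[=-][A-Z]+)    # group 1 captures token, delimiter, token
    )",
    RegexOptions.IgnorePatternWhitespace);

// Each match is zero-length, so the second token of one pair is still
// available as the first token of the next pair.
var output = annotated.Matches(input)
    .Cast<Match>()
    .Select(m => m.Groups[1].Value)
    .ToArray();

Console.WriteLine(string.Join(", ", output));
// AB-PQ, PQ-EF, EF=CD, CD-IJ, IJ=XY, XY-JK
```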

Upvotes: 7
