User48591

Reputation: 301

How to split a string on the nth occurrence?

I want to split a string on the nth occurrence of a delimiter (in this case "\t"). The code I'm currently using splits on every occurrence of "\t".

string[] items = input.Split(new char[] {'\t'}, StringSplitOptions.RemoveEmptyEntries);

If input = "one\ttwo\tthree\tfour", my code returns the array { "one", "two", "three", "four" }.

But let's say I want to split it on every "\t" after the second "\t". So, it should return:

Upvotes: 12

Views: 17901

Answers (4)

Oded

Reputation: 499152

There is nothing built in.

You can use the existing Split, then use Take and Skip with string.Join to rebuild the parts that you originally had.

// Requires: using System.Linq;
string[] items = input.Split(new char[] {'\t'}, 
                             StringSplitOptions.RemoveEmptyEntries);
string firstPart = string.Join("\t", items.Take(nthOccurrence));
string secondPart = string.Join("\t", items.Skip(nthOccurrence));

string[] everythingSplitAfterNthOccurrence = items.Skip(nthOccurrence).ToArray();
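
The pieces above can be combined into one helper. This is only a minimal sketch: the class and method names are hypothetical, and it assumes the desired result is the first n fields kept together as a single element, followed by each remaining field on its own.

```csharp
using System;
using System.Linq;

public static class SplitAfterNthExample
{
    // Hypothetical helper (not a framework method): joins the first n fields
    // back into one element, then appends every remaining field separately.
    public static string[] SplitAfterNth(string input, char separator, int n)
    {
        if (n < 1)
            throw new ArgumentOutOfRangeException(nameof(n));

        string[] items = input.Split(new[] { separator },
                                     StringSplitOptions.RemoveEmptyEntries);

        // Rebuild the part before the nth separator.
        string head = string.Join(separator.ToString(), items.Take(n));

        // head first, then the fields that follow it.
        return new[] { head }.Concat(items.Skip(n)).ToArray();
    }
}
```

With input = "one\ttwo\tthree\tfour" and n = 2, this returns three elements: "one\ttwo", "three" and "four". Because RemoveEmptyEntries is used, consecutive tabs in the input are collapsed, just as in the snippet above.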

An alternative is to iterate over the characters in the string, find the index of the nth occurrence, and take the substrings before and after it (or find the index of the next occurrence after the nth and split there, and so on).
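
The index-based alternative could look roughly like the sketch below. The names IndexOfNth and SplitAtNth are hypothetical, n is assumed to be 1-based, and the whole input is returned as a single element when it contains fewer than n separators.

```csharp
using System;

public static class NthIndexExample
{
    // Hypothetical helper: index of the nth (1-based) occurrence of
    // separator in input, or -1 if there are fewer than n occurrences.
    public static int IndexOfNth(string input, char separator, int n)
    {
        if (n < 1)
            throw new ArgumentOutOfRangeException(nameof(n));

        int index = -1;
        for (int i = 0; i < n; i++)
        {
            // Resume the search just past the previous match.
            index = input.IndexOf(separator, index + 1);
            if (index < 0)
                return -1;
        }
        return index;
    }

    // Splits input into the text before and after the nth separator.
    public static string[] SplitAtNth(string input, char separator, int n)
    {
        int cut = IndexOfNth(input, separator, n);
        if (cut < 0)
            return new[] { input };

        return new[] { input.Substring(0, cut), input.Substring(cut + 1) };
    }
}
```

For input = "one\ttwo\tthree\tfour" and n = 2, SplitAtNth returns "one\ttwo" and "three\tfour"; the second part can then be passed to the ordinary Split if every later field is needed separately.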

Upvotes: 19

Matthew Watson

Reputation: 109762

[EDIT] After re-reading the edited OP, I realise this doesn't do what is now asked. This will split on every nth target; the OP wants to split on every target AFTER the nth one.

I'll leave this here for posterity anyway.


If you were using the MoreLinq extensions you could take advantage of its Batch method.

Your code would then look like this:

string text = "1\t2\t3\t4\t5\t6\t7\t8\t9\t10\t11\t12\t13\t14\t15\t16\t17";

var splits = text.Split('\t').Batch(5);

foreach (var split in splits)
    Console.WriteLine(string.Join("", split));

I'd probably just use Oded's implementation, but I thought I'd post this for an alternative approach.

The implementation of Batch() looks like this:

public static class EnumerableExt
{
    public static IEnumerable<IEnumerable<TSource>> Batch<TSource>(this IEnumerable<TSource> source, int size)
    {
        TSource[] bucket = null;
        var count = 0;

        foreach (var item in source)
        {
            if (bucket == null)
                bucket = new TSource[size];

            bucket[count++] = item;

            if (count != size)
                continue;

            yield return bucket;

            bucket = null;
            count = 0;
        }

        if (bucket != null && count > 0)
            yield return bucket.Take(count);
    }
}

Upvotes: 4

Ravindra Shekhawat

Reputation: 4353

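// Return a substring of str up to but not including
// the nth occurrence of substr
function getNth(str, substr, n) {
  var idx;
  var i = 0;
  var newstr = '';
  do {
    idx = str.indexOf(substr);
    // Fewer than n occurrences: return the whole original string
    if (idx === -1) return newstr + str;
    newstr += str.substring(0, idx);
    // Skip past the full separator, not just one character
    str = str.substring(idx + substr.length);
  } while (++i < n && (newstr += substr))
  return newstr;
}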

Upvotes: 0

MoonKnight

Reputation: 23831

It is likely that you will have to split and re-combine. Something like

int tabIndexToRemove = 3;
string str = "My\tstring\twith\tloads\tof\ttabs";
string[] strArr = str.Split('\t');
int numOfTabs = strArr.Length - 1;
if (tabIndexToRemove > numOfTabs)
    throw new IndexOutOfRangeException();
str = String.Empty;
for (int i = 0; i < strArr.Length; i++)
    // Omit the tab after the chosen field, and after the last field
    str += (i == tabIndexToRemove - 1 || i == strArr.Length - 1) ? 
        strArr[i] : String.Format("{0}\t", strArr[i]);

Result:

My string withloads of tabs

I hope this helps.

Upvotes: 1
