Jeff
Jeff

Reputation: 13199

Find the first occurrence/starting index of the sub-array in C#

Given two arrays as parameters (x and y) and find the starting index where the first occurrence of y in x. I am wondering what the simplest or the fastest implementation would be.

Example:

when x = {1,2,4,2,3,4,5,6}
     y =       {2,3}
result
     starting index should be 3

Update: Since my code is wrong I removed it from the question.

Upvotes: 7

Views: 11517

Answers (10)

Vadim Nazarenko
Vadim Nazarenko

Reputation: 29

public static bool Includes<T>(this IReadOnlyList<T> haystack, IReadOnlyList<T> needle, int start = 0)
{
    var h = start;
    var n = 0;
    var res = false;
    while (h < haystack.Count && n < needle.Count)
    {
        res = haystack[h].Equals(needle[n]);
        if (res)
        {
            h++;
            n++;
        }
        else
        {
            h = ++start;
            n = 0;
        }
    }
    return res && n == needle.Count;
}

Upvotes: 0

public class Test
{
    static void Main()
    {
        int[] x = { 1, 2, 4, 2, 3, 4, 5, 6 };
        int[] y = { 2, 3 };

        int index = new ReadOnlySpan<int>(x).IndexOf(y);

        if (index < 0)
            Console.WriteLine("Subarray not found");
        else
            Console.WriteLine("Subarray found at index " + index);
    }
}

Upvotes: 0

abelenky
abelenky

Reputation: 64682

using System;
using System.Linq;

public class Test
{
    public static void Main()
    {
        int[] x = {1,2,4,2,3,4,5,6};
        int[] y =       {2,3};
        int? index = null;

        for(int i=0; i<x.Length; ++i)
        {
            if (y.SequenceEqual(x.Skip(i).Take(y.Length)))
            {
                index = i;
                break;
            }
        }
        Console.WriteLine($"{index}");
    }
}

Output

3

Upvotes: 0

ibrahim_ragab
ibrahim_ragab

Reputation: 11

    //this is the best in C#

    //bool contains(array,subarray)
    //  when find (subarray[0])
    //      while subarray[next] IS OK
    //          subarray.end then Return True
    public static bool ContainSubArray<T>(T[] findIn, out int found_index,
 params T[]toFind)
    {
        found_index = -1;
        if (toFind.Length < findIn.Length)
        {

            int index = 0;
            Func<int, bool> NextOk = (i) =>
                {
                    if(index < findIn.Length-1)
                        return findIn[++index].Equals(toFind[i]);
                    return false;
                };
            //----------
            int n=0;
            for (; index < findIn.Length; index++)
            {
                if (findIn[index].Equals(toFind[0]))
                {
                    found_index=index;n=1;
                    while (n < toFind.Length && NextOk(n))
                        n++;
                }
                if (n == toFind.Length)
                {
                    return true;
                }
            }

        }
        return false;
    }

Upvotes: 0

Scott Chamberlain
Scott Chamberlain

Reputation: 127563

This is based off of Mark Gravell's answer but I made it generic and added some simple bounds checking to keep exceptions from being thrown

private static bool IsSubArrayEqual<T>(T[] source, T[] compare, int start) where T:IEquatable<T>
{
    if (compare.Length > source.Length - start)
    {
        //If the compare string is shorter than the test area it is not a match.
        return false;
    }

    for (int i = 0; i < compare.Length; i++)
    {
        if (source[start++].Equals(compare[i]) == false) return false;
    }
    return true;
}

Could be improved further by implementing Boyer-Moore but for short patterns it works fine.

Upvotes: 4

Nikolas Stephan
Nikolas Stephan

Reputation: 1290

I find something along the following lines more intuitive, but that may be a matter of taste.

public static class ArrayExtensions
{
    public static int StartingIndex(this int[] x, int[] y)
    {
        var xIndex = 0;
        while(xIndex < x.length)
        {
            var found = xIndex;
            var yIndex = 0;
            while(yIndex < y.length && xIndex < x.length && x[xIndex] == y[yIndex])
            {
                xIndex++;
                yIndex++;
            }

            if(yIndex == y.length-1)
            {
                return found;
            }

            xIndex = found + 1;
        }

        return -1;
    }
}

This code also addresses an issue I believe your implementation may have in cases like x = {3, 3, 7}, y = {3, 7}. I think what would happen with your code is that it matches the first number, then resets itself on the second, but starts matching again on the third, rather than stepping back to the index just after where it started matching. May be missing something, but it's definitely something to consider and should be easily fixable in your code.

Upvotes: 1

Guffa
Guffa

Reputation: 700342

Here is a simple (yet fairly efficient) implementation that finds all occurances of the array, not just the first one:

static class ArrayExtensions {

  public static IEnumerable<int> StartingIndex(this int[] x, int[] y) {
    IEnumerable<int> index = Enumerable.Range(0, x.Length - y.Length + 1);
    for (int i = 0; i < y.Length; i++) {
      index = index.Where(n => x[n + i] == y[i]).ToArray();
    }
    return index;
  }

}

Example:

int[] x = { 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4 };
int[] y = { 2, 3 };
foreach (int i in x.StartingIndex(y)) {
  Console.WriteLine(i);
}

Output:

1
5
9

The method first loops through the x array to find all occurances of the first item in the y array, and place the index of those in the index array. Then it goes on to reduce the matches by checking which of those also match the second item in the y array. When all items in the y array is checked, the index array contains only the full matches.

Edit:
An alternative implementation would be to remove the ToArray call from the statement in the loop, making it just:

index = index.Where(n => x[n + i] == y[i]);

This would totally change how the method works. Instead of looping through the items level by level, it would return an enumerator with nested expressions, deferring the search to the time when the enumerator was iterated. That means that you could get only the first match if you wanted:

int index = x.StartingIndex(y).First();

This would not find all matches and then return the first, it would just search until the first was found and then return it.

Upvotes: 7

Eric Lippert
Eric Lippert

Reputation: 660088

"Simplest" and "fastest" are opposites in this case, and besides, in order to describe fast algorithms we need to know lots of things about how the source array and the search array are related to each other.

This is essentially the same problem as finding a substring inside a string. Suppose you are looking for "fox" in "the quick brown fox jumps over the lazy dog". The naive string matching algorithm is extremely good in this case. If you are searching for "bananananananananananananananana" inside a million-character string that is of the form "banananananabanananabananabananabanananananbananana..." then the naive substring matching algorithm is terrible -- far faster results can be obtained by using more complex and sophisticated string matching algorithms. Basically, the naive algorithm is O(nm) where n and m are the lengths of the source and search strings. There are O(n+m) algorithms but they are far more complex.

Can you tell us more about the data you're searching? How big is it, how redundant is it, how long are the search arrays, and what is the likelihood of a bad match?

Upvotes: 2

Marc Gravell
Marc Gravell

Reputation: 1062810

Simplest to write?

    return (from i in Enumerable.Range(0, 1 + x.Length - y.Length)
            where x.Skip(i).Take(y.Length).SequenceEqual(y)
            select (int?)i).FirstOrDefault().GetValueOrDefault(-1);

Not quite as efficient, of course... a bit more like it:

private static bool IsSubArrayEqual(int[] x, int[] y, int start) {
    for (int i = 0; i < y.Length; i++) {
        if (x[start++] != y[i]) return false;
    }
    return true;
}
public static int StartingIndex(this int[] x, int[] y) {
    int max = 1 + x.Length - y.Length;
    for(int i = 0 ; i < max ; i++) {
        if(IsSubArrayEqual(x,y,i)) return i;
    }
    return -1;
}

Upvotes: 8

Mark Byers
Mark Byers

Reputation: 838226

The simplest way is probably this:

public static class ArrayExtensions
{
    private static bool isMatch(int[] x, int[] y, int index)
    {
        for (int j = 0; j < y.Length; ++j)
            if (x[j + index] != y[j]) return false;
        return true;
    }

    public static int IndexOf(this int[] x, int[] y)
    {
        for (int i = 0; i < x.Length - y.Length + 1; ++i)
            if (isMatch(x, y, i)) return i;
        return -1;
    }
}

But it's definitely not the fastest way.

Upvotes: 4

Related Questions