Reputation: 24737
Say I have a large byte[]
and I'm not only looking to see if, but also where, a smaller byte[]
is in the larger array. For example:
byte[] large = new byte[100];
for (byte i = 0; i < 100; i++) {
large[i] = i;
}
byte[] small = new byte[] { 23, 24, 25 };
int loc = large.IndexOf(small); // this is what I want to write
I guess I'm asking about looking for a sequence of any type (primitive or otherwise) within a larger sequence.
I faintly remember reading about a specific approach to this in strings, but I don't remember the name of the algorithm. I could easily write some way to do this, but I know there's a good solution and it's on the tip of my tongue. If there's some .Net method that does this, I'll take that too (although I'd still appreciate the name of the searching algorithm for education's sake).
Upvotes: 3
Views: 708
Reputation: 726709
You can do it with LINQ, like this:
var res = Enumerable.Range(0, large.Length-1)
.Cast<int?>()
.FirstOrDefault(n => large.Skip(n.Value).Take(small.Length).SequenceEqual(small));
if (res != null) {
Console.Println("Found at {0}", res.Value);
} else {
Console.Println("Not found");
}
The approach is self-explanatory except for the Cast<int?>
part: you need it to decide between finding the result at the initial location of the large
array, when zero is returned, and not finding the result at all, when the return is null
.
Here is a demo on ideone.
The complexity of the above is O(M*N)
, where M
and N
are lengths of the large
and small
arrays. If the large
array is very long, and contains a significant number of "almost right" sub-sequences that match long prefixes of small
, you may be better off implementing an advanced algorithm for searching sequences, such as the Knuth–Morris–Pratt (KMP) algorithm. The KMP algorithm speeds up the search by making an observation that when a mismatch occurs, the small
sequence contains enough information on how far ahead you can move in the large
sequence based on where in the small sequence is the first mismatch. A look-up table is prepared for the small
sequence, and then that table is used throughout the search to decide how to advance the search point. The complexity of KMP is O(N+M)
. See the Wikipedia article linked above for pseudocode of the KMP algorithm.
Upvotes: 4
Reputation: 1289
Are you thinking of Lambda expressions? That is what came to my mind when you said a more specific approach with strings.
http://www.dotnetperls.com/array-find
Upvotes: 0