Reputation: 21
Does anybody know an efficient algorithm for traveling through a sequence of digits by looking for a certain combination, e.g.:
There is this given sequence and I want to find the index of a certain combination of 21??73 in e.g.
... 124321947362862188734738 ...
So I have a pattern 21??94 and need to find out where is the index of:
219473
218873
I assume that there is way to not touch every single digit.
EDIT: "Lasse V. Karlsen" has brought up an important point that I did forget. There is no overlapping allowed, e.g.
21217373215573
212173 is ok, then the next would be 215573
Upvotes: 2
Views: 87
Reputation: 178421
Seems like you are looking for the regular expression 21..73
- .
stands for "any character"1
Next you just need iterate all matches of this regex.
Most high level languages already have a regex library built in that is simple and easy to use for such tasks.
Note that many regex libraries already take care of "no overlapping" for you, including java:
String s = "21217373215573";
Matcher m = Pattern.compile("21..73").matcher(s);
while (m.find()) System.out.println(m.group());
Will yield the required output of:
212173
215573
(1) This assumes your sequence is of digits in the first place, as your question implies.
Upvotes: 2
Reputation: 133
Since you dont know where these combinations start and you are not looking just for the first one, there is no way to not touch each digit (maybe just last n-1 digits, where n is length of combination, because if there is less numbers, there is not enough space).
I just dont know better way then just read whole sequence, because you can have
... 84452121737338494684 ...
and then you have two combinations overlapping. If you are not looking for overlapping combinations, it's just easier version, but it is possibility in your example.
Some non-overlap algorithm pseudo-code:
start := -1; i := 0
for each digit in sequence
if sequence[digit] = combination[i]
if start = -1
start := digit
endif
i++
if i >= length(combination)
possibleCombinations.add(start)
start := -1
i := 0
endif
else
start := -1
endif
end
This should be O(n). Same complexity as looking for one value in unsorted array. If you are looking for overlapping combinations like in my example, then complexity is a bit higher and you have to check each possible start, which add one loop inside checking each found start value. Something that check if combination continue, then leave start value or discarding it when combination is broken. Then complexity will be something like O(n*length(combination)), because there cannot be more starts, then what is length of combination.
Upvotes: 0
Reputation: 52185
Depending on what language you are using, you could use regular expressions of the sort 21\d{2}73
which will look for 21
, followed by two digits which are in turn followed by 73
. Languages such as C# allow you to get the index of the match, as shown here.
Alternatively, you could construct your own Final State Machine which could be something of the sort:
string input = ...
int index = 0
while(index < input.length - 5)
if(input[index] == 2) && (input[index + 1] == 1) && (input[index + 4] == 7) && (input[index + 5] == 3)
print(index);
index += 6;
else index++
Upvotes: 1