Reputation: 9556
I have an array of random numbers lets say:
[10,11,12,14,15,16,17,18,19,20, 11, 12, 14,25,25,26,27,28,29]
I have to detect repeated sequences (witch are in fact errors)
with length bigger then a specific number (2).
Is there any good algorithm for this ?
what I have for now:
int minLenght = 3;
int[] data = {1,2,3};
for(int i = 0; i < data.length; i++){
for(int j = 0; j < data.length; j++){
if ( data[i] == data[j]){
int l = 0;
int ii = i;
int jj = j;
while(data[ii] == data[jj]){
ii++;
jj++;
l++;
}
if(l >= minLenght){
print('['+i+'-'+ii+'] same as ['+j+'-'+jj+']');
}
}
}
}
Upvotes: 4
Views: 3495
Reputation: 37904
I don't know if there is a special algorithm for this, but my proposal would be this:
loop1 over array[i]:
loop2 over array[j] starting with i+1:
dist=array[j]-array[i];
if dist==specific_number:
array_result.append(array[i] +""+""+array[j])
This would be my simple logic.
Upvotes: 0
Reputation: 4772
You could use regexps, except the list format shown is irregular. I use Python below and 'regularise' the list format turning it into a string before applying a regular expression looking for duplicated sequences of numbers/non-numbers:
>>> import re
>>> numbers = [10,11,12,14,15,16,17,18,19,20, 11, 12, 14,25,25,26,27,28,29]
>>> sep = ', '
>>> txt = sep + sep.join(str(x) for x in numbers) + sep
>>> txt
', 10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 11, 12, 14, 25, 25, 26, 27, 28, 29, '
>>> re.search(r'\D((?:\d+\D+){2,}).*\1', txt).groups()
('11, 12, 14, ',)
I normally try and minimize my use of regexps but this does detect the duplication.
Upvotes: 0
Reputation: 33509
One way is to store sequences of length L (one bigger than your specific length) in a hash table.
If you ever find a sequence is already in the hash table, then you have found a repeat of length >= L.
e.g. Python code
A=[10,11,12,14,15,16,17,18,19,20,11,12,14,25,25,26,27,28,29]
S=set()
L=2+1
for i in xrange(len(A)-L+1):
key=tuple(A[i:i+L])
if key in S:
print i
else:
S.add(key)
This prints out the locations of repeated sequences with length greater than 2.
Upvotes: 4