Reputation:
I'm currently studying a module called data structures and algorithms at a university. We've been tasked with writing an algorithm that finds the smallest positive integer which does not occur in a given sequence. I was able to find a solution, but is there a more efficient way?
x = [5, 6, 3, 1, 2]
def missing_integer():
for i in range(1, 100):
if i not in x:
return i
print(missing_integer())
The instructions include some examples:
given x = [1, 3, 6, 4, 1, 2], the function should return 5,
given x = [1, 2, 3], the function should return 4 and
given x = [−1, −3], the function should return 1.
Upvotes: 4
Views: 424
Reputation: 1
My answer using list comprehension:
def solution(A):
max_val = max(A);
min_val = min(A);
if max_val<0: val = 1;
elif max_val > 0:
li = [];
[li.append(X) for X in range(min_val,max_val) if X not in A];
if len(li)>0:
if min(li)<0: val = 1;
else: val = min(li);
if len(li)==0: val=max_val+1;
return val;
L = [-1, -3];
res = solution(L);
print(res);
Upvotes: -1
Reputation: 23955
I think the O(n)
algorithm goes like this: initialise an array record of length n + 2
(list in Python) to None
, and iterate over the input. If the element is one of the array indexes, set the element in the record to True
. Now iterate over the new list record starting from index 1. Return the first None
encountered.
Upvotes: 1
Reputation: 4847
The slow step in your algorithm is that line:
if i not in x:
That step takes linear time, which makes the entire algorithm O(N*N)
. If you first turn the list into a set, the lookup is much faster:
def missing_integer():
sx = set(x)
for i in range(1, 100):
if i not in sx:
return i
Lookup in a set is fast, in fact it takes constant time, and the algorithm now runs in linear time O(N).
Upvotes: 0
Reputation: 4772
Can be done in O(n) time with a bit of maths. initialise a minimum_value and maximum_value, and sum_value names then loop once through the numbers to find the minimum and maximum and the sum of all the numbers (mn, mx, sm)
.
Now the sum of integers 0..n = n*(n-1)/2 = s(n)
Therefore: missing_number = (s(mx) - s(mn)) - sm
All done with traversing the numbers only once!
Upvotes: -1
Reputation: 18838
Another solution is creating an array with a size of Max
value, and traverse the array and marking each location of the array when that value is seen. Then, iterate from the start of the array and report the first finding unlabeled location as the smallest missing value. This is done in O(n)
(Fill the array and finding the smallest unlabeled location).
Also, for negative values you can add all values the Min
value to find all values positive. Then, apply the above method.
The space complexity of this method is \Theta(n)
.
To know more, see this post about the implementation and scrutinize more about this method.
Upvotes: 0
Reputation: 22544
You did not ask for the most efficient way to solve the problem, just if there is a more efficient way than yours. The answer to that is yes.
If the missing integer is near the top of the range of the integers and the list is long, your algorithm as a run-time efficiency of O(N**2)
--your loop goes through all possible values, and the not in
operator searches through the entire list if a match is not found. (Your code searches only up to the value 100
--I assume that is just a mistake on your part and you want to handle sequences of any length.)
Here is a simple algorithm that is merely order O(N*log(N))
. (Note that quicker algorithms exist--I show this one since it is simple and thus answers your question easily.) Sort the sequence (which has the order I stated) then run through it starting at the smallest value. This linear search will easily find the missing positive integer. This algorithm also has the advantage that the sequence could involve negative numbers, non-integer numbers, and repeated numbers, and the code could easily handle those. This also handles sequences of any size and with numbers of any size, though of course it runs longer for longer sequences. If a good sort routine is used, the memory usage is quite small.
Upvotes: 2