Reputation: 29
Given a read only array of n + 1 integers between 1 and n, find one number that repeats in linear time using less than O(n) space and traversing the stream sequentially O(1) times.
Sample Input: [3 4 1 4 1]
Sample Output : 1/4(any one of these)
If there are multiple possible answers (like in the sample case above), output any one.
If there is no duplicate, output -1.
I tried doing a solution of this which is:
int Solution::repeatedNumber(const vector<int> &A) {
vector<bool> v(A.size(), true);
for (int i = 0; i < A.size(); i++) {
if (v[A[i]])
v[A[i]] = false;
else
return A[i];
}
}
This is getting accepted but how is this less than O(n) in memory?
Upvotes: 3
Views: 2687
Reputation: 973
solution suggested by @jayson Boubin in above answers is O(1)-space method and it is good(by the way itis awesome!! ), when changing the original array is allowed or mean changing doesn't matters. but if altering the original array is not allowed then the well-known solution is of O(sqrt(n))-space and O(n)-time, and this method basically suggests that we should first consider sqrt(n)-ranges whereas ith range will be [sqrt(n)*i to sqrt(n)*(i+1)] , and after that we traverse the array and count the no. of elements lies in for each range and so onn...
have a look on it: leetcode: find the duplicate number
Upvotes: 0
Reputation: 1
std::vector<bool>
isn't like any other vector.
std::vector<bool>
is a possibly space-efficient specialization of std::vector
for the type bool
.
That's why it may use up less memory because it might represent multiple boolean values with one byte, like a bitset.
Upvotes: 0
Reputation: 18807
I have a solution which requires O(sqrt(N)) space and O(N) time, and traverses the list twice -- assuming it is possible to calculate the integer square root in O(1) time (for arbitrary large N, this is likely at least an O(log(N)) operation).
A1
of size ceil(sqrt(N)), filled with 0.x
k=floor(sqrt(x))
A1[k]
A1[k]>2k+1
, there must be at least one duplicate between k²
and (k+1)²-1
. (For k=floor(sqrt(N))
the threshold is N-k²).
Remember
k` and break first iterationA2
of size 2k+1
filled with false
.x
again:
A2[x-k²]
is set, if yes, x
is a duplicateA2[x-k²]
The solution should also work for larger and smaller arrays (does not need to be exactly N+1), and if there are no duplicates, the first iteration will run to the end. Both temporary arrays are O(k) (if you are pedantic, the first one is O(k*log(k)), since it must store integers up to size sqrt(N)).
Upvotes: 1
Reputation: 1504
You are correct in wondering why this would be accepted. This answer is obvious O(n) space complexity. You allocating some amount of data that grows directly proportionally with n, making it O(n) space. Whatever is judging your program is incorrectly accepting it. It may be possible that the judge is accepting your score because you are using less bytes than are allocated by A, but that is only speculation.
EDIT: The code bellow isn't actually a solution to the problem. It is a solution to a simpler problem along the lines of the above. The solution below ignores the constraint that the stream must be read only. After doing some research, it appears that this problem is a very difficult version of a series of similar problems of the type "Given a range of numbers between 1 and n, find the repeating/missing number". If there were only one number repeated, and there was only a O(n) time requirement, you could use a bool vector as above. If there were only one number repeated, but you were constrained to constant space, you could implement this solution where we use gauss's formula to find the sum of integers from 1 to n, and subtract that from the sum of the array. If the array had two missing numbers, and you were constrained to constant time, you could implement this solution where we use the sum and product of the array to create a system of equations which can be solved in O(n) time with O(1) space.
To solve the question posed above, it looks like one would have to implement something to the order of this monstrosity.
Here is a solution this problem within its constraints:
You could do something like this:
#include<vector>
#include<iostream>
int repeating(std::vector<int>& arr)
{
for (int i = 0; i < arr.size(); i++)
{
if (arr[abs(arr[i])] >= 0)
arr[abs(arr[i])] = -arr[abs(arr[i])];
else {
return abs(arr[i]);
}
}
}
int main()
{
std::vector<int> v{1,2,3,4,5,1};
std::cout<<repeating(v)<<std::endl;
std::cout<<sizeof(v)*sizeof(v[0])<<std::endl;
return 0;
}
The above program uses the input array itself to track duplicates. For each index i, the array evaluates arr[i]. The array sets arr(arr[i]) negative. Negating a value is an easily reversible operation (simply take the absolute value of the element), so it can be used to mark an index of the array without ruining the integrity of the data. If you ever encounter an index such that arr[abs(arr[i])] is negative, you know that you have seen abs(arr[i])) before in the array. This uses O(1) space complexity, traverses the array once, and can be modified to return any or all duplicate numbers.
Upvotes: 4
Reputation: 43970
std::vector<bool>
is a bitset, so it will use n bits. In Big-O notation, O(n/8)=O(n), that means the space is not less than O(n).
I assume they do not look at the actual program, but only measure its space consumption in some example runs. So, using a bit vector tricks it into believing that it is better than O(n).
But I agree with you. It should not be accepted.
Upvotes: 3
Reputation: 55
Well it's constant (O(1)) in memory because you're simply doing a comparison in place, and not creating a new data structure to house anything or for any comparison.
You could also use a Hash Table like unordered_set, but that'd be using O(N) memory - but remain O(N) time complexity.
I'm not entirely sure if this was an "accepted" solution by the way (what you posted, because that is creating a vector of size (sizeofA) - but just offering a solution based on your needs.
Upvotes: -1