Reputation: 355
Lets say we are given an array A[]
of length N
and we have to answer Q
queries which consists of two integers L,R
. We have to find the number from A[L]
to A[R]
which has its frequency at least (R-L+1)/2
. If such number doesn't exist then we have to print "No such number"
I could think of only O(Q*(R-L))
approach of running a frequency counter and first obtaining the most frequent number in the array from L to R
. Then count its frequency.
But more optimization is needed.
Constraints: 1<= N <= 3*10^5
, ,1<=Q<=10^5
,1<=L<=R<=N
Upvotes: 3
Views: 2840
Reputation: 18576
I know an O((N + Q) * sqrt(N))
solution:
Let's call a number heavy if at occurs at least B
times in the array. There are at most N / B
heavy numbers in the array.
If the query segment is "short" (R - L + 1 < 2 * B
), we can answer it in O(B)
time (by simply iterating over all elements of the range).
If the query segment is "long" (R - L + 1 >= 2 * B
), a frequent element must be heavy. We can iterate over all heavy numbers and check if at least one then fits (to do that, we can precompute prefix sums of number of occurrences for each heavy element and find the number of its occurrences in a [L, R]
segment in constant time).
If we set B = C * sqrt(N)
for some constant C
, this solution runs in O((N + Q) * sqrt(N))
time and uses O(N * sqrt(N))
memory. With properly chosen C
, and may fit into time and memory limit.
There is also a randomized solution which runs in O(N + Q * log N * k)
time.
Let's store a vector of position of occurrences for each unique element in the array. Now we can find the number of occurrences of a fixed element in a fixed range in O(log N)
time (two binary searches over the vector of occurrences).
For each query, we'll do the following:
O(log N)
time as described above1 / 2
for each trial. If we do it k
times, the probability not to find it is (1 / 2) ^ k
With a proper choice of k
(so that O(k * log N)
per query is fast enough and (1 / 2) ^ k
is reasonably small), this solution should pass.
Both solutions are easy to code (the first just needs prefix sums, the second only uses a vector of occurrences and binary search). If I had to code one them, I'd pick the latter (the former can be more painful to squeeze in time and memory limit).
Upvotes: 4