Ori Benami
Ori Benami

Reputation: 11

Why solving a range minimum query with segment tree time complexity is O(Log n)?

i was trying to solve how to find in a given array and two indexes the minimum value between these two indexes in O(Log(n)). i saw the solution of using a segment-tree but couldn't understand why the time complexity for this solution is O(Logn) because it doesnt seems like this because if your range is not exactly within the nod's range you need to start spliting the search.

Upvotes: 1

Views: 463

Answers (1)

Erfan Alimohammadi
Erfan Alimohammadi

Reputation: 480

First proof:

The claim is that there are at most 2 nodes which are expanded at each level. We will prove this by contradiction.

Consider the segment tree given below.

enter image description here

Let's say that there are 3 nodes that are expanded in this tree. This means that the range is from the left most colored node to the right most colored node. But notice that if the range extends to the right most node, then the full range of the middle node is covered. Thus, this node will immediately return the value and won't be expanded. Thus, we prove that at each level, we expand at most 2 nodes and since there are logn levels, the nodes that are expanded are 2⋅logn=Θ(logn).

Source

Second proof:

There are four cases when query the interval (x,y)

FIND(R,x,y) //R is the node
% Case 1
    if R.first = x and R.last = y   
        return {R}
% Case 2
    if y <= R.middle
        return FIND(R.leftChild, x, y) 
% Case 3
    if x >= R.middle + 1 
        return FIND(R.rightChild, x, y) 
% Case 4
    P = FIND(R.leftChild, x, R.middle)
    Q = FIND(R.rightChild, R.middle + 1, y)    
    return P union Q.

Intuitively, first three cases reduce the level of tree height by 1, since the tree has height log n, if only first three cases happen, the running time is O(log n).

For the last case, FIND() divide the problem into two subproblems. However, we assert that this can only happen at most once. After we called FIND(R.leftChild, x, R.middle), we are querying R.leftChild for the interval [x, R.middle]. R.middle is the same as R.leftChild.last. If x > R.leftChild.middle, then it is Case 1; if x <= R.leftChild, then we will call

FIND ( R.leftChild.leftChild, x, R.leftChild.middle );
FIND ( R.leftChild.rightChild, R.leftChild.middle + 1, , R.leftChild.last );

However, the second FIND() returns R.leftChild.rightChild.sum and therefore takes constant time, and the problem will not be separate into two subproblems (strictly speaking, the problem is separated, though one subproblem takes O(1) time to solve).

Since the same analysis holds on the rightChild of R, we conclude that after case4 happens the first time, the running time T(h) (h is the remaining level of the tree) would be

T(h) <= T(h-1) + c (c is a constant)
T(1) = c

which yields:

T(h) <= c * h = O(h) = O(log n) (since h is the height of the tree)

Hence we end the proof.

Source

Upvotes: 2

Related Questions