I can't understand how the time complexity of following c++ code is calculated?

Question

Following code prints union of two unsorted arrays using C++ STL set.
I know that the time complexity of inserting an element in a set is O(log N), where N is the size of the set.
This code is from https://www.geeksforgeeks.org/find-union-and-intersection-of-two-unsorted-arrays/.

// C++ program for the union of two arrays using Set
#include 
using namespace std;
void getUnion(int a[], int n, int b[], int m)
{
 
    // Defining set container s
    set s;

    // Inserting array elements in s
    for (int i = 0; i < n; i++)
      s.insert(a[i]);

    for (int i = 0; i < m; i++)
      s.insert(b[i]);
    cout << "Number of elements after union operation: " << s.size() << endl;
    cout << "The union set of both arrays is :" << endl;
    for (auto itr = s.begin(); itr != s.end(); itr++)
        cout << *itr
             << " "; // s will contain only distinct
                 // elements from array a and b
}

// Driver Code
int main()
{
    int a[9] = { 1, 2, 5, 6, 2, 3, 5, 7, 3 };
    int b[10] = { 2, 4, 5, 6, 8, 9, 4, 6, 5, 4 };

    getUnion(a, 9, b, 10);
}

Time Complexity: O(m * log(m) + n * log(n)).
Please explain how the above time complexity is calculated.

Dhruv Saraswat · Accepted Answer

In C++, a set is internally implemented using a self-balancing Binary Search Tree (BST). This means that everytime a new element is inserted into a set, internally it has to check the correct place of insertion of this new element in the BST, and then rebalance that BST. This operation of inserting a new element and performing rebalance of the BST takes approximately O(log(n)) time where n is the number of elements in the BST.

The below code snippet runs n times, and each insert operation has O(log(n)) time complexity in the worst case, hence the time complexity of the below piece of code is O(n * log(n)).

    for (int i = 0; i < n; i++)
      s.insert(a[i]);

Now, after the above code snippet is executed, the set will have n elements in the worst case (if all the n elements in int a[] are unique).

The next code snippet provided below runs m times, and each insert operation has O(log(n + m)) time comlexity in the worst case (because there might already be n elements in the set before this below loop starts), hence the time complexity of the below piece of code is O(m * log(m + n)).

    for (int i = 0; i < m; i++)
      s.insert(b[i]);

This last for loop below runs for all the n + m elements in the worst case (if all the n elements in array a[] and all the m elements in array b[] are unique). Hence, the time complexity of the below code is O(n + m), because the below code visits all the n + m elements in the internal BST.

    for (auto itr = s.begin(); itr != s.end(); itr++)
        cout << *itr
             << " "; // s will contain only distinct
                 // elements from array a and b

All the above 3 code snippets run serially (one after the other), hence the total time complexity has to be added to find the final time complexity.

Time Complexity = O(nlog(n) + mlog(m + n) + (m + n))

Out of all the above 3 terms, nlog(n) and mlog(m + n) terms are larger than (m + n), hence we can omit (m + n) and write the time complexity as O(nlog(n) + mlog(m + n)).

Relevant Links -

I can't understand how the time complexity of following c++ code is calculated?

Answers (2)

Related Questions

I can&#39;t understand how the time complexity of following c++ code is calculated?

Answers (2)

Related Questions

I can't understand how the time complexity of following c++ code is calculated?