Reputation: 469

3-sum alternative approach

I tried an alternative approach to the 3sum problem: given an array find all triplets that sum up to a given number.

Basically the approach is this: Sort the array. Once a pair of elements (say A[i] and A[j]) is selected, a binary search is done for the third element (using the equal_range function). The index one past the last matching element is saved in a variable 'c'. Since A[j+1] >= A[j], the next search needs to cover only indices up to, but excluding, c (numbers at index c and beyond would definitely give a sum greater than the target). For the case j=i+1, we save the end index as 'd' instead and set c=d. For the next value of i, when j=i+1, we need to search only up to, but excluding, index d.

C++ implementation:

#include <algorithm>
#include <iostream>
#include <utility>
#include <vector>
using namespace std;

int sum3(vector<int>& A,int sum)
{
    int count=0, n=A.size();
    sort(A.begin(),A.end());
    int c=n, d=n;  //initialize c and d to array length
    pair < vector<int>::iterator, vector<int>::iterator > p;
    for (int i=0; i<n-2; i++)
    {
        for (int j=i+1; j<n-1; j++)
        {
            // Clamp the upper bound so the searched range is never reversed
            // (equal_range requires first <= last).
            if(j == i+1)
            {
                p=equal_range (A.begin()+j+1, A.begin()+max(j+1,d), sum-A[i]-A[j]);
                d = p.second - A.begin();
                c=d;
            }
            else
            {
                p=equal_range (A.begin()+j+1, A.begin()+max(j+1,c), sum-A[i]-A[j]);
                c = p.second - A.begin();
            }
            count += p.second-p.first;
            for (auto it=p.first; it != p.second; ++it) 
                cout<<A[i]<<' '<<A[j]<<' '<<*it<<'\n';
        }
    }
    return count;
}

int main()      //driver function for testing
{
    vector <int> A = {4,3,2,6,4,3,2,6,4,5,7,3,4,6,2,3,4,5};
    int sum = 17;
    cout << sum3(A,sum) << endl;
    return 0;
}

I am unable to work out an upper bound on the running time of this algorithm. I understand that the worst case is when the target sum is unachievably large, because then the search range never shrinks.

My calculations yield something like:

For i=0, the cost of the binary searches is lg(n-2) + lg(n-3) + ... + lg(1)

For i=1, lg(n-3) + lg(n-4) + ... + lg(1)

...

For i=n-3, lg(1)

So in total, lg((n-2)!) + lg((n-3)!) + ... + lg(1!) = lg(1^(n-2) * 2^(n-3) * 3^(n-4) * ... * (n-3)^2 * (n-2)^1)

But how do I deduce a big-O bound in terms of n from this expression?

Upvotes: 1

Answers (3)

satvik choudhary

Reputation: 124

In addition to James' good answer, I would like to point out that this can actually go up to O(n^3) in the worst case, because you are running three nested for loops (the innermost one prints the matches). Consider the case

{1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1}

and the demanded sum is 3.
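To get a feel for the size of that worst case, here is a brute-force sketch (not the code from the question; `count_triplets_brute_force` and `choose3` are illustrative names) that counts the index triplets hitting the target. For n equal elements and a target of three times that element, every one of the n(n-1)(n-2)/6 triplets matches, so any approach that prints each match does Theta(n^3) work.

```cpp
#include <cstddef>
#include <vector>

// Counts index triplets i < j < k with A[i] + A[j] + A[k] == target.
// Deliberately naive: three nested loops, Theta(n^3) time.
long long count_triplets_brute_force(const std::vector<int>& A, int target) {
    long long count = 0;
    const std::size_t n = A.size();
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = i + 1; j < n; ++j)
            for (std::size_t k = j + 1; k < n; ++k)
                if (A[i] + A[j] + A[k] == target) ++count;
    return count;
}

// Number of ways to pick 3 items out of n: n(n-1)(n-2)/6.
long long choose3(long long n) { return n * (n - 1) * (n - 2) / 6; }
```

For the 17 ones above with a demanded sum of 3, both `count_triplets_brute_force` and `choose3(17)` give 680.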

Upvotes: 2

Gassa

Reputation: 8846

1. For your analysis, note that log(1) + log(2) + ... + log(k) = Theta(k log(k)). Indeed, the upper half of this sum is log(k/2) + log(k/2+1) + ... + log(k), so it is at least log(k/2)*k/2, which is asymptotically the same as log(k)*k already. Similarly, we can conclude that

log(n-1) + log(n-2) + log(n-3) + ... + log(1) +  // Theta((n-1) log(n-1))
           log(n-2) + log(n-3) + ... + log(1) +  // Theta((n-2) log(n-2))
                      log(n-3) + ... + log(1) +  // Theta((n-3) log(n-3))
                                 ... +
                                       log(1) = Theta(n^2 log(n))

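Indeed, consider the logarithms which are at least log(n/2): they form the half-triangle (thus ~1/2) of the upper left quadrant (thus ~n^2/4) of the above sum, so there are Theta(n^2/8) such terms. The sum is therefore at least Theta(n^2/8) * log(n/2), which is Omega(n^2 log(n)). The matching upper bound holds because there are fewer than n^2 terms, each at most log(n).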

2. As noted by satvik in another answer, your output loop can take up to Theta(n^3) steps when the number of outputs itself is Theta(n^3), which happens, for example, when all elements are equal and the target sum is three times that value.

3. There are O(n^2) solutions to the 3-sum problem, which are therefore asymptotically faster than this one.
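As an illustration of point 3, here is a minimal sketch of the usual sort-plus-two-pointers idea, written as a counting variant. The name `count_3sum_quadratic` and the choice to count index triplets i < j < k (rather than print them) are assumptions made for this example, not something taken from the question.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Counts index triplets i < j < k with A[i] + A[j] + A[k] == target.
// Sorting costs O(n log n); each i then needs one linear two-pointer sweep,
// so the total is O(n^2).
long long count_3sum_quadratic(std::vector<int> A, long long target) {
    std::sort(A.begin(), A.end());
    const std::size_t n = A.size();
    long long count = 0;
    for (std::size_t i = 0; i + 2 < n; ++i) {
        std::size_t lo = i + 1, hi = n - 1;
        while (lo < hi) {
            const long long s = static_cast<long long>(A[i]) + A[lo] + A[hi];
            if (s < target) {
                ++lo;                       // sum too small: raise the low end
            } else if (s > target) {
                --hi;                       // sum too large: lower the high end
            } else if (A[lo] == A[hi]) {
                // Everything in [lo, hi] is equal, so any pair from it works.
                const long long m = static_cast<long long>(hi - lo + 1);
                count += m * (m - 1) / 2;
                break;
            } else {
                // Count the runs of equal values at both ends and pair them up.
                std::size_t l = 1, r = 1;
                while (A[lo + l] == A[lo]) ++l;
                while (A[hi - r] == A[hi]) ++r;
                count += static_cast<long long>(l * r);
                lo += l;
                hi -= r;
            }
        }
    }
    return count;
}
```

Note that the quadratic bound applies to counting. Printing every matching triplet is still bounded below by the number of matches, which can be Theta(n^3) as in point 2.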

Upvotes: 0

James Poag

Reputation: 2380

When computing complexity, I'll start by referring to the Big-O Cheat sheet. I use this sheet to classify smaller sections of the code to get their runtime performance.

For example, a simple loop is O(n), binary search (according to the cheat sheet) is O(log(n)), and so on.

Next, I use the Properties of Big-O notation to composite the smaller pieces together.

So, for instance, if I had two loops independent of each other, it would be O(n) + O(n) or O(2n) => O(n). If one of my loops were inside the other, I would multiply them, so a loop of n iterations nested inside another loop of n iterations turns into O(n^2).

Now, I know you're saying: "hey, wait, I'm changing the upper and lower bounds of the inner loop" but I don't think that really matters...here's a university level example.
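As a quick sanity check of that claim, here is a small sketch (the function name `triangular_loop_iterations` is made up for illustration) that counts how often the body of a double loop runs when the inner loop starts at i+1, as in the question. The count is n(n-1)/2, which is still Theta(n^2).

```cpp
// Counts how many times the body of a "triangular" double loop executes.
// The inner loop starts at i + 1, so the total is n(n-1)/2 iterations.
long long triangular_loop_iterations(long long n) {
    long long iterations = 0;
    for (long long i = 0; i < n; ++i)
        for (long long j = i + 1; j < n; ++j)
            ++iterations;
    return iterations;
}
```

For n = 1000 this gives 499500, about half of n^2, and a constant factor of 1/2 disappears in Big-O notation.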

So my back-of-the-napkin calculation of your runtime is O(n^2) * O(Log(n)) or O(n^2 Log(n)).

But this need not be the case. I could've done something horribly wrong. So my next step would be to start graphing the runtimes of your worst possible case. Set sum to an impossibly large value and generate larger and larger arrays. You can avoid integer overflow by using lots and lots of repeated smaller numbers.

Also, compare it to the Quadratic 3Sum Solution. That's a known O(n^2) solution. Be sure to compare worst cases, or at least the same array on both. Do both timed tests at the same time so you can start getting a feel for which is faster while you are empirically testing the runtime.

Use release builds, optimized for speed, for these timing tests.
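A minimal sketch of such a timing harness might look like the following. The helper names `make_repeated_input` and `time_ms` are hypothetical, and the usage in the trailing comment assumes the `sum3` function from the question is in scope.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Builds n copies of a small value, so sums stay far from integer overflow.
std::vector<int> make_repeated_input(std::size_t n, int value = 1) {
    return std::vector<int>(n, value);
}

// Runs f once and returns the elapsed wall-clock time in milliseconds.
template <typename F>
double time_ms(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Possible use (assumes sum3 from the question and <iostream>):
//   for (std::size_t n = 1000; n <= 16000; n *= 2) {
//       auto A = make_repeated_input(n);
//       std::cout << n << ' ' << time_ms([&] { sum3(A, 1000000); }) << " ms\n";
//   }
```

Doubling n and watching how the time grows gives a rough hint: roughly 4x per doubling suggests quadratic behaviour, slightly more than 4x is consistent with n^2 log(n). Single runs are noisy, so repeating each measurement a few times is advisable.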

Upvotes: 1
