Aulene De
Aulene De

Reputation: 21

How do I compute the sum of differences in C++ given an array of N integers?

Problem Statement: https://www.codechef.com/ZCOPRAC/problems/ZCO13001

My code falls flat with a 2.01 second runtime on test cases 4 and 5. I cannot figure out the problem with my code:-

#include<iostream>
#include<cstdio>
#include<algorithm>

using namespace std;

int summation(long long int a[], int n, int count)
{
    long long int sum=0;
    int i;
    if(count==n)
        return 0;
    else
    {           
        for(i=count; i<n; i++)
            sum+=a[i]-a[count];

    }
    return sum+summation(a, n, count+1);
}

int main()
{
    int n, i;
    long long int sum;
    scanf("%d", &n);
    long long int a[n];

    for(i=0; i<n; i++)
        scanf("%lld", &a[i]);
    sort(a, a+n);

    sum=summation(a, n, 0);
    printf("%lld\n", sum);
    return 0;
}

Thanks!

Upvotes: 2

Views: 1837

Answers (3)

Peter M.
Peter M.

Reputation: 713

Here's my solution with a quicker algorithm.

Everything below is spoilers, in case you wanted to solve it yourself.

--

long long int summation_singlepass(long long int a[], int n)
{
    long long int grand_total=0;
    long long int iteration_sum, prev_iteration_sum=0;
    int i;

    for (i = 1; i < n; i++) {
        iteration_sum = prev_iteration_sum + i * ( a[i] - a[i-1] );
        grand_total += iteration_sum;
        prev_iteration_sum = iteration_sum;
    }

    return grand_total;
}

--

To figure out an algorithm, take a few simple but meaningful cases. Then work through them step by step yourself. This usually gives good insights.

For example: 1,3,6,6,8 (after sorting)

Third element in series. Its sum of differences to previous elements is: (6-1) + (6-3) = 8

Fourth element in series. No change! Sum of differences to previous elements is: (6-1) + (6-3) + (6-6) = 8

Fifth element in series. Pattern emerges when compared to formula for third and fourth. Sum of differences to previous elements is: (8-1) + (8-3) + (8-6) + (8-6) = 16

So it's an extra 2 for each prior element in the series. 2 is the difference between our current element (8) and the previous one (6).

To generalize this effect. Derive the current iteration sum as the previous iteration sum + (i - 1) * ( a[i] - a[i-1] ). Where i is our current (1-based) position in the series.

Note that the formula looks slightly different in code compared to how I wrote it above. This is because in C++ we're working with 0-based indices for arrays.

Also - if you wanted to continue tweaking the solution you posted in OP, change the function return of summation to long long int to handle larger sets without the running total getting chopped.

Upvotes: 0

John D
John D

Reputation: 1637

Your code works, but there are two issues.

Using recursion will eventually run out of stack space. I ran your code for n=200000 (upper limit in Code Chef problem) and got stack overflow.

I converted the recursion to an equivalent loop. This hit the second issue - it takes a long time. It is doing 2*10^5 * (2*10^5 - 1) / 2 cycles which is 2*10^10 roughly. Assuming a processor can run 10^9 cycles a second, you're looking at 20 seconds.

To fix the time issue, look for duplicates in team strength value. Instead of adding the same strength value (val) each time it appears in the input, add it once and keep a count of how many times it was found (dup). Then, when calculating the contribution of the pair (i,j), multiply a[i].val - a[j].val by the number of times this combo appeared in raw input, which is the product of the two dup values a[i].dup * a[j].dup.

Here's the revised code, using Strength struct to hold the strength value & the number of times it occurred. I didn't have a handy input file, so used random number generator with range 1,100. By cycling through only the unique strength values, the total number of cycles is greatly reduced.

#include<iostream>
#include<cstdio>
#include<algorithm>
#include<random>

using namespace std;

int codechef_sum1(long long int a[], int n, int count)
{
    long long int sum = 0;
    int i;
    if (count == n)
        return 0;
    else
    {
        for (i = count; i<n; i++)
            sum += a[i] - a[count];

    }
    return sum + codechef_sum1(a, n, count + 1);
}

int codechef_sum2a(long long int a[], int n)
{
    long long int sum = 0;
    for (int i = 0; i < n; i++)
    for (int j = 0; j < i; j++)
        sum += (a[i] - a[j]);
    return sum;
}

struct Strength
{
    long long int val;
    int dup;
    //bool operator()(const Strength& lhs, const Strength & rhs) { return lhs.val < rhs.val; }
    bool operator<(const Strength & rhs) { return this->val < rhs.val; }
};


int codechef_sum2b(Strength a[], int n)
{
    long long int sum = 0;
    for (int i = 0; i < n; i++)
    {
        for (int j = 0; j < i; j++)
            sum += (a[i].val - a[j].val) * (a[i].dup * a[j].dup);
    }
    return sum;
}

int codechef_sum_test(int n)
{
    std::default_random_engine generator;
    std::uniform_int_distribution<int> distr(1, 100);
    auto a1 = new long long int[n];
    auto a2 = new Strength [n];
    int dup = 0, num = 0;

    for (int i = 0; i < n; i++)
    {
        int r = distr(generator);
        a1[i] = r;
        int dup_index = -1;
        for (int ii = 0; ii < num; ii++)
        {
            if (a2[ii].val == r)
            {
                dup++;
                dup_index = ii;
                break;
            }
        }
        if (dup_index == -1)
        {
            a2[num].val = r;
            a2[num].dup = 1;
            ++num;
        }
        else
        {
            ++a2[dup_index].dup;
        }
    }
    sort(a1, a1 + n);
    sort(a2, a2 + num);

    auto sum11 = codechef_sum1(a1, n, 0);
    auto sum12 = codechef_sum2a(a1, n);
    auto sum2 = codechef_sum2b(a2, num);
    printf("sum11=%lld, sum12=%lld\n", sum11, sum12);
    printf("sum2=%lld, dup=%d, num=%d\n", sum2, dup, num);
    delete[] a1;
    delete[] a2;

    return 0;
}

void main()
{
    codechef_sum_test(50);
}

Upvotes: 0

Rahul Nori
Rahul Nori

Reputation: 718

First of all you are on the correct track when you are sorting the numbers, but the complexity of your algorithm is O(n^2). What you want is an O(n) algorithm.

I'm only going to give you a hint, after that how you use it is up to you.

Let us take the example given on the site you specified itself i.e. 3,10,3,5. You sort these elements to get 3,3,5,10. Now what specifically are the elements of the sum of the differences in this? They are as follows -

3-3
5-3
10-3
5-3
10-3
10-5

Our result is supposed to be (3-3) + (5-3) + ... + (10-5). Let us approach this expression differently.

3 - 3
5 - 3
10 - 3
5 - 3
10 - 3
10 - 5


43 - 20

This we get by adding the elements on the left side and the right side of the - sign.
Now take a variable sum = 0.
You need to make the following observations -
As you can see in these individual differences how many times does the first 3 appear on the right side of the - sign ?
It appears 3 times so let us take sum = -3*3 = -9.

Now for the second 3 , it appears 2 times on the right side of the - sign and 1 time on the left side so we get (1-2)*3 = -3. Adding this to sum we get -12.

Similarly for 5 we have 2 times on the left side and 1 time on the right side. We get (2-1)*5 = 5. Adding this to sum we get -12+5 = -7.

Now for 10 we have it 3 times on the left side i.e. 3*10 so sum is = -7+30 = 23 which is your required answer. Now what you need to consider is how many times does a number appear on the left side and the right side of the - sign. This can be done in O(1) time and for n elements it takes O(n) time. There you have your answer. Let me know if you don't understand any part of this answer.

Upvotes: 1

Related Questions