rtheunissen
rtheunissen

Reputation: 7435

Sum (i...j) - is there a more efficient / elegant solution for this?

We were given a simple task to come up with the most efficient way we can to sum all the numbers between a start and end point ('from' and 'to') using recursion and iteration respectively, without using the obvious formula which would be O(1).

There is no application for this, I am simply curious and challenged to see if my solution can be improved / polished more than it already is:

/* recursion */
unsigned int sum1(unsigned int from, unsigned int to) {
    if (to - from < 2)
        return from + (from == to ? 0 : to);
    else
        return from + to + sum1(from + 1, to - 1);
}

/* iteration */
unsigned int sum2(unsigned int from, unsigned int to) {
    int p = to - from;
    if (p == 0) return from;
    int i, s, n = p / 2;
    if (p % 2 == 0) s = n + from;
    else {
        s = 0;
        n++;
    }
    for (i = 0; i < n; i++) {
        s += from++ + to--;
    }
    return s;
}

Upvotes: 4

Views: 463

Answers (5)

Graeme Newlands
Graeme Newlands

Reputation: 73

The function:

long sum(long lower, long upper) {
    long s = 0;
    s = ((upper * (upper + 1)) - (lower - 1) * (lower))/2;
    return s;
}

called with parameters: (1,9999999) returns 49999995000000 which agrees with summation formula n(n+1)/2 and runs with the following profile on a core duo:

real    0m0.005s
user    0m0.002s
sys     0m0.003s

It might be worth checking your functions, I can't see them returning this result - the mathematical option is a far superior solution ;)

Upvotes: 0

vasste
vasste

Reputation: 56

You can use a segment tree to get the sum on the segement from i to j. This structure has O(log n) lookup time.

Upvotes: 0

torrential coding
torrential coding

Reputation: 1765

I tried improving the iterative version:

unsigned int sum2_improved(unsigned int from, unsigned int to) {
    int p = to - from;
    if (p == 0) return from;
    int x = to + from;
    int s = 0;
    int i;
    for (i = p >> 1; i > 0; i--)
    {
        s += x;
    }
    s += (p % 2 == 0) ? x >> 1 : x;
    return s;
}

I tested your version with:

for (i = 0; i < 9999999; i++) sum2(1,999);

This is what I see:

$ time ./addm
real    0m18.315s
user    0m18.220s
sys     0m0.015s

I tried my implementation with the same number of loops. Here's how the improved function performed:

$ time ./addm
real    0m14.196s
user    0m14.070s
sys     0m0.015s

UPDATE

In my implementation x = to + from is the sum of the first and the last number in the sequence. If you consider any consecutive sequence of integers, and sum the first and last, the second and the penultimate, and so on ... all these sum up to the same value. For example, in (1 ... 6), 1 + 6 = 2 + 5 = 3 + 4 = 7. However, with a sequence containing odd number of elements, you are left with the middle number which you will then have to add to the cumulative sum (that's what the assignment following the for loop was doing.

Also, note that this is still O(n). I realized after I initially posted my answer that my approach can actually be done in constant time. Here's the updated code:

unsigned int sum0(unsigned int from, unsigned int to) {
    int p = to - from;
    if (p == 0) return from;
    int x = to + from;
    int s = 0;

    s += (x * (p >> 1));

    s += (p % 2 == 0) ? x >> 1 : x;

    return s;
}

I ran this with the same number of loops as the earlier tests. Here's what I saw:

$ time ./addm

real    0m0.158s
user    0m0.093s
sys     0m0.047s

I'm not sure if this can be considered a variation of the formula for your purposes. In any case, it was an interesting exercise for me.

Upvotes: 3

Reinstate Monica
Reinstate Monica

Reputation: 4723

Split the range (from zero to the upper limit n) in a lower and an upper half. For each value in the lower half there's a value in the upper half that's n/2 larger; there are n/2 of them, so the sum of the upper half is the sum of the lower half + (n/2)^2.

In Python that would be:

def sum1(lower, upper):
    if upper <= 0: 
        return 0
    m, r = divmod(upper, 2)
    return sum1(0, m) * 2 + m * m + r * upper - sum1(0, lower - 1)

Upvotes: 1

Mahmoud Al-Qudsi
Mahmoud Al-Qudsi

Reputation: 29539

I am not going to write the code for it, but this is the kind of thing that would directly scale with the number of cores you have working on the task.

Dividing the range into tasks and starting a thread to sum each subsection of the range would divide the time required for whatever implementation you choose by the number of cores (give or take).

You could also use SIMD extensions to facilitate the addition (vector addition) by writing out the data in the memory before hand. Taking it to another extreme, you could actually use the GPU to calculate the addition of subranges (but you'd have to have a big enough range to make it worth the overhead), making it stupid fast; as this question is as simple as you can get with no dependencies whatsoever between the instructions.

Upvotes: 0

Related Questions