Alf
Alf

Reputation: 1989

Summing up array elements using OpenMP is not working correctly for large arrays (C)

I wanted to use OpenMP around a for-loop that basically sums up the array elements. For comparison, I also sum up the array elements in a serial fashion. For large array-sizes, array_size>=5795, the sums are no longer equal.

This is the OpenMP-parallelized loop:

#pragma omp parallel for reduction(+:parallel_sum)
    for (i = 0; i < array_size; i++) {
        parallel_sum += array[i];
    }

And this is the full code:

#include<stdio.h>
#include<stdlib.h>
#include<omp.h>

int main() {
    float          *array, serial_sum, parallel_sum;
    size_t          array_size, i;

    serial_sum   = 0.;
    parallel_sum = 0.;

#pragma omp parallel
    printf( "number of threads used in parallized part: %d\n", omp_get_num_threads());

    printf("enter size of array\n");
    scanf("%zu", &array_size);

    // memory allocation
    array = (float *) malloc(sizeof(float) * array_size);

    // initialize array
    for (i = 0; i < array_size; i++)
        array[i] = i;

    // calculate the array sums, parallel first, then serial
#pragma omp parallel for reduction(+:parallel_sum)
    for (i = 0; i < array_size; i++) {
        parallel_sum += array[i];
    }

    for (i = 0; i < array_size; i++)
        serial_sum = serial_sum + array[i];

    if (serial_sum == parallel_sum)
        printf("\nserial and parallel sums are equal\n");
    else
        printf("\nserial and Parallel sums are NOT equal\n");

    // free memory
    free(array);

    return 0;
}

Since the sum can be easily calculated by hand using the sum-formula by Gauss, sum = (n^2+n)/2, where n corresponds to array_size, both sums are wrong... I guess I am having some memory issues then...?

Upvotes: 0

Views: 1141

Answers (1)

user3185968
user3185968

Reputation:

Since floating-point types have limited precision, addition is not associative (see Are floating point operations in C associative? and Is Floating point addition and multiplication associative?).

So a + (b + c) can evaluate unequal to (a + b) + c.

When parallelizing/vectorizing, the order operations can indeed be different than adding the numbers from lowest to highest sequentially.

If you printed out the computed values, you would likely find them quite close in this case, just not exactly equal.

Upvotes: 1

Related Questions