Karthik Nishanth
Karthik Nishanth

Reputation: 2010

Performance issue in Finite difference method

I wrote a piece of C code that uses finite difference method to estimate values. This is an averaging method. I profiled the code and found that one iterate() function is the slowest.

void iterate(double data[][ARRAY_SIZE], int nx, int ny, int dx, int dy)
{
    for (int i = 0; i < nx; ++i)
    {
        for (int j = 0; j < ny; ++j)
        {
            if (i % (dx + 1) == 0 && j % (dy + 1) == 0)
                continue;
            else if (i == 0 && 0 < j && j < ny)
                data[i][j] = (data[i][j - 1] + data[i][j + 1] + data[i + 1][j]) / 3;
            else if (j == 0 && 0 < i && i < nx)
                data[i][j] = (data[i - 1][j] + data[i + 1][j] + data[i][j + 1]) / 3;
            else if (i == nx - 1 && 0 < j && j < ny)
                data[i][j] = (data[i][j - 1] + data[i][j + 1] + data[i - 1][j]) / 3;
            else if (j == ny - 1 && 0 < i && i < nx)
                data[i][j] = (data[i - 1][j] + data[i + 1][j] + data[i][j - 1]) / 3;
            else
                data[i][j] = (data[i - 1][j] + data[i + 1][j] + data[i][j - 1] + data[i][j + 1]) / 4;
        }
    }
}

This loop runs slow, and I am not sure what I am missing here that makes it slow. Is there a better way of doing the same?

2000 iterations with a 400x400 double array takes

real    0m1.950s
user    0m1.940s
sys 0m0.004s

Upvotes: 0

Views: 107

Answers (2)

Patrick Roberts
Patrick Roberts

Reputation: 51876

Consider this implementation:

void iterate(double data[][ARRAY_SIZE], int nx, int ny, int dx, int dy)
{
    // because nx - 1 and ny - 1 are used
    nx--;
    ny--;
    // because dx + 1 and dy + 1 are used
    dx++;
    dy++;

    int i = 0;
    int j = 0;

    // case i == 0 && 0 < j && j < ny
    for (j = 1; j < ny; ++j)
    {
        if (j % dy)
            data[0][j] = (data[i][j - 1] + data[i][j + 1] + data[i + 1][j]) / 3.0;
    }

    j = 0;

    // case j == 0 && 0 < i && i < nx
    for (i = 1; i < nx; ++i)
    {
        if (i % dx)
            data[i][0] = (data[i - 1][j] + data[i + 1][j] + data[i][j + 1]) / 3.0;
    }

    // default case
    for (i = 1; i < nx; ++i)
    {
        for (j = 1; j < ny; ++j)
        {
            if (i % dx || j % dy)
                data[i][j] = (data[i - 1][j] + data[i + 1][j] + data[i][j - 1] + data[i][j + 1]) * 0.25;
        }
    }

    // case i == nx && 0 < j && j < ny
    for (j = 1; j < ny; ++j)
    {
        if (nx % dx || j % dy)
            data[nx][j] = (data[i][j - 1] + data[i][j + 1] + data[i - 1][j]) / 3.0;
    }

    // case j == ny && 0 < i && i < nx
    for (i = 1; i < nx; ++i)
    {
        if (ny % dy || i % dx)
            data[i][ny] = (data[i - 1][j] + data[i + 1][j] + data[i][j - 1]) / 3.0;
    }
}

The main three points are:

  1. reduce the amount of operations in the inner loop for the double for-loop
  2. reduce the amount of trivial operations by doing them only once
  3. don't mix data types and force coercion (use / 3.0 and * 0.25)

The only thing in my code not explained is that i % dx || j % dy is equal to !(i % dx == 0 && j % dy == 0).

Upvotes: 1

John Zwinck
John Zwinck

Reputation: 249153

Here are some ideas:

  1. It appears that ny must equal ARRAY_SIZE. You may as well omit it as a parameter and just use the compile-time constant.
  2. All the if/else clauses except the final one are only applicable to a specific row or column. So hoist them out. For example you can process the first row and column as 1D loops before doing the entire matrix outside the edges, then finally process the rightmost column and bottom row.

In the end, your core loop should be more like this:

for (int i = 1; i < nx - 1; ++i)
{
    for (int j = 1; j < ARRAY_SIZE - 1; ++j)
    {
        data[i][j] = (data[i - 1][j] + data[i + 1][j] + data[i][j - 1] + data[i][j + 1]) / 4;
    }
}

Upvotes: 3

Related Questions