J.K.

Reputation: 1618

Segmentation fault of an MPI program

I am writing a program in C++ that uses MPI. A simplified version of my code is:

#include <iostream>
#include <fstream>
#include <cstdlib>
#include <mpi.h>
#define RNumber 3000000 //Number of loops to go

using namespace std;

class LObject {
        /*Something here*/
    public:
        void FillArray(long * RawT){
            /*Does something*/
            for (int i = 0; i < RNumber; i++){
                RawT[i] = i;
            }
        }
};

int main() {
    int     my_rank;
    int     comm_sz;
    MPI_Init(NULL, NULL);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);

    LObject System;

    long rawT[RNumber];
    long * Times = NULL;
    if (my_rank == 0) Times = (long*) malloc(comm_sz*RNumber*sizeof(long));

    System.FillArray(rawT);

    if (my_rank == 0) {
        MPI_Gather(rawT, RNumber, MPI_LONG, Times, RNumber,
                MPI_LONG, 0, MPI_COMM_WORLD);
    }
    else {
        MPI_Gather(rawT, RNumber, MPI_LONG, Times, RNumber,
                MPI_LONG, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
};

The program compiles fine, but gives a Segmentation fault error on execution. The message is

=================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   EXIT CODE: 11
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
=================================================================================
APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)

When I reduce the RNumber the program works fine. Maybe somebody could explain what precisely goes wrong? Am I trying to allocate too much space for an array? If that's the case, will this problem be solved by storing the results in a file instead of an array?

If it is possible, could you please give broad comments on the things I do wrong.

Thank you for your time and effort!

Upvotes: 4

Views: 8887

Answers (3)

Mike Seymour

Reputation: 254751

A couple of possible issues:

long rawT[RNumber];

That's rather a large array to be putting on the stack. There is usually a limit on stack size (especially for the threads of a multithreaded program), typically somewhere between one and eight megabytes depending on the platform, and three million longs need far more than that. You'd be better off with a std::vector<long> here.
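To make the size concrete, here is a minimal sketch of the arithmetic and of a heap-backed replacement. The helper names bytes_for_longs and make_local_buffer are made up for illustration, and the 24 MB figure assumes an 8-byte long (typical on 64-bit Linux; with a 4-byte long it would be about 12 MB).

```cpp
#include <cstddef>
#include <vector>

// Bytes needed for `count` elements of type long.
// With count = 3000000 and an 8-byte long this is 24,000,000 bytes.
constexpr std::size_t bytes_for_longs(std::size_t count) {
    return count * sizeof(long);
}

// Hypothetical helper: builds the per-rank buffer on the heap
// (via std::vector) instead of on the stack, and fills it the
// same way FillArray does in the question.
std::vector<long> make_local_buffer(std::size_t count) {
    std::vector<long> buf(count);
    for (std::size_t i = 0; i < count; ++i) {
        buf[i] = static_cast<long>(i);
    }
    return buf;
}
```

Only the small vector object itself lives on the stack; the elements are allocated from the heap, so the stack limit no longer matters.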

Times = (long*) malloc(comm_sz*RNumber*sizeof(long));

You should check that the memory allocation succeeded. Or better still, use std::vector<long> here as well (which will also fix your memory leak).
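If you do keep malloc, a check could look roughly like the sketch below. The function alloc_gather_buffer is a hypothetical helper, not part of MPI. It also guards the size computation: in comm_sz*RNumber*sizeof(long) the first multiplication is done in int, which would overflow once comm_sz reaches several hundred ranks with RNumber at three million.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: allocates room for ranks * per_rank longs.
// Returns nullptr if either count is zero, if the byte count would
// overflow size_t, or if malloc itself fails.
long* alloc_gather_buffer(std::size_t ranks, std::size_t per_rank) {
    if (ranks == 0 || per_rank == 0) {
        return nullptr;
    }
    // Largest element count whose byte size still fits in size_t.
    const std::size_t max_elems = SIZE_MAX / sizeof(long);
    if (per_rank > max_elems / ranks) {
        return nullptr;  // ranks * per_rank * sizeof(long) would overflow
    }
    return static_cast<long*>(std::malloc(ranks * per_rank * sizeof(long)));
}
```

The caller still has to test the result against nullptr before using it, and release it with free once the gathered data is no longer needed.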

if (my_rank == 0) {
    // do stuff
} else {
    // do exactly the same stuff
}

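Both branches make exactly the same call, so the if/else is redundant. MPI_Gather is a collective operation that every rank must call, and its receive buffer is only significant on the root, so Times being null on the other ranks is fine; a single unconditional call is enough.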

UPDATE: to use a vector instead of a raw array, just initialise it with the size you want, and then use a pointer to the first element where you would use a (pointer to) the array:

std::vector<long> rawT(RNumber);
System.FillArray(&rawT[0]);

std::vector<long> Times(comm_sz*RNumber);
MPI_Gather(&rawT[0], RNumber, MPI_LONG, &Times[0], RNumber,
           MPI_LONG, 0, MPI_COMM_WORLD);

Beware that the pointer will be invalidated if you resize the vector (although you won't need to do that if you're simply using it as a replacement for an array).

Upvotes: 2

Mankarse

Reputation: 40643

You are not checking the return value from malloc. Considering that you are attempting to allocate over three million longs, it is quite plausible that malloc would fail.

This might not be what is causing your problem though.

Upvotes: 0

AndersK

Reputation: 36092

You may want to check what comes back from

MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);

e.g. comm_sz==0 would cause this issue.

Upvotes: 1
