Reputation: 149
I have been trying for more than two days to find my mistake, but I couldn't find anything. I keep getting the following error:
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= EXIT CODE: 139
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
make: *** [run] Error 139
So the problem is clearly in MPI_BCAST, and in another function I have MPI_GATHER.
Can you help me figure out what's wrong?
When I compile the code I type the following:
/usr/bin/mpicc -I/usr/include -L/usr/lib z.main.c z.mainMR.c z.mainWR.c -o 1dcode -g -lm
To run it:
/usr/bin/mpirun -np 2 ./1dcode dat.txt o.out.txt
For example my code includes this function:
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <math.h>
#include <string.h>
#include "functions.h"
#include <mpi.h>
/*...................z.mainMR master function............. */
void MASTER(int argc, char *argv[], int nPROC, int nWRs, int mster)
{
/*... Define all the variables we going to use in z.mainMR function..*/
double tend, dtfactor, dtout, D, b, dx, dtexpl, dt, time;
int MM, M, maxsteps, nsteps;
FILE *datp, *outp;
/*.....Reading the data file "dat" then saving the data in o.out.....*/
datp = fopen(argv[1],"r"); // Open the file in read mode
outp = fopen(argv[argc-1],"w"); // Open output file in write mode
if(datp != NULL) // Continue only if the data file was opened successfully
{
fscanf(datp,"%d %lf %lf %lf %lf %lf",&MM,&tend,&dtfactor,&dtout,&D,&b); // read the data
fprintf(outp,"data>>>\nMM=%d\ntend=%lf\ndtfactor=%lf\ndtout=%lf\nD=%lf\nb=%lf\n",MM,tend,dtfactor,dtout,D,b);
fclose(datp); // Close the data file
fclose(outp); // Close the output file
}
else // fopen failed (file missing or unreadable): print an error message
{
printf("There is something wrong. Maybe file is empty.\n");
}
/*.... Find dx, M, dtexpl, dt and the maxsteps........*/
dx = 1.0/ (double) MM;
M = b * MM;
dtexpl = (dx * dx) / (2.0 * D);
dt = dtfactor * dtexpl;
maxsteps = (int)( tend / dt ) + 1;
/*...Pack integers in iparms array, reals in parms array...*/
int iparms[2] = {MM,M};
double parms[4] = {dx, dt, D, b};
MPI_BCAST(iparms,2, MPI_INT,0,MPI_COMM_WORLD);
MPI_BCAST(parms, 4, MPI_DOUBLE,0, MPI_COMM_WORLD);
}
Upvotes: 3
Views: 14936
Reputation: 74485
The runtime error is due to an unfortunate combination of a specific trait of MPICH and a feature of the C language.
MPICH provides both C and Fortran interface code within a single library file:
000000000007c7a0 W MPI_BCAST
00000000000cd180 W MPI_Bcast
000000000007c7a0 W PMPI_BCAST
00000000000cd180 T PMPI_Bcast
000000000007c7a0 W mpi_bcast
000000000007c7a0 W mpi_bcast_
000000000007c7a0 W mpi_bcast__
000000000007c7a0 W pmpi_bcast
000000000007c7a0 T pmpi_bcast_
000000000007c7a0 W pmpi_bcast__
The Fortran calls are exported under a variety of aliases in order to support many different Fortran compilers at the same time, including the all-uppercase MPI_BCAST. MPI_BCAST itself is not declared in mpi.h, but ANSI C allows calling functions without a preceding prototype declaration. Enabling C99 by passing -std=c99 to the compiler would have resulted in a warning about the implicit declaration of the MPI_BCAST function. -Wall would also have produced a warning. The code will fail to link with Open MPI, which provides the Fortran interface in a separate library that mpicc does not link against.
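Even if the code compiles and links properly, Fortran functions expect all their arguments to be passed by reference, so the plain integer values that the C call passes for the count and the root are treated as addresses and dereferenced. Also, Fortran MPI calls take an additional output argument where the error code is returned, which the C call does not supply. Hence the segmentation fault.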
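To make the calling-convention mismatch concrete, here is a minimal sketch that does not use MPI at all. fortran_style_bcast and call_by_reference are hypothetical names invented for this illustration; they only mimic the shape of a Fortran-style binding (every argument a pointer, plus a trailing error-code output) and are not the actual MPICH symbols.

```c
#include <assert.h>

/* Hypothetical stand-in for a Fortran-style binding: every argument is a
   pointer, and the status comes back through the trailing ierr argument.
   This is NOT the real MPICH routine, only an illustration of the convention. */
void fortran_style_bcast(int *buf, const int *count, const int *root, int *ierr)
{
    /* A by-value 0 would show up here as a NULL pointer; a by-value 2
       would slip through and crash on the first dereference below. */
    assert(buf && count && root && ierr);

    /* The callee reads *count and *root. Had the caller passed the plain
       values 2 and 0, these reads would touch addresses 0x2 and 0x0,
       which typically ends in a segmentation fault. */
    for (int i = 0; i < *count; i++) {
        buf[i] += *root;   /* dummy work so the arguments are really read */
    }
    *ierr = 0;             /* status code; 0 means success in this sketch */
}

/* How such a routine has to be called from C: pass addresses and
   provide storage for the error code. */
int call_by_reference(int *buf, int count, int root)
{
    int ierr = -1;
    fortran_style_bcast(buf, &count, &root, &ierr);
    return ierr;
}
```

The call in the question has the shape of the C binding (count and root passed by value, no error-code argument), so a routine written like fortran_style_bcast would end up dereferencing small integers as if they were pointers.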
To prevent such errors in the future, compile with -Wall -Werror, which should catch similar problems as early as possible.
Upvotes: 7
Reputation: 4624
Just so this has a formal answer: you spelled MPI_Bcast as MPI_BCAST. I would have assumed that this would throw a linking error at you for trying to access a function that doesn't exist, but apparently it didn't.
My guess is that your MPI implementation defines both the Fortran and C MPI functions in the same header file. Your program was then accidentally calling the Fortran function MPI_BCAST, and the types did not match (MPI_INTEGER (Fortran) is not necessarily MPI_INT (C)), somehow giving you the segfault.
Upvotes: 2