Reputation: 3747
I have an MPI code where each process reads part of a binary file and writes it back out again. The data is distributed so that process 0 reads (and then writes) the first half of the file, whereas process 1 reads (and then writes) the second half. The problem is that the input and output files do not match (diff reports that they differ). With only 1 process, everything works. Can someone point out what is going wrong?
Using OpenMPI, compiled as: mpicc -Wall test_mpi.c -o test_mpi
Run as: mpirun -np 2 ./test_mpi
Thanks in advance.
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char** argv) {
    int rank, np, i; //np = no. of processes
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &np);

    int filesize = 48*1048576; //input filesize 48MB
    double *data = (double*) malloc (filesize/np);

    FILE* fpa;
    fpa = fopen ( "512_featurevec.out", "rb");
    fseek(fpa, filesize/np*rank, SEEK_SET);
    printf("read: %d\n", (int)fread(&data[0], sizeof(double), filesize/(np*sizeof(double)), fpa));
    fclose(fpa);

    char* outfile = "outfile.txt";
    for(i=0; i<np; i++) {
        if(rank == i) {
            fpa = fopen ( outfile, "ab");
            fseek(fpa, filesize/np*rank, SEEK_SET);
            fwrite ( &data[0], sizeof(double), filesize/(np*sizeof(double)), fpa);
            fclose ( fpa );
        }
    }

    free(data);
    MPI_Finalize();
    exit(0);
}
Upvotes: 2
Views: 3275
Reputation: 50947
If you're already using MPI and going to the trouble of using seek to partition up the file, I'd suggest using MPI-IO rather than POSIX I/O (MPI-IO has been a standard part of MPI since MPI-2, released in 1997). Good references are:
and at our centre we have the first part of this, which I think is pretty good:
An MPI-IOed version of your code above is this:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char** argv) {
    int rank, np;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &np);

    const int filesize = 48*1048576; //input filesize 48MB
    const int ndoubles = filesize/(sizeof(double)*np);
    double *localdata = malloc(ndoubles*sizeof(double));

    /* create a type which describes our view of the file --
     * in particular, just our subarray of the global array.
     * All sizes are counted in MPI_DOUBLE elements, not bytes.
     */
    int globalsizes[1] = {(int)(filesize/sizeof(double))};
    int localsizes[1] = {ndoubles};
    int starts[1] = {ndoubles*rank};
    MPI_Datatype fileview;
    MPI_Type_create_subarray(1, globalsizes, localsizes, starts, MPI_ORDER_C, MPI_DOUBLE, &fileview);
    MPI_Type_commit(&fileview);

    /* read in only our data */
    MPI_File fpa;
    MPI_Status status;
    MPI_File_open(MPI_COMM_WORLD, "512_featurevec.out", MPI_MODE_RDONLY, MPI_INFO_NULL, &fpa);
    /* note could use MPI_File_seek instead of file set view */
    MPI_File_set_view(fpa, (MPI_Offset)0, MPI_DOUBLE, fileview, "native", MPI_INFO_NULL);
    MPI_File_read_all(fpa, localdata, ndoubles, MPI_DOUBLE, &status);
    MPI_File_close(&fpa);

    /* write out data - it will have same layout, we're just writing instead of reading */
    MPI_File_open(MPI_COMM_WORLD, "output.dat", MPI_MODE_WRONLY|MPI_MODE_CREATE, MPI_INFO_NULL, &fpa);
    /* note could use MPI_File_seek instead of file set view */
    MPI_File_set_view(fpa, (MPI_Offset)0, MPI_DOUBLE, fileview, "native", MPI_INFO_NULL);
    MPI_File_write_all(fpa, localdata, ndoubles, MPI_DOUBLE, &status);
    MPI_File_close(&fpa);

    free(localdata);
    MPI_Type_free(&fileview);
    MPI_Finalize();
    return 0;
}
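The code above assumes the file size divides evenly by np*sizeof(double). If it might not, the start and count for each rank can be computed with a small helper. This is only a sketch: chunk_for_rank is a hypothetical name of mine, not an MPI routine, and it hands the leftover elements to the lowest ranks.

```c
#include <assert.h>

/* Hypothetical helper (not part of MPI): split `total` elements across
 * `np` ranks as evenly as possible. The first (total % np) ranks each
 * receive one extra element. Results are in elements, not bytes. */
void chunk_for_rank(long total, int np, int rank, long *start, long *count)
{
    assert(np > 0 && rank >= 0 && rank < np);
    long base  = total / np;   /* elements every rank gets */
    long extra = total % np;   /* leftover elements to hand out */
    *count = base + (rank < extra ? 1 : 0);
    *start = rank * base + (rank < extra ? rank : extra);
}
```

The results would replace ndoubles and ndoubles*rank in localsizes and starts. Alternatively, with the default file view, start*sizeof(double) could be passed as the byte offset to MPI_File_read_at_all and MPI_File_write_at_all.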
Upvotes: 3
Reputation: 3747
Answering my own question, but this works:
char* outfile = "outfile.txt";
for(i=0; i<np; i++) {
    if(rank == i) {
        fpa = fopen ( outfile, "ab");
        fseek(fpa, filesize/np*rank, SEEK_SET);
        fwrite ( &data[0], sizeof(double), filesize/(np*sizeof(double)), fpa);
        fclose ( fpa );
    }
    /* every rank waits here, so the ranks write one at a time, in order */
    MPI_Barrier(MPI_COMM_WORLD);
}
For some reason, if proc 1 arrives before proc 0 and fseeks an empty file to some value, it does position the file pointer to the correct offset (verified by ftell), but writes from offset 0 only. (I must be doing something trivially wrong, but anyway).
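A likely explanation for that behaviour is the "ab" mode itself: the C standard says that in append mode every write goes to the current end of file, regardless of any intervening fseek. With the barrier the ranks append in order, so the result happens to be correct. Below is a minimal sketch of writing at an explicit offset with "r+b" instead. The names write_at and demo_out_of_order and the scratch path are made up for illustration, and the demo assumes a POSIX-like system where seeking past the end of a file and writing is allowed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write n bytes at byte `offset` of an EXISTING file. "r+b" opens for
 * update without truncating and, unlike "ab", honours fseek for writes. */
int write_at(const char *path, long offset, const void *buf, size_t n)
{
    assert(path != NULL && buf != NULL);
    FILE *fp = fopen(path, "r+b");
    if (fp == NULL) return -1;
    int ok = fseek(fp, offset, SEEK_SET) == 0 &&
             fwrite(buf, 1, n, fp) == n;
    return (fclose(fp) == 0 && ok) ? 0 : -1;
}

/* Demo: the "second half" is written first, as rank 1 might do.
 * Returns 0 if the file ends up containing AAAABBBB. */
int demo_out_of_order(const char *path)
{
    FILE *fp = fopen(path, "wb");   /* create/truncate once, up front */
    if (fp == NULL || fclose(fp) != 0) return -1;

    if (write_at(path, 4, "BBBB", 4) != 0) return -1;  /* second half */
    if (write_at(path, 0, "AAAA", 4) != 0) return -1;  /* first half */

    char got[9] = {0};
    fp = fopen(path, "rb");
    if (fp == NULL) return -1;
    size_t n = fread(got, 1, 8, fp);
    fclose(fp);
    return (n == 8 && memcmp(got, "AAAABBBB", 8) == 0) ? 0 : -1;
}
```

In the MPI program, one rank (say rank 0) would have to create the file before the others open it with "r+b", so a single MPI_Barrier after the creation is still needed, but the per-rank writes no longer have to happen in rank order.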
Upvotes: 0
Reputation: 26281
It looks like the issue is that every process opens the same file for writing, which leads to contention.
Try making the file name depend on the rank (for example, writing to outfile.txt.(rank))
and see whether the per-rank outputs match the corresponding parts of the input.
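A minimal sketch of building such a name, assuming a hypothetical helper called rank_filename (the buffer size is up to the caller):

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: build "<base>.<rank>" in `out`.
 * Returns the length of the name, or -1 if it does not fit in outsz bytes. */
int rank_filename(char *out, size_t outsz, const char *base, int rank)
{
    assert(out != NULL && base != NULL && rank >= 0);
    int n = snprintf(out, outsz, "%s.%d", base, rank);
    return (n >= 0 && (size_t)n < outsz) ? n : -1;
}
```

Each process would then pass its own name to fopen with mode "wb", for example after calling rank_filename(name, sizeof name, "outfile.txt", rank) on a local char name[64]. The pieces can be concatenated in rank order afterwards and compared with the input.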
Upvotes: 3