Alessandroempire

Reputation: 1699

OpenMPI parallelize reading a Text File

What I want to do with this code is the following:

Read a file into a buffer. This part works, and I don't want to change how I read the file or how I store it.

Send that buffer across several "nodes" using MPI_Scatter, so that each node can count the number of blank spaces in its part.

The code I have written is the following:

#include <stdio.h>
#include <mpi.h> 

int main() {

int file_size = 10000;
FILE * fp;
int my_size, my_id, size, local_acum=0, acum=0, i;
char buf[file_size], recv_vect[file_size];

fp = fopen("pru.txt","r");
fseek(fp, 0L, SEEK_END);
size = ftell(fp);
fseek(fp, 0L, SEEK_SET);
fread (buf,1,size,fp);

// Initialize the MPI environment 
MPI_Init(NULL, NULL); 
MPI_Comm_size(MPI_COMM_WORLD, &my_size); 
MPI_Comm_rank(MPI_COMM_WORLD,&my_id);

MPI_Scatter(buf, size / my_size, MPI_CHAR, recv_vect, 
    size / my_size, MPI_CHAR, 0, MPI_COMM_WORLD);

local_acum=0;
for (i=0; i < size / my_size; i++){
    // printf("%c", buf[i]);
    if (buf[i] == ' '){
        local_acum++;
    }
}
printf("\nlocal is %d \n", local_acum);

acum=0;
MPI_Barrier(MPI_COMM_WORLD); 
MPI_Reduce(&local_acum, &acum, 1, MPI_INT, MPI_SUM, 
    0, MPI_COMM_WORLD);

if (my_id == 0){
    printf("Counter is %d \n", acum);
}

// Finalize the MPI environment. 
MPI_Finalize();
}

I am not getting the desired result.

If I run with the option -np 1, it works perfectly (as expected).

Yet when I run with -np 2 or higher, I do not get the desired result. Every node always counts the same number of blank spaces! I believe this is the key to the problem.

If I change the loop in each node to

for (i=0; i < size; i++)

then it counts the blank spaces in the whole file. So each node has the whole buffer. I do not understand why, since in the scatter I am telling it to pass only (size / my_size) characters.

Upvotes: 0

Views: 411

Answers (1)

Zulan

Reputation: 22670

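  1. You are iterating over buf, which contains the entire file, instead of recv_vect, which contains only the part for each rank. Since every rank reads the same file into buf, every rank counts the same first size / my_size characters.
  2. You are reading the whole file on every rank, not just on the root. That is unnecessary here: MPI_Scatter only uses the send buffer on the root, so only the root needs to read the file. The other ranks still need the value of size, which the root can share with them, for example with MPI_Bcast.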
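
Point 1 comes down to counting only over the slice a rank actually received. A related detail: size / my_size truncates, so when the file size is not a multiple of the number of ranks, the trailing bytes are never scattered. Below is a minimal sketch of both ideas in plain C, deliberately without MPI calls so it compiles on its own. The helper names count_spaces and chunk_layout are hypothetical and not part of the question's code or of MPI.

```c
#include <assert.h>

/* Count blank spaces in exactly n bytes of chunk.
 * The chunk does not need to be NUL-terminated. */
int count_spaces(const char *chunk, int n) {
    int count = 0;
    for (int i = 0; i < n; i++) {
        if (chunk[i] == ' ') {
            count++;
        }
    }
    return count;
}

/* Split size bytes across nprocs ranks (hypothetical helper).
 * counts[r] is the number of bytes for rank r and displs[r] is the
 * offset of its first byte. The first (size % nprocs) ranks get one
 * extra byte, so no trailing bytes are dropped. */
void chunk_layout(int size, int nprocs, int *counts, int *displs) {
    assert(nprocs > 0);
    int base = size / nprocs;
    int extra = size % nprocs;
    int offset = 0;
    for (int r = 0; r < nprocs; r++) {
        counts[r] = base + (r < extra ? 1 : 0);
        displs[r] = offset;
        offset += counts[r];
    }
}
```

In the MPI program, the counts and displs arrays would be passed on the root as the sendcounts and displs arguments of MPI_Scatterv, each rank would receive counts[my_id] bytes into recv_vect, and the loop would become count_spaces(recv_vect, counts[my_id]). The MPI_Reduce with MPI_SUM stays as in the question.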

Upvotes: 1
