ProgProb

Reputation: 23

fseek() with fork() not working properly

I'm having a problem using fseek() in combination with fork() (on Xubuntu 15.10). I have to write a program that reads a series of numbers, one per line, from a file ("file1") and reports how many lines match a specific string passed as an argument.
The program, which I called "nf", is invoked as follows: ./nf number_of_processes file1 match_string
The program should create number_of_processes child processes (using fork()), and each of them should process one section of the file (e.g. if number_of_processes is 5 and the file has 15 lines, each child process should handle 15/5=3 lines).
The child processes should then report their results to the parent, which prints the total number of occurrences found in the file.

Now, the problem: I wrote the program using fseek() (each child process seeks to its starting position in the file and then reads one section's worth of lines). Sometimes it works, but other times it prints an incorrect result, as if it were reading the file incorrectly (reading parts of it multiple times, or reading garbage instead of numeric strings).
Do you know why this is happening?
Thanks a lot in advance.

The files are as follows:
file1:

1224332
1224332
4363666
4363666
1224332
5445774
2145515
1224332
2145515
1111111
2145515
9789899
2344444
6520031
4363666

nf.c:

#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <sys/wait.h>

#define NBYTES 8

int FileLenght(FILE *fp) {
    int cnt=0;
    char c;

    while((c=getc(fp))!=EOF) {
        if(c=='\n') {
            cnt++;
        }
    }

    rewind(fp);

    return cnt;
}

int main (int argc, char *argv[]) {
    int num,vlen=0,i,j=0,cnt=0;
    pid_t *pid;
    int status,sum=0;
    FILE *fp;
    char string[NBYTES+1];

    if(argc!=4) {
        printf("Error using program.\n");
        exit(EXIT_FAILURE);
    }

    num=atoi(argv[1]);

    fp=fopen(argv[2],"r+");

    if(!fp) {
        fprintf(stderr,"Error opening file.\n");
        exit(EXIT_FAILURE);
    }

    vlen=FileLenght(fp);

    pid=malloc(num*sizeof(pid_t));

    for(i=0;i<num;i++) {
        if(!(pid[i]=fork())) {
            fseek(fp,i*(NBYTES)*(vlen/num),SEEK_SET);
            while(j<vlen/num) {
                fscanf(fp,"%s",string);
                printf("Process %d reading from file: %s\n",getpid(),string);
                if(!strcmp(string,argv[3])) {
                    cnt++;
                }
                j++;
                printf("(%d-%d) %d %s=%s\n",getpid(),getppid(),j,string,argv[3]);
            }
            fclose(fp);
            exit(cnt);
        }
    }

    fseek(fp,vlen*NBYTES,SEEK_SET);

    for(i=0;i<num;i++) {
        waitpid(pid[i],&status,0);
        sum+=WEXITSTATUS(status);
    }

    printf("\nTotal found: %d\n",sum);

    fclose(fp);
    free(pid);

    return 0;
}

Output (the correct count should be 4 and not 5):

$ ./nf 5 file1 1224332
Process 18592 reading from file: 1224332
Process 18593 reading from file: 4363666
(18593-18591) 1 4363666=1224332
Process 18593 reading from file: 4363666
(18593-18591) 2 4363666=1224332
(18592-18591) 1 1224332=1224332
Process 18592 reading from file: 1224332
(18592-18591) 2 1224332=1224332
Process 18594 reading from file: 1224332
Process 18596 reading from file: ���ҿ�
(18594-18591) 1 1224332=1224332
Process 18595 reading from file: 2145515
(18595-18591) 1 2145515=1224332
Process 18595 reading from file: 1224332
(18595-18591) 2 1224332=1224332
(18596-18591) 1 ���ҿ�=1224332
Process 18596 reading from file: ���ҿ�
Process 18594 reading from file: 1224332
(18594-18591) 2 1224332=1224332
(18596-18591) 2 ���ҿ�=1224332

Total found: 5

Upvotes: 2

Views: 384

Answers (2)

hongbochen

Reputation: 123

Maybe this is caused by the pointer fp. Every child process you create uses the same underlying file as fp, so the result varies from run to run: sometimes it prints 4, sometimes 5, sometimes 3, and sometimes even 2.

Upvotes: 0

user149341

When a file descriptor is shared between processes as the result of fork(), all attributes of the file descriptor are shared between the copies, including the current offset. As a result, all of the child processes in your program are trying to seek around and read data from the same file descriptor at the same time, causing all sorts of unexpected results.

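To solve this problem, open the file in each child process after forking, so that every child gets its own open file description with an independent offset. Note that calling dup() after forking is not enough: a duplicated descriptor still shares its offset with the original. Also check the return value of fscanf(); when a read fails, string is left unset, which is most likely where the garbage in your output comes from.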

Upvotes: 3
