mcro

Reputation: 53

fifo linux - write() function terminates the program abruptly

I'm implementing a pipe in C, where multiple producer programs (9 in my case) write data to a single consumer program.

The problem is that some producers (sometimes one or two) exit abruptly when calling the write() function.

The code is simple, here is the producer code:

#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <poll.h>

#define MSG_SIZE_BYTES 4

void send(unsigned int * msg){

    int fd, msg_size;
    int r;
    char buffer [5];
    char myfifo[50] = "/tmp/myfifo";

    fd = open(myfifo, O_WRONLY);

    if(fd == -1){
        perror("error open SEND to fifo");
    }

    r = write(fd, msg, MSG_SIZE_BYTES);

    if(r == -1){
        perror("error writing to fifo");
     }

    close(fd);
    printf("Message send\n");
}

int main(int argc, char *argv[]){
    int cluster_id = atoi(argv[1]);
    unsigned int msg[1];
    msg[0] = cluster_id;

    while(1){
        printf("Press a key to continue...\n");
        getchar();
        send(msg);
    }
}

And here is the consumer code

#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <poll.h>

#define MSG_SIZE_BYTES 4

int receive(unsigned int * received_msg){
    int fd, msg_size;
    int ret_code;
    char buffer [5];
    char myfifo[50] = "/tmp/myfifo";

    fd = open(myfifo, O_RDONLY);

    if(fd == -1) 
       perror("error open RECV to fifo");

    ret_code = read(fd, received_msg, MSG_SIZE_BYTES);

    close(fd);

    if (ret_code == -1){
        printf("\nERROR\n");    
        return 0;
    }

    return 1;
}

void main(){

    mkfifo("/tmp/myfifo", 0666);

    unsigned int msg[1];
    while(1){
       receive(msg);
       printf("receive msg from id %d\n", msg[0]);

    }
}

I'm compiling the producers and the consumer with the following command: gcc -o my_program my_program.c

To reproduce the problem, you need to open 9 terminals to run the producers and 1 terminal to run the consumer. Execute the consumer: ./consumer

Run a producer in each of the other terminals at the same time, passing each one its own ID on the command line, e.g. ./producer 0, ./producer 1.

After the producers have sent messages a few times (about 10 on average), an arbitrary producer abruptly stops executing, which is the problem.

The following image depicts the execution: [image: Terminals ready to execute]

The following image depicts the error on producer ID 3: [image: Error on producer 3]

Thanks in advance

Upvotes: 4

Views: 893

Answers (3)

Owl

Reputation: 1562

Note that this can also happen with networking, and it isn't always possible to fix the client / consumer.

Sometimes the problem is that write() is called on a pipe or socket whose reading end has already been closed. The kernel then sends the process a SIGPIPE, and the default action for that signal is to terminate the program with absolutely no warning whatsoever. If you are debugging this with GDB it'll be quite clear, but otherwise it won't be so obvious what is happening.
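If you don't have GDB at hand, one way to confirm that a writer really died from SIGPIPE is to look at its wait status. The following is only a sketch under assumptions: the helper name signal_that_killed_writer is made up for illustration, and an anonymous pipe with its read end closed stands in for the FIFO whose consumer has gone away.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the original post): forks a child that
 * writes to a pipe with no reader and reports how the child ended.
 * Returns the number of the signal that killed the child, 0 if the child
 * exited normally, or -1 if the setup itself failed. */
int signal_that_killed_writer(void)
{
    int fds[2];
    int status;
    pid_t pid;

    if (pipe(fds) == -1)
        return -1;

    close(fds[0]);                    /* from now on nobody can read */

    pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        unsigned int msg = 3;

        signal(SIGPIPE, SIG_DFL);     /* make sure the default action applies */
        if (write(fds[1], &msg, sizeof msg) == -1)
            _exit(1);                 /* only reached if SIGPIPE did not kill us */
        _exit(0);
    }

    close(fds[1]);
    if (waitpid(pid, &status, 0) == -1)
        return -1;

    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Calling signal_that_killed_writer() should return SIGPIPE here. From a shell, the same situation usually shows up as an exit status of 128 plus the signal number (141 in bash on Linux, where SIGPIPE is 13).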

To stop this from happening, add signal(SIGPIPE, SIG_IGN) to the code, like this:

#include <signal.h>
...
int main(){
  // this suppresses the program exit behaviour on a SIGPIPE signal
  signal(SIGPIPE, SIG_IGN); 
  ...
  int result=write(...);
  if(result<0){
    puts("Write failed, but rather than the program exiting, you are reading this");
  }
}

According to man 2 signal, SIG_IGN sets the disposition of the signal to "ignore", so SIGPIPE no longer terminates the program. Instead, write() returns -1 and sets errno to EPIPE. It is up to the programmer to check that return value and handle the error appropriately.
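A minimal sketch of that error handling, assuming SIGPIPE is ignored process-wide. The names write_or_report and demo_write_without_reader are hypothetical, and the demo uses an anonymous pipe with its read end closed in place of the FIFO.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not from the original post): writes len bytes to fd.
 * Returns 0 on a complete write, -1 if the reader is gone (EPIPE),
 * and -2 on any other error or on a short write. */
int write_or_report(int fd, const void *buf, size_t len)
{
    ssize_t n = write(fd, buf, len);

    if (n == (ssize_t)len)
        return 0;

    if (n == -1 && errno == EPIPE) {
        fprintf(stderr, "reader went away: %s\n", strerror(errno));
        return -1;
    }

    return -2;
}

/* Demo: with SIGPIPE ignored, writing to a pipe that has no reader
 * fails with EPIPE instead of killing the process. */
int demo_write_without_reader(void)
{
    int fds[2];
    int rc;
    unsigned int msg = 3;

    signal(SIGPIPE, SIG_IGN);

    if (pipe(fds) == -1)
        return -2;

    close(fds[0]);                    /* simulate the consumer closing its end */
    rc = write_or_report(fds[1], &msg, sizeof msg);
    close(fds[1]);

    return rc;
}
```

In the producer from the question, a result of -1 from such a helper would be the point to decide whether to retry, reopen the FIFO, or exit with a message.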

Upvotes: 0

mcro

Reputation: 53

Problem SOLVED:

The problem was that I was opening and closing the FIFO for each message, which produced a "Broken pipe" on some write attempts. Removing the close() and moving the open() call to the beginning of the code, instead of inside the loop, for BOTH producer and consumer solved the problem.

Here is the code of producer with the bug fixed:

#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <poll.h>

#define MSG_SIZE_BYTES 4

int my_fd;

void send(unsigned int * msg){

    int r;

    // my_fd was opened once in main() and stays open between messages
    r = write(my_fd, msg, MSG_SIZE_BYTES);

    if(r == -1){
        perror("error writing to fifo");
    }

    printf("Message send\n");
}

int main(int argc, char *argv[]){
    int cluster_id = atoi(argv[1]);
    unsigned int msg[1];
    msg[0] = cluster_id;

    // open the FIFO once, before the loop, and never close it in send()
    my_fd = open("/tmp/myfifo", O_WRONLY);

    if(my_fd == -1){
        perror("error open SEND to fifo");
        return 1;
    }

    while(1){
        printf("Press a key to continue...\n");
        getchar();
        send(msg);
    }
}

And here is the consumer code:

#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <poll.h>

#define MSG_SIZE_BYTES 4

int my_fd;

int receive(unsigned int * received_msg){
    int ret_code;

    // my_fd was opened once in main() and stays open between messages
    ret_code = read(my_fd, received_msg, MSG_SIZE_BYTES);

    if (ret_code == -1){
        printf("\nERROR\n");
        return 0;
    }

    return 1;
}

int main(void){

    mkfifo("/tmp/myfifo", 0666);

    // open the FIFO once, before the loop, and never close it in receive()
    my_fd = open("/tmp/myfifo", O_RDONLY);

    if(my_fd == -1){
        perror("error open RECV to fifo");
        return 1;
    }

    unsigned int msg[1];

    while(1){
       // only print when receive() did not report an error
       if(receive(msg))
           printf("receive msg from id %u\n", msg[0]);

    }
}

Thank you all!!

Upvotes: 1

Ctx

Reputation: 18410

It looks like the consumer program closes the reading end of the pipe after reading data:

fd = open(myfifo, O_RDONLY);

if(fd == -1){
     perror("error open RECV to fifo");
}
ret_code = read(fd, received_msg, MSG_SIZE_BYTES);

close(fd);

All other writers, which are currently trying to write() data (i.e. are blocked in the write()-syscall) now receive a SIGPIPE, which leads to program termination (if no other signal handling is specified).

Your consumer program must not close the file descriptor while producers are still writing. Just read the next datum without closing.
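A rough sketch of such a read loop, under some assumptions: the descriptor is opened once by the caller, every message is exactly one unsigned int, and the helper names drain_messages and demo_drain are invented for this example. The demo uses an anonymous pipe so it can run on its own; with the FIFO you would pass the descriptor returned by open() instead.

```c
#include <stdio.h>
#include <unistd.h>

#define MSG_SIZE_BYTES sizeof(unsigned int)

/* Hypothetical helper (not from the original post): reads fixed-size
 * messages from an already-open descriptor without closing it in between.
 * Returns the number of messages read once every writer has closed its
 * end (read() returns 0), or -1 on a read error or a short read. */
int drain_messages(int fd)
{
    unsigned int msg;
    int count = 0;

    for (;;) {
        ssize_t n = read(fd, &msg, MSG_SIZE_BYTES);

        if (n == 0)
            break;                    /* EOF: no writer has the pipe open */
        if (n != (ssize_t)MSG_SIZE_BYTES)
            return -1;                /* error or partial message */

        printf("received msg from id %u\n", msg);
        count++;
    }

    return count;
}

/* Demo: three messages are queued, the write end is closed,
 * and the reader consumes all of them through one open descriptor. */
int demo_drain(void)
{
    int fds[2];
    int count;
    unsigned int id;

    if (pipe(fds) == -1)
        return -1;

    for (id = 0; id < 3; id++) {
        if (write(fds[1], &id, MSG_SIZE_BYTES) != (ssize_t)MSG_SIZE_BYTES)
            return -1;
    }
    close(fds[1]);

    count = drain_messages(fds[0]);
    close(fds[0]);

    return count;
}
```

Because each message is far smaller than PIPE_BUF, POSIX guarantees that the individual write() calls are atomic, so messages from several producers do not get interleaved inside one another.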

Upvotes: 4
