Diogo Parrinha

Reputation: 59

C read socket changes occasionally

I have a web server that forks into sub-processes, each of which handles multiple clients using multiple threads.

There is one problem, though: sometimes accept() fails with Bad File Descriptor. When I use ab to test the web server, I occasionally get that error. For example, the first 500 requests are processed fine, but then I get Bad File Descriptor, and the socket fd suddenly looks like a very large integer (no longer a valid descriptor).

I only close it once - at the end of the program so it doesn't make sense that it's suddenly getting closed - gdb breakpoints guarantee me that this is exiting properly.

I also have another issue: a thread created by a child process exits "randomly". I know it's not really random, but I can't find out why it exits, as nothing gets output to the screen. I call it "random" because helgrind tells me the exiting thread still holds two locks (I have a mutex within a mutex, and I need it; I have also tried using only one mutex and the problem remains). Is there anything that could cause a thread to exit like this?

I can provide the code, but it's a bit extensive and its comments and variable/function names are in Portuguese.

UPDATE

Here's a snippet of my code:

void *gere_processo(void*pst)
{
    process_struct pstruct = *(process_struct*)pst;

    int fork_id;
    int fd[2];

    if(pipe(fd)==-1)
    {
        perror("Erro a criar pipe: ");
        exit(-1);
    }

    // Starts at 0 before forking!
    nthreads = 0;

    mainpid = getpid();

    printf("\nEntering thread %d\n\n", (unsigned int)pthread_self());

    // Take local copies here so we don't need the mutexes further down when these values are used
    pthread_mutex_lock(&processofilhomux);
    processofilho++;
    int localprocessofilho = processofilho;
    pthread_mutex_unlock(&processofilhomux);

    pthread_mutex_lock(&processopaimux);
    processopai++;
    int localprocessopai = processopai; 
    pthread_mutex_unlock(&processopaimux);

    fork_id=fork();
    if (fork_id==-1)
    {
        perror("Erro a criar filho:");
        exit(-1);
    }
    else if(fork_id==0)
    {
        if(signal(SIGUSR1, func_forcereadconfig) == SIG_ERR)
            printf("Erro no sinal SIGUSR1");
        if(signal(SIGINT, func_forcedsigint) == SIG_ERR)
            printf("Erro no sinal SIGINT");

        /***********************************************************************
        * Child code
        * Waits for clients and creates a thread for each one.
        * Whenever a client is created, sends a message to the parent with the number of active clients.
        * Whenever a client finishes, sends a message to the parent with the number of active clients.
        *************************************************************************/

        int i, active_n=0;
        char address[BUFFSIZE];
        int msgsock;

        char nbuffer[50];

        pthread_attr_t attr;
        pthread_attr_init(&attr);
        pthread_attr_setdetachstate(&attr,PTHREAD_CREATE_JOINABLE);

        thread_s threadtable[MAXCLIENTSPERFORK];
        int a;
        for(a=0;a<MAXCLIENTSPERFORK;a++)
        {
            threadtable[a].thread = 0;
            threadtable[a].active = false;
            threadtable[a].t_buffer = NULL;
        }

        // Start time
        // If no requests arrive within X seconds, the fork exits (unless it is the first child)
        struct timeval begin, now;
        gettimeofday(&begin, NULL);

        int timeout = 3;
        double timediff;

        while(cicle)
        {
            gettimeofday(&now , NULL);

            // Time elapsed since the timer was started
            timediff = (now.tv_sec - begin.tv_sec) + 1e-6 * (now.tv_usec - begin.tv_usec);

            //printf("\nServer (2) Socket: %d\n", pstruct.recsock);

            pthread_mutex_lock(&nthreadsmux);

            if(localprocessofilho == 2)
            {
                printf("\n[%d] - Tempo que passou: %lf - %d\n", localprocessofilho, timediff, nthreads);
            }

            // Conditions for accepting a client:
            // - nthreads == 0 and timeout > timediff (keep trying until the timeout expires)
            // - nthreads < MAXCLIENTSPERFORK but greater than zero
            // - nthreads < MAXCLIENTSPERFORK and this is the first child process (always runs)
            if((nthreads == 0 && timeout > timediff) || nthreads < MAXCLIENTSPERFORK && nthreads > 0 || (nthreads < MAXCLIENTSPERFORK && localprocessofilho == 1))
            {
                pthread_mutex_unlock(&nthreadsmux);

                // Wait for a client
                printf("\tProcesso %d espera de clientes\n", localprocessofilho);
                if((msgsock = espera_pedido(pstruct.recsock, address)) < 0)
                    continue;

                pthread_mutex_lock(&athreadmux);

                // Figure out our thread number
                int a;
                for(a=0;a<MAXCLIENTSPERFORK;a++)
                {
                    if(threadtable[a].active == false)
                    {
                        if(threadtable[a].thread != 0)
                            pthread_join(threadtable[a].thread, NULL); // Make sure it has finished before reusing the slot
                        break;
                    }
                }

                threadtable[a].t_buffer = malloc(sizeof(thread_buffer));
                if(threadtable[a].t_buffer == NULL)
                {
                    perror("\nErro (0) ao alocar estrutura config:");
                    exit(-1);
                }

                threadtable[a].t_buffer->msg = malloc(sizeof(char)*BUFFSIZE);
                if(threadtable[a].t_buffer->msg == NULL)
                {
                    perror("\nErro (0) ao alocar estrutura config:");
                    exit(-1);
                }

                sprintf(threadtable[a].t_buffer->msg, "%s", address);

                threadtable[a].t_buffer->sock = msgsock;
                threadtable[a].t_buffer->conf = conf;
                threadtable[a].active = true;

                // localtime is not thread-safe, so it has to run outside the threads created for each request
                time_t timer = time(NULL);
                threadtable[a].t_buffer->t = *localtime(&timer);
                threadtable[a].t_buffer->timer = timer;

                // Create the new thread
                if (pthread_create(&threadtable[a].thread, &attr, thread_func_pedido, (void*)&threadtable[a]) != 0)
                {
                    printf("\n\nERROR: %d", errno);
                    perror("Erro a criar thread: ");
                    close(msgsock);
                    free(threadtable[a].t_buffer);

                    printf("\n\n\n");
                    sleep(10);

                    continue;
                }

                pthread_mutex_unlock(&athreadmux);


                pthread_mutex_lock(&nthreadsmux);
                memset(nbuffer, '\0', sizeof(nbuffer));
                sprintf(nbuffer, "%d", nthreads);
                if(nthreads >= MAXCLIENTSPERFORK)
                {
                    printf("\nReached MAXCLIENTSPERFORK in Child %d\n", localprocessofilho);
                }
                pthread_mutex_unlock(&nthreadsmux);

                //printf("\nProcesso %d enviar para o pai: %s\n", localprocessofilho, nbuffer);
                write(fd[WRITE], nbuffer, (strlen(nbuffer)+1));

                // Reset the timer
                gettimeofday(&begin, NULL);
            }
            else if(nthreads == 0 && localprocessofilho > 1)
            {
                pthread_mutex_unlock(&nthreadsmux);
                printf("\n00Reached 0 in Child %d\n", localprocessofilho);
                break;
            }
            else
            {
                pthread_mutex_unlock(&nthreadsmux);
            }
        }

        for(a=0;a<MAXCLIENTSPERFORK;a++)
        {
            if(threadtable[a].active != false)
            {
                pthread_join(threadtable[a].thread, NULL);
            }
        }

        printf("\nSAI FILHO\n\n");
        sleep(10);
        _exit(0);
    }
    else
    {
        /***********************************************************************
        * Parent code
        * The parent only reads what the child sends through the pipe.
        * The child constantly sends the number of active threads.
        * When n reaches zero, the child process kills itself and the parent waits for it to die.
        * When n reaches MAXCLIENTSPERFORK, the process sets wantnewfork to TRUE.
        *************************************************************************/
        close(fd[WRITE]);

        char buf[10];
        pid_t result;
        int status;
        // Check the child's state (== 0 means it is still running)
        // If this is the parent process, we are not supposed to exit, EVER (unless there is a signal)
        while((result = waitpid(fork_id, &status, WNOHANG)) == 0 || localprocessopai == 1)
        {
            int retn = read(fd[READ], buf, 10);
            if(retn > 0)
            {
                int n = atoi(buf);
                //printf("\n--- PAI: %d |\n", n);

                if(n == 0 && localprocessopai > 1)
                {
                    int status = 0;
                    wait(&status);
                    printf("\n%d A sair...\n", localprocessopai);
                    break;
                }
                else if(n == MAXCLIENTSPERFORK)
                {
                    printf("\n\n\nQUEREMOS UM NOVO PROCESSO - %d!!!\n\n\n", n);
                    pthread_mutex_lock(&processomux);
                    wantnewfork = true;
                    pthread_mutex_unlock(&processomux);
                }
            }
        }

        free(pst);

        // Time to leave
        int ret;
        printf("\n\n\n%d SAIR DA THREAD - %d - %d - %d\n\n", (unsigned int)pthread_self(), localprocessopai, result, WIFEXITED(status));
        pthread_exit(&ret);
    }
}


void *thread_func_pedido(void * threadstruct)
{
    thread_s *tstmp = (thread_s*)threadstruct;
    thread_buffer t = *(*tstmp).t_buffer;

    char buffer[BUFSIZE];

    char ver;

    pthread_mutex_lock(&nthreadsmux);
    nthreads++;
    pthread_mutex_unlock(&nthreadsmux);

    /* Read the request */
    int totalread;
    if((totalread = recv(t.sock, buffer, BUFSIZE,0)) <= 0)
    {
        printf("\nTotal read: %d\n", totalread);
        perror("Erro lendo a request: ");

        wait(10);

        close(t.sock);

        pthread_mutex_lock(&nthreadsmux);
        if(nthreads > 0)
            nthreads--;
        pthread_mutex_unlock(&nthreadsmux);

        int ret;
        pthread_exit(&ret);
    }

    req_data pedido;
    pedido = init_data();

    /* Process the request */
    if(buffer == NULL)
    {
        pedido.errorcode=400;
    }
    else
    {
        pedido.request = convert_get(buffer,&pedido.errorcode,&ver);

        if(pedido.errorcode==200)
        {
            if(special_request(t.sock, pedido.request,ver) == false)
            {
                pedido.errorcode = imprime_ficheiro(t.sock, pedido.request, ver, t.conf->httpdocs, t.conf->cgibin);
            }
        }
    }

    if (pedido.errorcode != 200)
    {
        imprime_erro(t.sock, pedido.errorcode, ver, pedido.request);
    }

    /*--- Store the request ---*/
    pedido = gen_data(pedido.errorcode, (char*)t.msg, pedido.request, tstmp->t_buffer->t, tstmp->t_buffer->timer);

    pthread_mutex_lock( &mux );
    FILE* statfile = (FILE*) stat_init();
    stat_armazena_req(pedido,statfile);
    fclose(statfile);
    pthread_mutex_unlock( &mux );

    pedido=free_data(pedido);

    //printf("\nClosing socket %d\n", t.sock);
    close(t.sock);

    pthread_mutex_lock(&nthreadsmux);
    pthread_mutex_lock(&processofilhomux);
    if(nthreads > 0)
        nthreads--;
    //printf("\n [%d] nthreads reduced to %d\n",processofilho, nthreads);
    pthread_mutex_unlock(&processofilhomux);
    pthread_mutex_unlock(&nthreadsmux);

    // Free the thread buffer, we're going to realloc it later
    free(tstmp->t_buffer->msg);
    free(tstmp->t_buffer);
    tstmp->t_buffer = NULL;

    pthread_mutex_lock(&athreadmux);
    tstmp->active = false;
    pthread_mutex_unlock(&athreadmux);

    int ret;
    pthread_exit(&ret);
}


int main(int argc, char * argv[]){


    // Signal handlers
    if(signal(SIGTERM, func_sigint) == SIG_ERR)
        printf("Erro no sinal SIGTERM");

    if(signal(SIGINT, func_sigint) == SIG_ERR)
        printf("Erro no sinal SIGINT");

    if(signal(SIGUSR1, func_readconfig) == SIG_ERR)
        printf("Erro no sinal SIGUSR1");


    // Socket
    sock = cria_socket(PORT);

    conf = read_config("./www.config");
    if(conf == NULL)
    {
        conf = (config*)malloc(sizeof(config));
        if(conf == NULL)
        {
            perror("\nErro (2) ao alocar estrutura config:");
            exit(-1);
        }

        // Set defaults
        sprintf(conf->httpdocs, DOCUMENT_ROOT);
        sprintf(conf->cgibin, CGI_ROOT);
    }

    /*
        Initially one process is created, which will handle X threads/clients.
        After each client has been served, its thread is destroyed.
        Then, if the thread list is empty, the process kills itself and informs the parent.
        A new process is only created when there are no free processes (i.e. when there are many concurrent clients).
        The parent process creates one thread for each child process it creates,
        so that they can interact with it through a pipe.
    */
    while (cicle)
    {
        // Apparently the last process is full, so we need to create a new one
        pthread_mutex_lock(&processomux);
        if(wantnewfork)
        {
            pthread_mutex_unlock(&processomux);

            pthread_mutex_lock(&processopaimux);
            printf("\n-----------PROCESSO NUM: %d\n", processopai+1);
            pthread_mutex_unlock(&processopaimux);

            pthread_t forkthread;
            pthread_attr_t attr;
            pthread_attr_init(&attr);
            pthread_attr_setdetachstate(&attr,PTHREAD_CREATE_DETACHED);


            process_struct *pstruct = malloc(sizeof(process_struct));
            if(pstruct == NULL)
            {
                perror("\nErro (0) ao alocar estrutura config:");
                exit(-1);
            }

            pstruct->recsock = sock;
            pstruct->conf = conf;

            //printf("\nServer (1) Socket: %d\n", pstruct->recsock);

            if (pthread_create(&forkthread, &attr, gere_processo, (void *)pstruct) != 0)
            {
                perror("Erro a criar thread gere_processo: ");
                exit(-1);
            }

            pthread_mutex_lock(&processomux);
            wantnewfork = false;
            pthread_mutex_unlock(&processomux);
        }
        else
            pthread_mutex_unlock(&processomux);

        usleep(10000);
    }

    close(sock);
    free(conf);

    printf("Sai Pai\n");
    return 0;
}

Update

I have a global variable named sock which holds the fd. The pstruct also has a member called recsock which has the same value as sock.

Apparently recsock is what gets changed (valgrind doesn't complain) but sock remains unchanged.

free(pst) might be the cause of it - though I'm unsure why it executes (it shouldn't for the main process - but that would explain the random exits and that would cause the bad file descriptor issue for random exits on threads which are not the 2nd thread).

I don't understand why free(pst) would cause this though...the child should have its own pst structure as well. But if I comment free(pst) I no longer have bad file descriptor issues. Still, free(pst) should be there, it's the "random thread exiting" that's causing it to run when it shouldn't.

Upvotes: 0

Views: 123

Answers (1)

Zan Lynx

Reputation: 54325

If the integer that represents the socket's file descriptor changes suddenly, and your program didn't intentionally change it, then you have one of two bugs:

  • You're using threads and you aren't locking your threads' access to shared memory. This might lead to things like using a pointer before it is set to a value. If a write through such a pointer lands on the memory that holds the fd, it will overwrite the fd.

  • You have a buffer overflow. Somewhere, your program is writing bad data into the memory that holds the file descriptor.
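One way the buffer-overflow case can arise in a server built around a fixed-size table of workers: a search for a free entry that finds nothing leaves the index one past the end, and the writes that follow land on whatever is stored next to the table, which may be the variable holding the fd. The following is a minimal sketch of a bounds-checked lookup, not the asker's code; `slot_t`, `MAX_SLOTS`, `find_free_slot` and `claim_slot` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SLOTS 4          /* hypothetical table size */

typedef struct {
    bool active;             /* true while a worker owns this slot */
    int  sock;               /* client socket handled by that worker */
} slot_t;

/* Return the index of the first inactive slot, or -1 if every slot is
 * in use. The caller must check for -1 before indexing the table. */
int find_free_slot(const slot_t *table, size_t n)
{
    assert(table != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!table[i].active)
            return (int)i;
    }
    return -1;
}

/* Claim a slot for sock. Returns the slot index, or -1 when the table
 * is full, in which case the table is left untouched. */
int claim_slot(slot_t *table, size_t n, int sock)
{
    int i = find_free_slot(table, n);
    if (i < 0)
        return -1;           /* full: never write past the end */
    table[i].active = true;
    table[i].sock = sock;
    return i;
}
```

A caller that gets -1 back can close the freshly accepted socket or wait for a worker to finish, rather than writing into memory it does not own. In a multithreaded server the table would also need to be accessed under a mutex.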

To solve the problem I recommend using a debugger and a hardware watchpoint (in gdb, the watch command) to tell you when the value of the file descriptor changes.
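If running under a debugger is impractical, for example because the corruption only shows up in a forked child under load, a rough software substitute is to surround the descriptor with known sentinel values and check them at suspicious points. This is only a sketch under assumptions: `guarded_fd`, `guarded_fd_init` and `guarded_fd_ok` are hypothetical names, and the check detects damage after the fact rather than at the offending write, which is what a hardware watchpoint would report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GUARD_MAGIC 0xC0FFEE42u   /* arbitrary sentinel value */

typedef struct {
    uint32_t before;   /* sentinel stored just before the fd */
    int      fd;       /* the descriptor being protected */
    int      shadow;   /* second copy of the fd, for comparison */
    uint32_t after;    /* sentinel stored just after the copies */
} guarded_fd;

/* Store fd together with its sentinels and shadow copy. */
void guarded_fd_init(guarded_fd *g, int fd)
{
    assert(g != NULL);
    g->before = GUARD_MAGIC;
    g->fd     = fd;
    g->shadow = fd;
    g->after  = GUARD_MAGIC;
}

/* Return true if neither the sentinels nor the fd appear to have been
 * overwritten since guarded_fd_init() was called. */
bool guarded_fd_ok(const guarded_fd *g)
{
    assert(g != NULL);
    return g->before == GUARD_MAGIC
        && g->after  == GUARD_MAGIC
        && g->fd     == g->shadow;
}
```

Calling `guarded_fd_ok` before each accept() and after each block of suspect code narrows down which section performed the stray write; once the region is known, a watchpoint on that one variable becomes much easier to set up.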

Upvotes: 1
