Queue of Threads in pthread C - web server response pipelining

Question

I have a working HTTP Apache-like web server implemented in C, and my problem is that I don't know how to initialize the queue (and therefore how to enqueue threads into it), mostly because I'm not sure how to check if there is a previous thread to join before proceeding with the current one.

The server can exploit pipelined requests to increase its response speed by using threads in a more sophisticated way: the web server can spawn a new thread for each request for a new resource and prepare the responses simultaneously. However, since the resources must be returned to the client in the same order in which the server received the requests (FIFO), the response threads need a coordination phase.

This coordination phase is achieved by implementing a sort of "waiting room for the doctor" in which each patient, when entering, asks who was the last to arrive, keeps track of it and enters the doctor's office only when the person in front of him leaves. In this way, everyone has a partial view of the queue (cares for only one person) but this partial view allows a correct implementation of a FIFO queue.

Here is the description of what I have to do:

Likewise, each new thread will have to store the identifier of the thread that handles the previous request and wait for its termination by calling pthread_join(). The first thread, obviously, will not have to wait for anyone, and the last thread must be joined by the main thread that handles the requests on that connection, before it closes the connection and returns to waiting for new connection requests.
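For illustration, the chained-join scheme described above can be tried in isolation. The following is only a minimal sketch under stated assumptions, not the assignment's solution: every name in it (demo_run, demo_response, demo_arg, DEMO_THREADS and the demo_* arrays) is hypothetical and does not come from incApache. Each thread joins only its predecessor, and the caller joins only the last thread, so every thread is joined exactly once.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define DEMO_THREADS 4          /* hypothetical size, for illustration only */

struct demo_arg {
    pthread_t *prev;            /* thread to wait for, or NULL for the first one */
    int id;                     /* arrival position of this "request" */
};

static pthread_t demo_ids[DEMO_THREADS];
static struct demo_arg demo_args[DEMO_THREADS];
static int demo_order[DEMO_THREADS];   /* order in which "responses" were sent */
static int demo_sent;

static void *demo_response(void *vp)
{
    struct demo_arg *arg = vp;
    /* Preparing the response could happen here, in parallel with the others. */
    if (arg->prev != NULL)
        pthread_join(*arg->prev, NULL);   /* wait for the patient in front */
    /* Only one thread at a time gets past the join, so no extra lock is needed. */
    demo_order[demo_sent++] = arg->id;
    return NULL;
}

/* Starts DEMO_THREADS chained threads and returns how many of them "sent". */
int demo_run(void)
{
    demo_sent = 0;
    for (int i = 0; i < DEMO_THREADS; i++) {
        demo_args[i].prev = (i == 0) ? NULL : &demo_ids[i - 1];
        demo_args[i].id = i;
        int rc = pthread_create(&demo_ids[i], NULL, demo_response, &demo_args[i]);
        assert(rc == 0);
        (void)rc;
    }
    /* The caller plays the role of the connection thread: it joins only the last. */
    pthread_join(demo_ids[DEMO_THREADS - 1], NULL);
    return demo_sent;
}
```

After demo_run() returns, demo_order holds the ids in arrival order, because no thread can record its id before the one in front of it has terminated.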

I am having trouble properly initializing the to_join data structure, mostly because I don't understand how to compute the index i of the thread to join. How can I differentiate the first and last threads in an array of pointers?

Here is the code (I could only modify in between the TO BE DONE START and TO BE DONE END comments):

#include "incApache.h"

pthread_mutex_t accept_mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t mime_mutex = PTHREAD_MUTEX_INITIALIZER;

int client_sockets[MAX_CONNECTIONS]; /* for each connection, its socket FD */
int no_response_threads[MAX_CONNECTIONS]; /* for each connection, how many response threads */

pthread_t thread_ids[MAX_THREADS];
int connection_no[MAX_THREADS]; /* connection_no[i] >= 0 means that i-th thread belongs to connection connection_no[i] */
pthread_t *to_join[MAX_THREADS]; /* for each thread, the pointer to the previous (response) thread, if any */

int no_free_threads = MAX_THREADS - 2 * MAX_CONNECTIONS; /* each connection has one thread listening and one reserved for replies */
struct response_params thread_params[MAX_THREADS - MAX_CONNECTIONS]; /* params for the response threads (the first MAX_CONNECTIONS threads are waiting/parsing requests) */

pthread_mutex_t threads_mutex = PTHREAD_MUTEX_INITIALIZER; /* protects the access to thread-related data structures */

pthread_t thread_ids[MAX_CONNECTIONS];
int connection_no[MAX_CONNECTIONS];

void *client_connection_thread(void *vp) {
    int client_fd;
    struct sockaddr_storage client_addr;
    socklen_t addr_size;
    pthread_mutex_lock(&threads_mutex);
    int connection_no = *((int *) vp);

    /*** properly initialize the thread queue to_join ***/
/*** TO BE DONE 3.1 START ***/
        //to_join[0] = thread_ids[new_thread_idx];
    //pthread_t *first;     Am I perhaps supposed to initialize the to_join data structure as a queue with two pointers
    //pthread_t *last;      indicating the first and last element? How can I do it on an array of pointers?
/*** TO BE DONE 3.1 END ***/

    pthread_mutex_unlock(&threads_mutex);
#endif
    for (;;) {
        addr_size = sizeof(client_addr);
        pthread_mutex_lock(&accept_mutex);
        if ((client_fd = accept(listen_fd, (struct sockaddr *) &client_addr, &addr_size)) == -1)
            fail_errno("Cannot accept client connection");
        pthread_mutex_unlock(&accept_mutex);
        client_sockets[connection_no] = client_fd;
        char str[INET_ADDRSTRLEN];
        struct sockaddr_in *ipv4 = (struct sockaddr_in *) &client_addr;
        printf("Accepted connection from %s\n", inet_ntop(AF_INET, &(ipv4->sin_addr), str, INET_ADDRSTRLEN));
        manage_http_requests(client_fd, connection_no);
    }
}

#pragma clang diagnostic pop
void send_resp_thread(int out_socket, int response_code, int cookie,
              int is_http1_0, int connection_idx, int new_thread_idx,
              char *filename, struct stat *stat_p)
{
    struct response_params *params =  thread_params + (new_thread_idx - MAX_CONNECTIONS);
    debug(" ... send_resp_thread(): idx=%lu\n", (unsigned long)(params - thread_params));
    params->code = response_code;
    params->cookie = cookie;
    params->is_http1_0 = is_http1_0;
    params->filename = filename ? my_strdup(filename) : NULL;
    params->p_stat = stat_p;
    pthread_mutex_lock(&threads_mutex);
    connection_no[new_thread_idx] = connection_idx;
    debug(" ... send_resp_thread(): parameters set, conn_no=%d\n", connection_idx);

    /*** enqueue the current thread in the "to_join" data structure ***/
/*** TO BE DONE 3.1 START ***/
    //Again, should I use a standard enqueue implementation? But then how would I keep track of the last node to arrive?
/*** TO BE DONE 3.1 END ***/

    if (pthread_create(thread_ids + new_thread_idx, NULL, response_thread, connection_no + new_thread_idx))
        fail_errno("Could not create response thread");
    pthread_mutex_unlock(&threads_mutex);
    debug(" ... send_resp_thread(): new thread created\n");
}

void *response_thread(void *vp)
{
    size_t thread_no = ((int *) vp) - connection_no;
    int connection_idx = *((int *) vp);
    debug(" ... response_thread() thread_no=%lu, conn_no=%d\n", (unsigned long) thread_no, connection_idx);
    const size_t i = thread_no - MAX_CONNECTIONS;
    send_response(client_sockets[connection_idx],
              thread_params[i].code,
              thread_params[i].cookie,
              thread_params[i].is_http1_0,
              (int)thread_no,
              thread_params[i].filename,
              thread_params[i].p_stat);
    debug(" ... response_thread() freeing filename and stat\n");
    free(thread_params[i].filename);
    free(thread_params[i].p_stat);
    return NULL;
}

John Bollinger · Accepted Answer

I am having trouble properly initializing the to_join data structure, mostly because I don't understand how to compute the index i of the thread to join. How can I differentiate the first and last threads in an array of pointers?

Assignment is different from initialization, and operating on one element is different from operating on the whole array. As far as I can determine, you're not actually to initialize to_join in that function (so the comment is misleading). Instead, you're only to assign an appropriate value to a single element.

That analysis follows from my interpretation of the names, scope, and documentation comments of the various global variables and from the name, signature, and initial lines of the function in question:

it appears that the various arrays hold data pertaining to multiple threads of multiple connections, as the role of one of the file-scope connection_no arrays is to associate threads with connections.
it appears that the function is meant to be the thread-start function for connection-associated threads.
no thread started at a time when any other connection-associated threads are running should do anything other than set data pertaining to itself, lest it clobber data on which other threads and connections rely.

Now, as for the actual question -- how do you determine which thread the new one should join? You can't. At least, not relying only on the template code presented in the question, unmodified.^*

Hypothetically, if you could access the version of the connection_no array that associates threads with connections then you could use it to find the indexes of all threads associated with the current connection. You could then get their thread IDs from the corresponding thread_ids array (noting that there is another name collision here), and their join targets from the to_join array. The first thread for the connection is the one that does not join to another, and the last is the one that is not joined by any other. That analysis is not altogether straightforward, but there are no real tricks to it. Details are left as the exercise they are meant to be.
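To make the shape of that search concrete, here is a rough sketch of finding the last thread of a connection. It is only an illustration under assumptions: the three arrays mirror the MAX_THREADS-sized globals from the question, MAX_THREADS is given an arbitrary value, find_last_thread is a hypothetical name, and slots of threads that have already been joined are assumed to be reset to a negative connection_no (the question's comment says only that a value >= 0 means the slot is in use).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 8                 /* assumed value; the real one is in incApache.h */

pthread_t thread_ids[MAX_THREADS];
int connection_no[MAX_THREADS];       /* < 0 marks an unused slot (assumption) */
pthread_t *to_join[MAX_THREADS];      /* predecessor of each thread, or NULL */

/* Returns the index of the thread of connection `con` that no other thread
 * of that connection joins, or -1 if the connection has no threads at all.
 * The caller is assumed to hold the mutex protecting these arrays. */
int find_last_thread(int con)
{
    assert(con >= 0);
    for (int i = 0; i < MAX_THREADS; i++) {
        if (connection_no[i] != con)
            continue;                 /* slot i belongs to another connection */
        int joined_by_other = 0;
        for (int j = 0; j < MAX_THREADS; j++) {
            if (j != i && connection_no[j] == con && to_join[j] == &thread_ids[i]) {
                joined_by_other = 1;  /* thread j is waiting for thread i */
                break;
            }
        }
        if (!joined_by_other)
            return i;                 /* nobody waits for i: it is the tail */
    }
    return -1;
}
```

A new thread for the connection would then store &thread_ids[result] as its own to_join entry, or NULL when the result is -1. The quadratic scan keeps the sketch short; remembering the tail index per connection would avoid it.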

But even if the file-scope name collisions were resolved, you could not perform the above analysis because the file-scope connection_no array is shadowed by a local variable of the same name inside the whole area where you are permitted to insert code.^*

Note also that you appear to need to choose a thread index for the new thread, which in general will not be 0. It looks like you need to scan the thread_ids or connection_no array to find an available index.
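Such a scan could look roughly like the following. This is a sketch, not the template's code: find_free_thread_index and its `first` parameter are hypothetical, MAX_THREADS is given an arbitrary value, and a negative connection_no entry is assumed to mark a free slot, which is my reading of the comment on the array in the question.

```c
#include <assert.h>

#define MAX_THREADS 8             /* assumed value; the real one is in incApache.h */

int connection_no[MAX_THREADS];   /* < 0 is assumed to mean "slot unused" */

/* Returns the first free thread index at or after `first`, or -1 if every
 * slot from there on is taken. The caller is assumed to hold the mutex
 * protecting connection_no. */
int find_free_thread_index(int first)
{
    assert(first >= 0 && first <= MAX_THREADS);
    for (int i = first; i < MAX_THREADS; i++) {
        if (connection_no[i] < 0)
            return i;
    }
    return -1;                    /* no slot available; the caller must handle it */
}
```

The `first` parameter is there only so that a caller can skip a reserved prefix of the array, such as the slots the question says are kept for the threads that wait for and parse requests.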

^*Unless you cheat. I take the intent to be for you to insert code (only) into the body of the client_connection_thread function, but you could, in fact, split that function into two or more by inserting code into the designated area. If the second file-scope declarations of connection_no and thread_ids were assumed to be ignored or missing in practice, then splitting up the function could provide a workaround for the shadowing issue. For example:

    /*** properly initialize the thread queue to_join ***/
/*** TO BE DONE 3.1 START ***/

    return client_connection_thread_helper1(connection_no);
}  // end of function

// parameter 'con' is the number of this thread's connection
void *client_connection_thread_helper1(int con) {
    int my_index;
    // ... Find an available thread index (TODO: what if there isn't one?) ...
    thread_ids[my_index] = pthread_self();
    connection_no[my_index] = con;  // connection_no is not shadowed in this scope

    pthread_t *last = NULL;
    // ... Find the last (other) thread associated with connection 'con', if any ...
    // You can determine the first, too, but that does not appear to be required.

    to_join[my_index] = last;

    return client_connection_thread_helper2(con);
}

// A second additional function is required for the remaining bits of
// client_connection_thread(), because they need the local connection_no
void *client_connection_thread_helper2(int connection_no) {
    int client_fd;
    struct sockaddr_storage client_addr;
    socklen_t addr_size;

/*** TO BE DONE 3.1 END ***/


    pthread_mutex_unlock(&threads_mutex);

I suppose it is possible that figuring out the need and implementation for such function-splitting was intended to be part of the exercise, but that would be a dirty trick, and overall it seems more likely that the exercise is just poorly formed.
