ACoelho
ACoelho

Reputation: 341

Threads not working concurrently in C pthreads

So I have a quick sort algorithm that I want to run in two different threads, the idea is to have 2 independent halves of the array being sorted at once (this should not be a problem by quick sort nature of partition).

My code is as follows:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

void troca (int *v, int i, int j);
int partition (int *v, int ini, int fim);
void quicksort (int *v, int ini, int fim, int size);

typedef struct {
    int ini, mid, fim;
} thread_arg;

int size;
int *v;

void *thread_func(void* arg){
    thread_arg *a = arg;
    quicksort(v, a->ini, a->mid, a->fim);
    printf("done in\n");
    return NULL;
}

int main()
{
    // initializations
    scanf("%d", &size);
    v = malloc(size * sizeof(int));

    // read array
    for (int i = 0; i < size; ++i)
        scanf("%d", &v[i]);

    // arguments
    thread_arg argument1, argument2;
    int mid = partition(v, 0, size-1);

    argument1.ini = 0;
    argument1.mid = mid-1;
    argument1.fim = size;

    argument2.ini = mid;
    argument2.mid = size-1;
    argument2.fim = size;

    pthread_t thread1, thread2;    

    // thread and execution
    pthread_create(&thread1, NULL, thread_func, &argument1);
    printf("done out 1\n");
    pthread_create(&thread2, NULL, thread_func, &argument2);
    printf("done out 2\n");

    pthread_join(thread1, NULL);
    pthread_join(thread2, NULL);

    free(v);

    return 0;
}

void quicksort (int *v, int ini, int fim, int size){
    if (ini >= fim)
        return;
    int meio = partition(v, ini, fim);
    quicksort(v, ini, meio-1, size);
    quicksort(v, meio+1, fim, size);
}

int partition (int *v, int ini, int fim){
    int i = ini-1, j = ini;
    troca(v, (ini+fim)/2, fim);
    int pivo = v[fim];

    for (; j < fim; ++j)
    {
        if (v[j] <= pivo)
        {
            i++;
            troca(v, i, j);
        }
    }
    troca(v, i+1, fim);
    return i+1; //indice do pivo;
}


void troca (int *v, int i, int j){
    int aux = v[i];
    v[i] = v[j];
    v[j] = aux;
    return;
}

The execution works and sorts perfectly, it does generates 2 new independent threads that sorts halves of the array each. The problem is, it does not do it at the same time. Running the program with a input of 100m random numbers:

done out 1
done out 2
done in
done in

real    0m47,464s
user    0m50,686s
sys     0m0,452s

But it takes about 25 seconds for the first 3 lines to appear, and ~25 for the last one, which indicates that the second thread is waiting for the first one to run.

In the htop console, it appears that in some point both run at the same time (this is backed at the fact that this program runs a bit faster than my normal one)

Finally, I understand that it is not safe to work simultaneously with this sort of data, but on this sorting example it should be fine.

Upvotes: 0

Views: 760

Answers (3)

David Schwartz
David Schwartz

Reputation: 182761

You don't divide the work fairly among the threads. To see this, modify your code as follows:

    // thread and execution
    pthread_create(&thread1, NULL, thread_func, &argument1);
    printf("done out 1 (%d)\n", argument1.mid - argument1.ini + 1);
    pthread_create(&thread2, NULL, thread_func, &argument2);
    printf("done out 2 (%d)\n", argument2.mid - argument2.ini + 1);

You will see that one thread tends to have about twice as much work as the other.

For example, here are a few runs using random data:

done out 1 (66474145)
done out 2 (33525855)

done out 1 (21794872)
done out 2 (78205128)

done out 1 (67867800)
done out 2 (32132200)

You should never divide your work into a small number of very big chunks and assign each chunk to a thread if you care about concurrency. Instead, create a queue of small tasks and let threads pull tasks from the queue as they finish.

Upvotes: 1

Alexander Guyer
Alexander Guyer

Reputation: 2193

The threads are running concurrently (well, not necessarily concurrently, but perceivably concurrently). The 25 second delay you are seeing is due to a bug in your quick sort (or perhaps the way you're sharing the list between your two threads). Essentially, thread 2 is assigned much more work than thread 1, and so it takes much longer to complete. Thread 2 is not simply executing "after" thread 1, or "waiting" for thread 1.

To prove this, I added an unsigned long* argument to quicksort and had it increment the value referenced by said pointer at each call (essentially counting the number of times each thread calls quicksort), and I ran it with 10M (not 100M) random values. Thread 1's count ended up at 3,851,991, and thread 2's count ended up at 9,693,697. Sure, there can be some small variation between the two counts due to the randomness in the generation of the list. But the difference is nearly a factor of 3, which is far more significant than you could possibly expect from slight random variations.

I suggest you try a different implementation of quicksort (one which is known to work). I also suggest being more careful with data sharing (make sure the two threads never access each others' data, especially without synchronization) to get a more accurate measure of timing; the last thing you want is for one thread to be sorting or unsorting the other thread's data.

Upvotes: 1

Rachid K.
Rachid K.

Reputation: 5211

The thread creation is not fair: thread#1 gets created before thread#2. Moreover, when thread#1 is created, it may run and preempt the main thread which may wait for it to give back the CPU to create and start thread#2. However, the thread running under the default SCHED_OTHER policy have an unpredictable behavior.

To add some predictability:

  • Make the threads start at the same time once created. Use a barrier to trigger a "go" for all the threads at the same time. Cf. pthread_barrier_init()
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

void troca (int *v, int i, int j);
int partition (int *v, int ini, int fim);
void quicksort (int *v, int ini, int fim, int size);



pthread_barrier_t barrier;

typedef struct {
    int id;
    int ini, mid, fim;
} thread_arg;

int size;
int *v;

void *thread_func(void* arg){
    thread_arg *a = arg;
    pthread_barrier_wait(&barrier);
    quicksort(v, a->ini, a->mid, a->fim);
    printf("done in %d\n", a->id);
    return NULL;
}

int main()
{
    // initializations
    scanf("%d", &size);
    v = malloc(size * sizeof(int));

    // read array
    for (int i = 0; i < size; ++i)
        scanf("%d", &v[i]);

    // arguments
    thread_arg argument1, argument2;
    int mid = partition(v, 0, size-1);

    argument1.id = 1;
    argument1.ini = 0;
    argument1.mid = mid-1;
    argument1.fim = size;

    argument2.id = 2;
    argument2.ini = mid;
    argument2.mid = size-1;
    argument2.fim = size;

    pthread_t thread1, thread2;    

    pthread_barrier_init(&barrier, NULL, 3);

    // thread and execution
    pthread_create(&thread1, NULL, thread_func, &argument1);
    printf("done out 1\n");
    pthread_create(&thread2, NULL, thread_func, &argument2);
    printf("done out 2\n");

    // Start the threads
    pthread_barrier_wait(&barrier);

    pthread_join(thread1, NULL);
    pthread_join(thread2, NULL);

    free(v);
    pthread_barrier_destroy(&barrier);

    return 0;
}

void quicksort (int *v, int ini, int fim, int size){
    if (ini >= fim)
        return;
    int meio = partition(v, ini, fim);
    quicksort(v, ini, meio-1, size);
    quicksort(v, meio+1, fim, size);
}

int partition (int *v, int ini, int fim){
    int i = ini-1, j = ini;
    troca(v, (ini+fim)/2, fim);
    int pivo = v[fim];

    for (; j < fim; ++j)
    {
        if (v[j] <= pivo)
        {
            i++;
            troca(v, i, j);
        }
    }
    troca(v, i+1, fim);
    return i+1; //indice do pivo;
}


void troca (int *v, int i, int j){
    int aux = v[i];
    v[i] = v[j];
    v[j] = aux;
    return;
}

Upvotes: 0

Related Questions