João

Reputation: 13

Thread issue using global variable

I'm studying threads as used in Linux and operating systems, and I was doing a little exercise. The objective is to add a value to a global variable and look at the result at the end. When I looked at the final result, my mind was blown. The code is the following:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <pthread.h>

int i = 5;

void *sum(int *info);

void *sum(int *info)
{
    //int *calc = info (what happened?)
    int calc = info;

    i = i + calc;

    return NULL;
}

int main()
{
    int rc = 0,status;
    int x = 5;

    pthread_t thread;

    pthread_t tid;
    pthread_attr_t attr;

    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);

    rc = pthread_create(&thread, &attr, &sum, (void*)x);
    if (rc) 
    {              
        printf("ERROR; return code from pthread_create() is %d\n", rc);
        exit(-1);
    }

    rc = pthread_join(thread, (void **) &status);
    if (rc) 
    {
        printf("ERROR; return code from pthread_join() is %d\n", rc);
        exit(-1);
    }

    printf("FINAL:\nValue of i = %d\n",i);
    pthread_attr_destroy(&attr);
    pthread_exit(NULL);

    return 0;
}

If I declare the variable calc in the sum function as int *calc, then the final value of i is 25 (not the expected value). But if I declare it as int calc, then the final value of i is 10 (the value I expected in this exercise). I don't understand how the value of i can be 25 when I declare calc as int *calc.

Upvotes: 0

Views: 1737

Answers (2)

hc6

Reputation: 183

The issue has nothing to do with threading or global variables; it's about C's pointer arithmetic.

You can get the exact same result using the following code:

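#include <stdio.h>
#include <stdint.h>

int main(void)
{
    int i = 5;
    int *j = (int *)5;            // explicit cast; newer compilers reject the implicit conversion
    i = (int)(intptr_t)(i + j);   // pointer arithmetic: 5 + 5 * sizeof(int)
    printf("%d\n", i);            // this is 25 when sizeof(int) is 4
}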

What happens here is that you assign the value 5 to the pointer j, and then add 5 to that pointer. Adding 5 to a pointer advances it by enough memory to hold 5 objects of the type it points to. In this case, sizeof(int) is 4, so you are really adding 4*5, which is 20. Hence the result is 5 + 4*5 = 25.

Another caveat: since sizeof(int) is machine dependent, your results may vary.

Let me give you another example to make this clearer:

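#include <stdio.h>
#include <stdint.h>

int main(void)
{
    int i = 5;
    uint64_t *j = (uint64_t *)5;  // pointer to an 8-byte type this time
    i = (int)(intptr_t)(i + j);   // the pointer advances by 5 * sizeof(uint64_t) = 40
    printf("%d\n", i);            // result is 45
}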

Because sizeof(uint64_t) is 8, this is equivalent to adding 5*8 to the original value of 5, therefore the result is 5 + 5*8 = 45.
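The two examples above add to a made-up address, which is not something the C standard defines. Here is a minimal sketch of the same scaling rule on real arrays, where the arithmetic is well defined. The helper names int_step_bytes and u64_step_bytes are hypothetical, chosen only for this illustration, and n is assumed to be at most 16.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: how many bytes an int pointer moves when n is added.
   Assumes n <= 16 so the result stays within (or one past) the array. */
size_t int_step_bytes(size_t n)
{
    int arr[16] = {0};
    int *p = arr;
    int *q = p + n;                          /* scaled by sizeof(int) */
    return (size_t)((char *)q - (char *)p);  /* distance measured in bytes */
}

/* Same idea for a pointer to uint64_t. */
size_t u64_step_bytes(size_t n)
{
    uint64_t arr[16] = {0};
    uint64_t *p = arr;
    uint64_t *q = p + n;                     /* scaled by sizeof(uint64_t) */
    return (size_t)((char *)q - (char *)p);
}
```

Calling int_step_bytes(5) should give 5 * sizeof(int), which is 20 on a machine with 4-byte ints, and u64_step_bytes(5) should give 5 * sizeof(uint64_t).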

This code demonstrates many problems with type casting. "x" is first declared as "int", cast to a generic pointer "void*", implicitly converted to "int*", and then converted back to "int". Casts like these make it very easy to shoot yourself in the foot, as you have already seen here.
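One way to avoid that chain of conversions is to pass the address of x and dereference it inside the thread function. The following is only a sketch of that idea, not the only fix. The thread function takes void * because that is the signature pthread_create expects, and add_in_thread is a hypothetical helper introduced for this illustration.

```c
#include <pthread.h>

int i = 5;  /* same global as in the question */

/* Thread entry point with the signature pthread_create expects. */
void *sum(void *info)
{
    int calc = *(int *)info;  /* dereference: read the int the pointer refers to */
    i = i + calc;
    return NULL;
}

/* Hypothetical helper: runs sum() in one thread and returns the new
   value of i, or -1 if creating or joining the thread fails. */
int add_in_thread(int x)
{
    pthread_t thread;

    /* x stays alive until pthread_join returns, so passing &x is safe here. */
    if (pthread_create(&thread, NULL, sum, &x) != 0)
        return -1;
    if (pthread_join(thread, NULL) != 0)
        return -1;
    return i;  /* read after the join, so the thread has finished */
}
```

With i starting at 5, add_in_thread(5) should return 10, the value the question expected.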

Upvotes: 1

Read a tutorial about pthreads. You cannot expect reproducible behavior when accessing and modifying global data from several threads (without additional synchronization). AFAIU your code exhibits some tricky undefined behavior and you should be scared (maybe it is only unspecified behavior in your case). To explain the concrete behavior you observed, you would need to dive into implementation details (and you don't have time for that: studying the generated assembler code, the behavior of your particular hardware, etc.).

Also (since info is a pointer to an int)

int calc = info;

doesn't make a lot of sense (I guess you made a typo). On some systems (like my x86-64 running Linux), a pointer is wider than an int (so calc loses half of the bits from info). On other (rare) systems, it could be smaller. Sometimes (i686 running Linux) it has the same size. You should consider intptr_t from <stdint.h> if you want to cast pointers to integral values and back.
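If you really want to smuggle a small integer through the void * argument, a sketch using intptr_t could look like the following. The helper names int_to_arg and arg_to_int are hypothetical. The integer-to-pointer conversion is implementation-defined, so treat this as something that works on mainstream platforms, not as a guarantee from the standard.

```c
#include <stdint.h>

/* Hypothetical helper: pack an int into a void * by going through intptr_t. */
void *int_to_arg(int value)
{
    return (void *)(intptr_t)value;
}

/* Hypothetical helper: unpack it again. No dereference happens anywhere,
   the pointer is only used as a container for the integer bits. */
int arg_to_int(void *arg)
{
    return (int)(intptr_t)arg;
}
```

In the question's code this would mean passing int_to_arg(x) to pthread_create and calling arg_to_int(info) inside sum.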

Actually, you should protect access to that global data (inside i, perhaps accessed through a pointer) with a mutex, or use C11 atomic operations, since that data is used by several concurrent threads.

So you could declare a global mutex like

 pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;

(or use pthread_mutex_init) then in your sum you would code

pthread_mutex_lock(&mtx);
i = i + calc;
pthread_mutex_unlock(&mtx);

(see also pthread_mutex_lock and pthread_mutex_lock(3p)). Of course you should code likewise in your main.
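Putting those pieces together, here is a minimal sketch with several threads updating i under the mutex. The names add_many and locked_total, and the cap of 8 threads, are assumptions made for this illustration only.

```c
#include <pthread.h>

#define MAX_THREADS 8  /* arbitrary cap for this sketch */

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static int i = 0;      /* shared global, guarded by mtx */

/* Each thread adds 1 to i, n times, taking the lock around every update. */
static void *add_many(void *arg)
{
    int n = *(int *)arg;
    for (int k = 0; k < n; k++) {
        pthread_mutex_lock(&mtx);
        i = i + 1;
        pthread_mutex_unlock(&mtx);
    }
    return NULL;
}

/* Hypothetical helper: returns the final value of i, or -1 on error. */
int locked_total(int nthreads, int adds_per_thread)
{
    pthread_t threads[MAX_THREADS];
    int started = 0, failed = 0, result;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;

    pthread_mutex_lock(&mtx);
    i = 0;
    pthread_mutex_unlock(&mtx);

    for (; started < nthreads; started++) {
        if (pthread_create(&threads[started], NULL, add_many,
                           &adds_per_thread) != 0) {
            failed = 1;
            break;
        }
    }
    for (int t = 0; t < started; t++)
        pthread_join(threads[t], NULL);

    pthread_mutex_lock(&mtx);  /* the reader takes the lock as well */
    result = i;
    pthread_mutex_unlock(&mtx);
    return failed ? -1 : result;
}
```

With the lock in place, locked_total(4, 10000) should always return 40000. Without it, the four threads could overwrite each other's updates and the total could come out lower.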

Locking a mutex is a bit expensive (typically several dozen times more than an addition), even when it was unlocked. You might consider atomic operations if you can code in C11, since you deal with integers. You would declare atomic_int i; and use atomic_load and atomic_fetch_add on it.
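A sketch of the atomic variant, assuming a compiler with C11 <stdatomic.h> support. As before, add_many, atomic_total and the cap of 8 threads are hypothetical names and limits chosen for this illustration.

```c
#include <pthread.h>
#include <stdatomic.h>

#define MAX_THREADS 8  /* arbitrary cap for this sketch */

static atomic_int i;   /* static storage, so it starts at zero */

/* Each thread adds 1 to i, n times, with no mutex involved. */
static void *add_many(void *arg)
{
    int n = *(int *)arg;
    for (int k = 0; k < n; k++)
        atomic_fetch_add(&i, 1);  /* indivisible read-modify-write */
    return NULL;
}

/* Hypothetical helper: returns the final value of i, or -1 on error. */
int atomic_total(int nthreads, int adds_per_thread)
{
    pthread_t threads[MAX_THREADS];
    int started = 0, failed = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;

    atomic_store(&i, 0);

    for (; started < nthreads; started++) {
        if (pthread_create(&threads[started], NULL, add_many,
                           &adds_per_thread) != 0) {
            failed = 1;
            break;
        }
    }
    for (int t = 0; t < started; t++)
        pthread_join(threads[t], NULL);

    return failed ? -1 : atomic_load(&i);
}
```

atomic_total(4, 10000) should return 40000, the same total as a mutex-protected version, without locking around each addition.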

If you are curious, see also pthreads(7) & futex(7).

Multi-threaded programming is really difficult (for everyone). You cannot expect behavior to be reproducible in general, and your code could appear to behave as expected and still be very wrong (and work differently on a different system). Read also about memory models, CPU cache, cache coherence, concurrent computing...

Consider using GCC's thread sanitizer instrumentation option (-fsanitize=thread) and/or valgrind's helgrind.

Upvotes: 3
