Why does a multi-threaded program still need a lock when running on a single-core CPU?

Question

After reading "Why memory reordering is not a problem on single core/processor machines?", I wrote a small multi-threaded program that runs on a single-core CPU:

#include <inttypes.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define THREAD_NUM 4
#define SUM_LOOP_SIZE 100000

uint64_t sum;

void *
thread(void *arg)
{
    for (int i = 0; i < SUM_LOOP_SIZE; i++) {
        sum++;
    }
    return NULL;
}

int
main()
{
    pthread_t tid[THREAD_NUM];
    uint64_t counter = 0;
    while (1) {
        counter++;
        for (int i = 0; i < THREAD_NUM; i++) {
            int ret = pthread_create(&tid[i], NULL, thread, NULL);
            if (ret != 0) {
                fprintf(stderr, "Create thread error: %s", strerror(ret));
                return 1;
            }
        }

        for (int i = 0; i < THREAD_NUM; i++) {
            int ret = pthread_join(tid[i], NULL);
            if (ret != 0) {
                fprintf(stderr, "Join thread error: %s", strerror(ret));
                return 1;
            }
        }

        if (sum != THREAD_NUM * SUM_LOOP_SIZE) {
            fprintf(stderr, "Exit after running %" PRIu64 " times, sum=%" PRIu64 "\n", counter, sum);
            return 1;
        }

        sum = 0;
    }

    return 0;
}

The code is simple: 4 threads each add 1 to the global variable sum 100000 times. If the final value of sum is not 400000, the program exits. A typical run looks like this:

$ ./multi_thread_one_cpu
Exit after running 17273076 times, sum=200000
$ ./multi_thread_one_cpu
Exit after running 1539708 times, sum=100000

I know the reason is that "sum++" is not an atomic read-modify-write operation. But I am wondering why a single-core CPU can't be "clever" enough to finish one "sum++" operation before it processes another, since a single-core CPU can see the whole program's state.
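To make the non-atomic "sum++" concrete, here is a minimal single-threaded sketch that replays one possible interleaving by hand. It assumes "sum++" is compiled into separate load, add, and store steps, and that the scheduler may switch threads between any two of them. The names shared_sum, step_load, step_add, step_store, and lost_update_demo are hypothetical and not part of the program above.

```c
#include <stdint.h>

/* Stand-in for the global "sum" in the program above (hypothetical name). */
static uint64_t shared_sum;

/* The three separate steps assumed to make up one "sum++". */
static uint64_t step_load(void)          { return shared_sum; }
static uint64_t step_add(uint64_t reg)   { return reg + 1; }
static void     step_store(uint64_t reg) { shared_sum = reg; }

/*
 * Replays one interleaving on a single core:
 * thread A is preempted right after its load, thread B completes a
 * whole increment, then A resumes with its saved (stale) register.
 */
uint64_t
lost_update_demo(void)
{
    shared_sum = 0;

    uint64_t reg_a = step_load();   /* A: register holds 0 */

    /* Timer interrupt: the OS saves A's registers and runs B. */
    uint64_t reg_b = step_load();   /* B: register holds 0 */
    reg_b = step_add(reg_b);        /* B: register holds 1 */
    step_store(reg_b);              /* B: memory now holds 1 */

    /* The OS switches back to A and restores its registers. */
    reg_a = step_add(reg_a);        /* A: stale 0 + 1 = 1 */
    step_store(reg_a);              /* A: overwrites B's result with 1 */

    return shared_sum;              /* two increments ran, result is 1 */
}
```

The core executes the steps strictly one after another, yet B's increment is still lost, because A's half-finished state lives in a saved register that the context switch faithfully restores.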

P.S. The CPU I used for testing is ARM, not x86; I am not sure whether this matters.
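For comparison, here is a hedged sketch of the thread function with the increment made atomic through C11 <stdatomic.h>, as one alternative to protecting "sum" with a lock. It assumes a C11 compiler. The names atomic_sum, thread_atomic, and read_sum are hypothetical, and this is not necessarily the fix an answer would recommend. ARM is a load/store architecture, so an increment of a memory location generally takes more than one instruction unless the compiler emits an atomic sequence for it.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define SUM_LOOP_SIZE 100000

/* Assumption: "sum" is redeclared as a C11 atomic (hypothetical name). */
static _Atomic uint64_t atomic_sum;

/* Same loop as thread() above, but each increment is one indivisible
 * read-modify-write, so a context switch cannot split it. */
void *
thread_atomic(void *arg)
{
    (void)arg;
    for (int i = 0; i < SUM_LOOP_SIZE; i++) {
        /* Relaxed ordering is enough for a plain counter. */
        atomic_fetch_add_explicit(&atomic_sum, 1, memory_order_relaxed);
    }
    return NULL;
}

/* Helper to read the counter, e.g. after all threads are joined. */
uint64_t
read_sum(void)
{
    return atomic_load(&atomic_sum);
}
```

Passing thread_atomic to pthread_create in place of thread, and comparing read_sum() against THREAD_NUM * SUM_LOOP_SIZE, should keep the rest of the program unchanged.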
