After reading "Why memory reordering is not a problem on single core/processor machines?", I wrote a small multithreaded program that runs on a single-core CPU:
#include <pthread.h>
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include <string.h>

#define THREAD_NUM 4
#define SUM_LOOP_SIZE 100000

uint64_t sum;

void *
thread(void *arg)
{
    for (int i = 0; i < SUM_LOOP_SIZE; i++) {
        sum++;
    }
    return NULL;
}

int
main()
{
    pthread_t tid[THREAD_NUM];
    uint64_t counter = 0;

    while (1) {
        counter++;
        for (int i = 0; i < THREAD_NUM; i++) {
            int ret = pthread_create(&tid[i], NULL, thread, NULL);
            if (ret != 0) {
                fprintf(stderr, "Create thread error: %s", strerror(ret));
                return 1;
            }
        }
        for (int i = 0; i < THREAD_NUM; i++) {
            int ret = pthread_join(tid[i], NULL);
            if (ret != 0) {
                fprintf(stderr, "Join thread error: %s", strerror(ret));
                return 1;
            }
        }
        if (sum != THREAD_NUM * SUM_LOOP_SIZE) {
            fprintf(stderr, "Exit after running %" PRIu64 " times, sum=%" PRIu64 "\n", counter, sum);
            return 1;
        }
        sum = 0;
    }
    return 0;
}
The code is simple: 4 threads each add 1 to the global variable sum 100000 times. If the final value of sum is not 400000, the program exits. A typical run looks like this:
$ ./multi_thread_one_cpu
Exit after running 17273076 times, sum=200000
$ ./multi_thread_one_cpu
Exit after running 1539708 times, sum=100000
I know the reason should be that "sum++" is not an atomic read-modify-write operation. But I am wondering why a single-core CPU can't be "clever" enough to finish one "sum++" operation before it processes another "sum++" operation, since a single-core CPU can perceive the whole program's state.
P.S. The CPU I used for testing is ARM, not x86. I'm not sure whether this matters.