Wei Shi
Wei Shi

Reputation: 5055

Read/Write an int on x86 machine without lock

Suppose in a C program I have P threads running on a 32 bit machine, and int MAX--a shared 32-bit integer

Each thread can read/write to MAX.

Requirement: the value a thread read should not be corrupted, e.g first 16bit and last 16bit are out of sync

Question: Do I need a lock to protect the read and write? or Can I safely ignore the lock, because LOAD/SAVE assembly instruction is guaranteed to happen atomically?

Upvotes: 5

Views: 4153

Answers (5)

Jens Gustedt
Jens Gustedt

Reputation: 78943

Most (all?) modern CPU have special instructions that guarantee atomic access to such type of variables. C has now a standardized interface to these instructions. Unfortunately this is not completely implemented by most compilers, yet.

The time being you should use extensions. gcc e.g has a whole bunch of them. If you look arround on the web you also should easily find implementations with inline assembly instructions that you could use with other compilers.

Upvotes: 3

Hans Passant
Hans Passant

Reputation: 941942

Reads and writes are atomic when the int is aligned properly. It cannot straddle the end of a cache line. A cache line is 64 bytes. Most any compiler ensures the alignment is taken care of but it can be overridden with, say, a structure packing pragma.

Yes, you need a lock to protect the value when threads perform a read-modify-write operation. You can get a cheap one from InterlockedXxxx, perhaps.

Upvotes: 4

user82238
user82238

Reputation:

A problem here is read-modify-write.

Say you have two seperate physical cores (they can be in the same physical package). They both read. The first core modifies - which is to say, increments the value, which is currently held in a register. At this point, the modification begins to propagate out to the caches - but the second core in the meantime has also modified the value and also began to propagate the cache out to cache.

You lose one of the updates.

Cache coherency protocols do not handle the case of multiple concurrent writes. There is nothing which causes one core to wait on its write because another core is also -currently writing-; because that information is simply not publically available between the cores. They -cannot- do it.

They do handle multiple consecutive writes, e.g. after the changes have been seen on a cores external bus pins (e.g. become public knowledge, rather than being internal to the core).

Another problem is instruction re-ordering. These threads - if they are running on different cores, their instruction re-ordering will not pay attention to what the other threads are doing; only to what that thread in particular is doing.

Imagine one thread is going to write the value and then set a flag. Another thread will see the flag raised and then read the value. Those threads, if on seperate cores, will re-order their instruction flow only with regard to themselves - the cores will not consider the other threads, because they cannot - they have no knowledge of them.

So in the first thread the flag setting may be re-ordered to occur before the writing of the value - after all, for just that thread, that re-ordering is fine. Those two variables are entirely disconnected. There is no ordering dependency. The dependency exists in another thread which is on a different core and so our core can't know about it.

The other thread of course will see the flag raised and read, even though the write actually hasn't happened yet.

Upvotes: 4

Brendan
Brendan

Reputation: 37212

Different architectures/CPUs have different atomicity guarantees. C is intended to be portable, therefore you shouldn't make any assumptions about which atomicity guarantees are made by any specific architectures/CPUs. This means that even for atomic reads/writes you should use some sort of abstraction (e.g. a library that provides the necessary atomic operations for each different architecture).

As far as I know (which isn't very much - most of my experience is with 80x86 only), for most architectures, reads and writes to aligned addresses that are less than some minimum size are usually guaranteed to be atomic (where "some minimum size" may be the size of a general purpose register, the size of a cache line, or something else).

This DOES NOT include modifications (e.g. instructions/operations that read, modify, then write to an address). For an "int MAX" variable (as opposed to something like "const int MAX = 1234"), I'd assume you'd want to do something like "if(foo > MAX) MAX = foo;" and you'd need a more robust atomic operation (e.g. maybe an atomic "compare and swap" in a loop that retries if the comparison was false).

Also, don't forget about declaring your variable as "volatile".

  • Brendan

Upvotes: 3

Steve-o
Steve-o

Reputation: 12866

No, word sized loads and stores are atomic. However it you want to perform an addition or other arithmetic a lock would be required as they may use a read and a write.

Upvotes: 3

Related Questions