Kam

Reputation: 6008

Thread safe programming

I keep hearing about thread safe. What is that exactly and how and where can I learn to program thread safe code?

Also, assume I have 2 threads, one that writes to a structure and another one that reads from it. Is that dangerous in any way? Is there anything I should look for? I don't think it is a problem. Both threads will not (well, can't) be accessing the struct at the exact same time.

Also, can someone please tell me how in this example : https://stackoverflow.com/a/5125493/1248779 we are doing a better job in concurrency issues. I don't get it.

Upvotes: 7

Views: 5991

Answers (7)

djna

Reputation: 55907

Thread-safety is one aspect of a larger set of issues under the general heading of "Concurrent Programming". I'd suggest reading around that subject.

Your assumption that two threads cannot access the struct at the same time does not hold. First: today we have multi-core machines, so two threads can be running at exactly the same time. Second: even on a single-core machine the slices of time given to any other thread are unpredictable. You have to anticipate that at any arbitrary time the "other" thread might be processing. See my "window of opportunity" example below.

The concept of thread-safety is exactly to answer the question "is this dangerous in any way". The key question is whether it's possible for code running in one thread to get an inconsistent view of some data, that inconsistency happening because while it was running another thread was in the middle of changing data.

In your example, one thread is reading a structure and at the same time another is writing. Suppose that there are two related fields:

  { foreground: red; background: black }

and the writer is in the process of changing those

   foreground = black;
            <=== window of opportunity
   background = red;

If the reader reads the values at just that window of opportunity then it sees a "nonsense" combination

  { foreground: black; background: black }

The essence of this pattern is that for a brief time, while we are making a change, the system becomes inconsistent and readers should not use the values. As soon as we finish our changes it becomes safe to read again.

Hence we use the CriticalSection APIs mentioned by Stefan to prevent a thread seeing an inconsistent state.
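As a minimal sketch of closing that window, the colour pair can be guarded by a lock. This version assumes C++11 and uses std::mutex as a portable stand-in for the Win32 CriticalSection APIs; the class name Colors and its members are hypothetical, chosen only for illustration.

```cpp
#include <mutex>
#include <string>
#include <utility>

// Hypothetical type for illustration: both fields are guarded by one mutex.
class Colors {
public:
    // Writer: changes both fields while holding the lock, so no reader
    // can observe the state between the two assignments.
    void set(std::string foreground, std::string background) {
        std::lock_guard<std::mutex> lock(mutex_);
        foreground_ = std::move(foreground);
        // The window of opportunity still exists here, but readers are locked out.
        background_ = std::move(background);
    }

    // Reader: copies both fields under the same lock and returns a snapshot.
    std::pair<std::string, std::string> get() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return {foreground_, background_};
    }

private:
    mutable std::mutex mutex_;
    std::string foreground_ = "red";
    std::string background_ = "black";
};
```

A reader that calls get() sees either the old pair or the new pair, never a mixture, provided every access goes through these two functions.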

Upvotes: 4

Anthony Williams

Reputation: 68581

Thread safety is a simple concept: is it "safe" to perform operation A on one thread whilst another thread is performing operation B, which may or may not be the same as operation A. This can be extended to cover many threads. In this context, "safe" means:

  • No undefined behaviour
  • All invariants of the data structures are guaranteed to be observed by the threads

The actual operations A and B are important. If two threads both read a plain int variable, then this is fine. However, if any thread may write to that variable, and there is no synchronization to ensure that the read and write cannot happen together, then you have a data race, which is undefined behaviour, and this is not thread safe.
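For a single int, one possible form of synchronization is std::atomic (assuming C++11 or later). The sketch below is only an illustration; the function name count_with_two_threads is hypothetical.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: two threads increment one shared counter.
// std::atomic<int> makes each increment a single indivisible operation,
// so there is no data race. With a plain int in its place, the concurrent
// writes would be a data race and therefore undefined behaviour.
int count_with_two_threads(int increments_per_thread) {
    std::atomic<int> counter{0};
    auto work = [&counter, increments_per_thread] {
        for (int i = 0; i < increments_per_thread; ++i) {
            counter.fetch_add(1);
        }
    };
    std::thread a(work);
    std::thread b(work);
    a.join();
    b.join();
    return counter.load();
}
```

An atomic protects only that one variable. It does not help when several variables must change together, which is where a mutex comes in.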

This applies equally to the scenario you asked about: unless you have taken special precautions, then it is not safe to have one thread read from a structure at the same time as another thread writes to it. If you can guarantee that the threads cannot access the data structure at the same time, through some form of synchronization such as a mutex, critical section, semaphore or event, then there is not a problem.

You can use things like mutexes and critical sections to prevent concurrent access to some data, so that the writing thread is the only thread accessing the data when it is writing, and the reading thread is the only thread accessing the data when it is reading, thus providing the guarantee I just mentioned. This therefore avoids the undefined behaviour mentioned above.

However, you still need to ensure that your code is safe in the wider context: if you need to modify more than one variable then you need to hold the lock on the mutex across the whole operation rather than for each individual access, otherwise you may find that the invariants of your data structure may not be observed by other threads.

It is also possible that a data structure may be thread safe for some operations but not others. For example, a single-producer single-consumer queue will be OK if one thread is pushing items on the queue and another is popping items off the queue, but will break if two threads are pushing items, or two threads are popping items.

In the example you reference, the point is that global variables are implicitly shared between all threads, and therefore all accesses must be protected by some form of synchronization (such as a mutex) if any thread can modify them. On the other hand, if you have a separate copy of the data for each thread, then that thread can modify its copy without worrying about concurrent access from any other thread, and no synchronization is required. Of course, you always need synchronization if two or more threads are going to operate on the same data.
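The separate-copy idea can be sketched as follows. This is not the code from the linked answer, just an illustration under the assumption of C++11 threads; the function name sum_in_two_halves is hypothetical.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: sums the values using two threads and no lock.
// Each thread accumulates into its own local variable and writes to its
// own slot of `partial`, so nothing is modified by more than one thread.
long long sum_in_two_halves(const std::vector<int>& values) {
    long long partial[2] = {0, 0};
    const std::size_t mid = values.size() / 2;

    auto sum_range = [&values](std::size_t begin, std::size_t end) {
        long long local = 0;  // private to the calling thread
        for (std::size_t i = begin; i < end; ++i) {
            local += values[i];  // concurrent reads of unmodified data are fine
        }
        return local;
    };

    std::thread first([&] { partial[0] = sum_range(0, mid); });
    std::thread second([&] { partial[1] = sum_range(mid, values.size()); });
    first.join();   // join() makes each thread's result visible here
    second.join();
    return partial[0] + partial[1];
}
```

The only point where the results meet is after both joins, so no mutex is needed while the threads run.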

My book, C++ Concurrency in Action covers what it means for things to be thread safe, how to design thread safe data structures, and the C++ synchronization primitives used for the purpose, such as std::mutex.

Upvotes: 1

Philipp

Reputation: 11813

To answer the second part of the question: Imagine two threads both accessing std::vector<int> data:

//first thread
if (data.size() > 0)
{
   std::cout << data[0]; //fails if data.size() == 0
}

//second thread
if (rand() % 5 == 0)
{
   data.clear();
}
else
{
   data.push_back(1);
}

Run these threads in parallel and your program may crash, because std::cout << data[0]; might be executed directly after data.clear();. The check and the access are two separate steps, and the other thread can run between them.

You need to know that at any point of your thread code, the thread might be interrupted, e.g. after checking that (data.size() > 0), and another thread could become active. Although the first thread looks correct in a single threaded app, it's not in a multi-threaded program.
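One possible fix, sketched here under the assumption that C++11's std::mutex is available, is to make the check and the access a single locked step. The names shared_data (standing in for data above), data_mutex, print_first and modify are hypothetical.

```cpp
#include <cstdlib>
#include <iostream>
#include <mutex>
#include <vector>

std::vector<int> shared_data;  // plays the role of `data` in the example above
std::mutex data_mutex;         // guards every access to shared_data

// First thread: the size check and the read happen under one lock,
// so the vector cannot be cleared between them.
void print_first() {
    std::lock_guard<std::mutex> lock(data_mutex);
    if (!shared_data.empty()) {
        std::cout << shared_data[0];
    }
}

// Second thread: every modification takes the same lock.
void modify() {
    std::lock_guard<std::mutex> lock(data_mutex);
    if (std::rand() % 5 == 0) {
        shared_data.clear();
    } else {
        shared_data.push_back(1);
    }
}
```

Locking only the read, or only the write, would not be enough: both threads have to use the same mutex for every access.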

Upvotes: 0

justin

Reputation: 104698

what is that exactly?

Briefly, a program that may be executed in a concurrent context without errors related to concurrency.

If ThreadA and ThreadB read and/or write data without errors and use proper synchronization, then the program may be threadsafe. It's a design choice -- making an object threadsafe can be accomplished a number of ways, and more complex types may be threadsafe using combinations of these techniques.

and how and where can I learn to program thread safe code?

boost/libs/thread/ would likely be a good introduction. The topic is quite complex.

The C++11 standard library provides implementations for locks, atomics and threads -- any well written programs which use these would be a good read. The standard library was modeled after boost's implementation.

also, assume I have 2 threads one that writes to a structure and another one that reads from it. Is that dangerous in any way? is there anything I should look for?

Yes, it can be dangerous and/or may produce incorrect results. Just imagine that a thread may run out of its time slice at any point, and another thread could then read or modify that structure -- if you have not protected it, it may be in the middle of an update. A common solution is a lock, which can be used to prevent another thread from accessing shared resources during reads/writes.

Upvotes: 3

Joe

Reputation: 2976

It's a very deep topic. At heart, threads are usually about making things go fast by using multiple cores at the same time, or about doing long operations in the background when you don't have a good way to interleave the operation with a 'primary' thread. The latter is very common in UI programming.

Your scenario is one of the classic trouble spots, and one of the first people run into. It's very rare to have a struct where the members are truly independent. It's very common to want to modify multiple values in the structure to maintain consistency. Without any precautions it is very possible to modify the first value, then have the other thread read the struct and operate on it before the second value has been written.

A simple example would be a 'point' struct for 2d graphics. You'd like to move the point from [2,2] to [5,6]. If you had a different thread drawing a line to that point you could end up drawing to [5,2] very easily.

This is the tip of the iceberg really. There are lots of great books, but learning this space usually goes something like this:

  1. Uh oh, I just read from that thing in an inconsistent state.
  2. Uh oh, I just modified that thing from 2 threads and now it's garbage.
  3. Yay! I learned about locks
  4. Whoa, I have a lot of locks and everything seems to just hang sometimes when I have lots of them locking in nested code.
  5. Hrm. I need to stop doing this locking on the fly, I seem to be missing a lot of places; so I should encapsulate them in a data structure.
  6. That data structure thing was great, but now I seem to be locking all the time and my code is just as slow as a single thread.
  7. condition variables are weird
  8. It's fast because I got clever with how I lock things. Hrm. Sometimes data corrupts.
  9. Whoa.... InterlockedWhatDidYouSay?
  10. Hey, look no lock, I do this thing called a spin lock.
  11. Condition variables. Hrm... I see.
  12. You know what, how about I just start thinking about how to operate on this stuff in completely independent ways, pipelining my operations, and having as few cross thread dependencies as possible...

Obviously it's not all about condition variables. But there are many problems that can be solved with threading, and probably almost as many ways to do it, and even more ways to do it wrong.
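Steps 5 and 7 of the list above (encapsulating the lock in a data structure, and condition variables) can be sketched together as a small blocking queue. This is one possible shape under the assumption of C++11, not a production design; the class name BlockingQueue is hypothetical.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical class for illustration: a queue that owns its lock,
// so callers cannot forget to take it.
template <typename T>
class BlockingQueue {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(value));
        }
        not_empty_.notify_one();  // wake one waiting consumer, if any
    }

    // Blocks until an item is available, then removes and returns it.
    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        // wait() releases the lock while sleeping and re-acquires it on wakeup;
        // the predicate guards against spurious wakeups.
        not_empty_.wait(lock, [this] { return !items_.empty(); });
        T value = std::move(items_.front());
        items_.pop();
        return value;
    }

private:
    std::mutex mutex_;
    std::condition_variable not_empty_;
    std::queue<T> items_;
};
```

Because the mutex is private, every access goes through push() and pop(), which is the point of step 5. The condition variable lets a consumer sleep until there is work instead of spinning.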

Upvotes: 7

giorashc

Reputation: 13713

Thread safety is when a certain block of code is protected from being accessed by more than one thread at a time, meaning that the data it manipulates always stays in a consistent state.

A common example is the producer consumer problem where one thread reads from a data structure while another thread writes to the same data structure : Detailed explanation

Upvotes: 0

Stefan Birladeanu

Reputation: 294

When writing multithreaded C++ programs on WIN32 platforms, you need to protect certain shared objects so that only one thread can access them at any given time. You can use five system functions to achieve this: InitializeCriticalSection, EnterCriticalSection, TryEnterCriticalSection, LeaveCriticalSection, and DeleteCriticalSection.

These links may also help: how to make an application thread safe?

http://www.codeproject.com/Articles/1779/Making-your-C-code-thread-safe

Upvotes: 1
