Etherealone

Reputation: 3558

How to synchronize threads/CPUs without mutexes if sequence of access is known to be safe?

Consider the following:

// these services are running on different threads that were started long ago
std::vector<std::reference_wrapper<io_service>> io_services;

struct A {
  std::unique_ptr<Object> u;
} a;

io_services[0].get().post([&io_services, &a] {
      std::unique_ptr<Object> o{new Object};

      a.u = std::move(o);

      io_services[1].get().post([&a] {
            // as far as I know, changes to `u` aren't guaranteed to be seen in this thread
            a.u->...;
          });
    });

The actual code passes a struct to a number of different boost::asio::io_service objects, and each field of the struct is filled in by a different service object. The struct is never accessed from two io_service objects/threads at the same time; it is passed between the services by reference until the processing is done.

As far as I know, I always need some kind of explicit synchronization/memory flushing when I pass anything between threads, even if there is no read/write race (i.e. no simultaneous access). What is the correct way to do it in this case?

Note that Object does not belong to me, and it is not trivially copyable or movable. I could use a std::atomic<Object*> (if I am not mistaken), but I would rather use the smart pointer. Is there a way to do that?
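For reference, the std::atomic<Object*> route mentioned above might look like the sketch below. This is a hedged illustration, not the asker's actual code: `Object` is a hypothetical placeholder (the real type isn't shown), plain std::thread stands in for the io_service machinery, and the function names are invented. Note the cost it imposes: ownership becomes manual.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for the real Object, which isn't shown in the question.
struct Object { int value = 5; };

std::atomic<Object*> g_ptr{nullptr};

void produce_ptr() {
    // The release store publishes the pointed-to Object's construction.
    g_ptr.store(new Object, std::memory_order_release);
}

int consume_ptr() {
    Object* p = nullptr;
    while (!(p = g_ptr.load(std::memory_order_acquire))) {} // wait for the publish
    int v = p->value;
    delete p; // manual ownership: with a raw atomic pointer, we must free it ourselves
    return v;
}
```

The acquire load pairing with the release store is what makes the Object's contents visible to the consumer; the price is giving up unique_ptr's automatic cleanup.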

Edit: It seems like std::atomic_thread_fence is the tool for the job, but I cannot really wrap my head around the memory-model concepts well enough to code this safely. My understanding is that the following lines are needed for this code to work correctly. Is that really the case?

// these services are running on different threads that were started long ago
std::vector<std::reference_wrapper<io_service>> io_services;

struct A {
  std::unique_ptr<Object> u;
} a;

io_services[0].get().post([&io_services, &a] {
      std::unique_ptr<Object> o{new Object};

      a.u = std::move(o);

      std::atomic_thread_fence(std::memory_order_release);

      io_services[1].get().post([&a] {
            std::atomic_thread_fence(std::memory_order_acquire);

            a.u->...;
          });
    });

Upvotes: 3

Views: 628

Answers (2)

ildjarn

Reputation: 62975

(I'd like to remark that you appear to have changed your question in some significant way since @Smeeheey answered it; essentially, he answered your originally-worded question but cannot get credit for it since you asked two different questions. This is poor form – in the future, please just post a new question so the original answerer can get credit as due.)

If multiple threads read/write a variable, then even if you know said variable is accessed in a defined sequence, you must still inform the compiler of that. The correct way to do this necessarily involves synchronization, atomics, or something documented to perform one of those itself (such as std::thread::join). Presuming the synchronization route is both obvious in implementation and undesirable:
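For completeness, the "obvious" synchronization route might look like the following minimal sketch. Everything here is illustrative: `Object` is a hypothetical placeholder, plain std::thread replaces the io_service plumbing, and the function names are invented.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical placeholder for the question's Object type.
struct Object { int value = 42; };

struct A {
    std::mutex m;                  // guards u across threads
    std::unique_ptr<Object> u;
};

void mutex_produce(A& a) {
    std::unique_ptr<Object> o{new Object};
    std::lock_guard<std::mutex> lock(a.m);
    a.u = std::move(o);            // write under the lock
}

int mutex_consume(A& a) {
    std::lock_guard<std::mutex> lock(a.m);
    // The mutex's acquire/release semantics make the producer's writes visible here.
    return a.u ? a.u->value : -1;
}
```

The lock is coarse but correct; the rest of this answer is about avoiding it.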

Addressing this with atomics may simply consist of std::atomic_thread_fence; however, an acquire fence in C++ cannot synchronize-with a release fence alone – an actual atomic object must be modified. Consequently, if you want to use fences alone you'll need to specify std::memory_order_seq_cst; with that change, your code will otherwise work as shown.
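To illustrate the point that fences still pair through an atomic object, here is the standard fence idiom: a release fence before a relaxed store of a flag, matched by an acquire fence after a relaxed load of it. This is a sketch with a placeholder `Object`, invented names, and plain threads in place of io_service.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <thread>

struct Object { int value = 7; };      // placeholder for the real Object

std::unique_ptr<Object> g_u;           // the non-atomic payload
std::atomic<bool> g_ready{false};      // the atomic object the fences pair through

void fence_produce() {
    g_u.reset(new Object);
    std::atomic_thread_fence(std::memory_order_release); // orders the write to g_u...
    g_ready.store(true, std::memory_order_relaxed);      // ...before this flag store
}

int fence_consume() {
    while (!g_ready.load(std::memory_order_relaxed)) {}  // spin until the flag is set
    std::atomic_thread_fence(std::memory_order_acquire); // pairs with the release fence
    return g_u->value;                                   // g_u's contents now visible
}
```

Without `g_ready`, the two fences would have nothing to synchronize through, which is exactly the caveat above.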

If you want to stick with release/acquire semantics, fortunately even the simplest atomic will do – std::atomic_flag:

std::vector<std::reference_wrapper<io_service>> io_services;

struct A {
  std::unique_ptr<Object> u;
} a;
std::atomic_flag a_initialized = ATOMIC_FLAG_INIT;

io_services[0].get().post([&io_services, &a, &a_initialized] {
    std::unique_ptr<Object> o{new Object};

    a.u = std::move(o);
    a_initialized.clear(std::memory_order_release); // initiates release sequence (RS)
    a_initialized.test_and_set(std::memory_order_relaxed); // continues RS

    io_services[1].get().post([&a, &a_initialized] {
        while (!a_initialized.test_and_set(std::memory_order_acquire)) ; // completes RS

        a.u->...;
    });
});

For information on release sequences, see here.
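A simpler alternative worth noting: with std::atomic<bool> instead of std::atomic_flag, the consumer only ever loads and never writes the flag, so the acquire load pairs directly with the release store and no release-sequence reasoning is needed at all. Again a sketch: `Object` is a placeholder, the names are invented, and plain threads stand in for io_service.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <thread>

struct Object { int value = 1; };      // placeholder

struct A { std::unique_ptr<Object> u; };

A g_a;
std::atomic<bool> g_initialized{false};

void bool_produce() {
    g_a.u.reset(new Object);
    g_initialized.store(true, std::memory_order_release); // publish u
}

int bool_consume() {
    // Pure load: the acquire pairs directly with the release store above,
    // so everything sequenced before that store is visible here.
    while (!g_initialized.load(std::memory_order_acquire)) {}
    return g_a.u->value;
}
```

The atomic_flag version earns its complexity only if you specifically need atomic_flag's lock-free guarantee.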

Upvotes: 1

Smeeheey

Reputation: 10316

Synchronisation is only needed when there would otherwise be a data race. A data race occurs when different threads make conflicting accesses to the same location without one happening before the other.

You have no such unsequenced access. The t.join() guarantees that all statements that follow are sequenced strictly after all statements that run as part of t. So no synchronisation is required.

ELABORATION: (To explain why thread::join has the properties claimed above.) First, the description of thread::join from the standard, [thread.thread.member]:

void join();

  1. Requires: joinable() is true.
  2. Effects: Blocks until the thread represented by *this has completed.
  3. Synchronization: The completion of the thread represented by *this synchronizes with (1.10) the corresponding successful join() return.

a). The above shows that join() provides synchronisation (specifically: the completion of the thread represented by *this synchronises with the outer thread calling join()). Next [intro.multithread]:

  1. An evaluation A inter-thread happens before an evaluation B if

(13.1) — A synchronizes with B, or ...

Which shows that, because of a), we have that the completion of t inter-thread happens before the return of the join() call.

Finally, [intro.multithread]:

  1. Two actions are potentially concurrent if

(23.1) — they are performed by different threads, or

(23.2) — they are unsequenced, and at least one is performed by a signal handler.

The execution of a program contains a data race if it contains two potentially concurrent conflicting actions, at least one of which is not atomic, and neither happens before the other ...

Above the required conditions for a data race are described. The situation with t.join() does not meet these conditions because, as shown, the completion of t does in fact happen-before the return of join().

So there is no data race, and all data accesses are guaranteed well-defined behaviour.
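A minimal, self-contained demonstration of the argument above (the `Object` type and function name are invented placeholders):

```cpp
#include <cassert>
#include <memory>
#include <thread>

struct Object { int value = 99; };     // placeholder

// No atomics or mutexes needed: the completion of t happens-before
// the statements after t.join(), so the write to u is visible.
int join_demo() {
    std::unique_ptr<Object> u;
    std::thread t([&u] { u.reset(new Object); });
    t.join();          // synchronizes-with the thread's completion
    return u->value;   // well-defined: no data race
}
```

The read of `u` after join() is sequenced after the joined thread's write by the chain of rules quoted above.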

Upvotes: 3
