Kevin W Matthews
Kevin W Matthews

Reputation: 744

Is "benign" data race actually benign?

I'm working with some third-party library code that contains the following race, shown simplified:

struct Object
{
    void (*func)(void *);
    void *data;
};

void called_from_thread1(struct Object *obj)
{
    obj->func(obj->data);
}

void called_from_thread2(struct Object *obj, void (*func)(void *), void *data)
{
    obj->func = func;
    obj->data = data;
}

Access from both threads is unsynchronized, so in general I'd analyze this as:

In this specific use case, however, it happens that the values are always changed back to the default value. Something like this:

void default_func(void *data) { /* snip */ };
void *default_data = 0; // Actually something valid

struct Object object = {
    .func = default_func,
    .data = &default_data,
};

Both thread 1 and thread 2 are spawned referencing the object, and thread 2 sets the same object back to defaults similar to this:

called_from_thread1(&object);

// elsewhere...
called_from_thread2(&object, default_func, &default_data);

This is technically still a race (and the thread sanitizer reports it), but what are the possible consequences?

In this case, the function and data don't get out of sync (they're never actually changed), so logical inconsistency shouldn't be an issue.

Can the pointers be temporarily invalid as they are copied? For example, maybe copying a function pointer on an 8-bit processor could take two instructions if the address space is 16 bits wide? In this case I'm working with a BeagleBoneBlack (ARM Cortex A8?), and I'm guessing that pointers are copied in a single instruction and should always be accurate. However, I don't yet know enough about the processor to be certain.

What am I missing?

Thanks!

Upvotes: 2

Views: 417

Answers (1)

John Zwinck
John Zwinck

Reputation: 249113

You have 32-bit ARM processor and are reading/writing two 32-bit pointers, which are read and written atomically on 32-bit ARM. Here's a post about that: ARM: Is writing/reading from int atomic?

Given this, and the fact that the same values are being stored every time, there is no possibility for it to go wrong, if the generated code is as expected.

There may still however be a problem, because while loading and storing pointers are atomic, the C language does not permit data races, so your compiler could hypothetically notice this data race and compile something quite different to what you expect. This is highly unlikely if you are compiling without optimization, or if the code is not sufficiently inline for the compiler to notice what's going on. But if the compiler does see the race, it would be within the letter of the standard to generate code that does not do what you expect:

The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior.

-- C Standard section 5.1.2.4, paragraph 25 [ISO/IEC 9899:2011]

I think you have three choices if you want to ensure correctness:

  1. Inspect the generated assembly code (every time the code or compiler version is changed) to verify the loads and stores are as expected.
  2. Replace these small pieces of C code with assembly code, where these operations are known to be legal.
  3. Change to code to avoid data races in C. This would be my preference, and as pointed out in a comment below, may not change the generated assembly.

See here, especially "2.4 Redundant writes": https://www.usenix.org/legacy/event/hotpar11/tech/final_files/Boehm.pdf

Upvotes: 3

Related Questions