Hongli

Reputation: 18924

Caching read-mostly values: are time lookups cheaper than atomic operations?

My multithreaded app uses a bunch of read-mostly values. These values are configuration values, and only change when the operator edits the config file and instructs the app to reload the config file without downtime. The values are accessed from multiple threads. None of the threads mutate the value. Mutation only occurs when the config file is reloaded.

Because the values can change, accessing them requires some form of synchronization. But because they change so rarely, I do not want to use mutexes: taking a lock on every read seems like a lot of overhead just to guard against an occasional reload.

I could go lower-level and use atomic operations directly. For example, I could make the Config object immutable, with an atomic pointer to the latest version:
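
Something like this (a minimal sketch, assuming C++; the names Config and g_config are mine):

    #include <atomic>
    #include <string>

    struct Config {
        // Immutable after construction; a reload builds a brand-new instance.
        std::string listen_address;
        int max_connections;
    };

    // Always points to the latest version; readers never mutate the Config itself.
    std::atomic<const Config*> g_config{nullptr};

    // Reader threads:
    //   const Config* cfg = g_config.load(std::memory_order_acquire);
    //
    // Reload path (single writer):
    //   g_config.store(new_cfg, std::memory_order_release);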

However, atomic operations themselves are not free either. I have a hard time finding information on exactly what sort of overhead they impose (CPU pipeline stalls? Some sort of communication overhead between CPU cores? Do they limit concurrency? I'm not sure), but I get the feeling that it's better to avoid them when possible.

So I got the idea of caching the config pointer for a limited amount of time, e.g. for 1 second. The cached pointer is accessed without synchronization. But this assumes that time lookups are less expensive and have less impact on concurrency than atomic pointer operations. Is this true?
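
For concreteness, the caching scheme might look roughly like this (again just a sketch; the thread_local cache and the 1-second window are my own illustration):

    #include <atomic>
    #include <chrono>

    struct Config;                               // as defined above
    extern std::atomic<const Config*> g_config;  // the authoritative pointer

    const Config* get_config() {
        using clock = std::chrono::steady_clock;
        // Per-thread cache, so the common path does no synchronization at all.
        thread_local const Config* cached = nullptr;
        thread_local clock::time_point last_refresh{};

        auto now = clock::now();
        if (cached == nullptr || now - last_refresh >= std::chrono::seconds(1)) {
            cached = g_config.load(std::memory_order_acquire);  // the part I want to avoid
            last_refresh = now;
        }
        return cached;  // plain, unsynchronized read the rest of the time
    }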

So my main question is: are time lookups cheaper, and do they have less impact on concurrency, than atomic pointer loads?

My secondary questions (for better understanding the problem) are: what kind of overhead do atomic operations actually impose, and do they limit concurrency between readers?

Additional information about my environment and use case:

Upvotes: 1

Views: 56

Answers (1)

usr

Reputation: 171178

I'll give a partial answer:

All low-level languages have a way to cheaply and atomically load a value from memory. This does not require an interlocked access on x86; in fact, on x86 it is just a regular load instruction. All we need to do is prevent the compiler and runtime (if any) from reordering this memory access. Essentially, it is just a compiler barrier.

In C# this facility is surfaced through the Volatile class. In C++ there is now std::atomic. I'm not sure what consistency level is required; it's probably an acquire load (which is cheap on x86). MSVC also applies these semantics to volatile variables; I do not know about other compilers. The C standard certainly does not specify these semantics for volatile, but some compilers do.
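
In C++ terms, the kind of load I mean is just this (a sketch; the Config/g_config names mirror the question):

    #include <atomic>

    struct Config;                               // immutable snapshot, as in the question
    extern std::atomic<const Config*> g_config;

    const Config* load_config() {
        // No locked/interlocked instruction on x86: just a regular load,
        // plus a restriction on compiler (and runtime) reordering.
        return g_config.load(std::memory_order_acquire);
    }

On x86 the acquire load compiles down to an ordinary mov; memory_order_acquire mainly keeps the compiler from reordering around it. The C# equivalent would be Volatile.Read on the field holding the reference.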

This facility is so cheap that I think you can stop looking for anything else. The load should almost always hit a CPU cache line that is in shared state (except right after a store has been made).

See Herb Sutter's "Atomic Weapons" talk series for more details. Frankly, he's also a more reliable source than I am.

Upvotes: 1
