My understanding of std::memory_order_acquire
and std::memory_order_release
is as follows:
Acquire means that no memory accesses which appear after the acquire fence can be reordered to before the fence.
Release means that no memory accesses which appear before the release fence can be reordered to after the fence.
What I don't understand is why with the C++11 atomics library in particular, the acquire fence is associated with load operations, while the release fence is associated with store operations.
To clarify, the C++11 <atomic>
library enables you to specify memory fences in two ways: either you can specify a fence as an extra argument to an atomic operation, like:
x.load(std::memory_order_acquire);
Or you can use std::memory_order_relaxed
and specify the fence separately, like:
x.load(std::memory_order_relaxed); std::atomic_thread_fence(std::memory_order_acquire);
What I don't understand is, given the above definitions of acquire and release, why does C++11 specifically associate acquire with load, and release with store? Yes, I've seen many of the examples that show how you can use an acquire/load with a release/store to synchronize between threads, but in general it seems that the idea of acquire fences (prevent memory reordering after statement) and release fences (prevent memory reordering before statement) is orthogonal to the idea of loads and stores.
So, why, for example, won't the compiler let me say:
x.store(10, std::memory_order_acquire);
I realize I can accomplish the above by using memory_order_relaxed
, and then a separate atomic_thread_fence(memory_order_acquire)
statement, but again, why can't I use store directly with memory_order_acquire
?
A possible use case for this might be if I want to ensure that some store, say x = 10
, happens before some other statement executes that might affect other threads.
2 Answers
Answers 1
Say I write some data, and then I write an indication that the data is now ready. It's imperative that no other thread who sees the indication that the data is ready not see the write of the data itself. So prior writes cannot move past that write.
Say I read that some data is ready. It's imperative that any reads I issue after seeing that take place after the read that saw that the data was ready. So subsequent reads cannot move behind that read.
So when you do a synchronized write, you typically need to make sure that all writes you did before that are visible to anyone who sees the synchronized write. And when you do a synchronized read, it's typically imperative that any reads you do after that take place after the synchronized read.
Or, to put it another way, an acquire is typically reading that you can take or access the resource, and subsequent reads and writes must not be moved before it. A release is typically writing that you are done with the resource, and preceding writes must not be moved to after it.
Answers 2
std::memory_order_acquire
fence only ensures all load operation after the fence is not reordered before any load operation before the fence, thus memory_order_acquire
cannot ensure the store is visible for other threads when after loads are executed. This is why memory_order_acquire
is not supported for store operation, you may need memory_order_seq_cst
to achieve the acquire of store.
As an alternative, you may say
x.store(10, std::memory_order_releaxed); x.load(std::memory_order_acquire); // this introduce a data dependency
to ensure all loads not reordered before the store. Again, the fence not work here.
Besides, memory order in atomic operation could be cheaper than a memory fence, because it only ensures the order relative to the atomic instruction, not all instruction before and after the fence.
See also formal description and explanation for detail.
0 comments:
Post a Comment