This article tries to articulate my understanding of C++ memory and ordering model. I learned this by reading the book “C++ Concurrency in Action” and found the examples pretty helpful.
If time permitted, I might write more about this (as you see (1) in the title!). Let’s assume all the variables we talk about bellowing is some type of std::atomic<T>
.
Two important relationships
Happens-before
Happens-before relationship specifies which operations see the effects of which other operations
- Let’s start from the simplest case, single-thread:
- Given certain object, if one operation is sequenced before another, then it also happens-before it.
- That is, even in the same thread, operations on two different objects may be re-ordered if there is no ordering constraint that is defined for those two.
- Given certain object, if one operation is sequenced before another, then it also happens-before it.
- Inter-thread happens-before is relatively simple and relies on the synchronize-with relation.
- One important property is happens-before is a transitive relation.
Synchronize-with
The synchronize-with relationship is something you can get only between operations on atomic types.
The basic idea is is that, given certain definition of rule for atomic variable (x
), write operation (W
) synchronize-with read (R
) operation on x
. We can say that reading an atomic variable triggers a synchronization with one writing operation to that variable.
- The interesting thing comes from the fact that in computer, let’s say
x
is modified several times and thus have several intermediate versions,{x_v1, x_v2, ...}
, which version ofx
is actually seen byR
might be different across several runs, resulting in several possible legal sequences. - Then the interesting quesion is: what is regarded as legal sequences?
Three Ordering Levels
These three ordering level looks pretty familar w.r.t. any sequence that requires consistency. Like Database transaction, distributed system coordination.
As what we we can guess from the name, from the strongest to the weakest: Sequential Ordering > Acquire-Release Ordering > Relaxed Ordering
To summary:
- Sequential: Once a snapshot is seen by one thread, that is also valid view (may be seen) for all threads, including all the values that have been written.
- Acquire-Release: Once a value is written (v2), next read must see at least that value of or a version after that, and this invariant hollds for all the readers.
- Relaxed: Once a value is written (v1), eventually, it will be seen, but any thread can see any version since v1, e.g., R1 sees v0, R2 sees v3. The only guarantee is R1 cannot see v4 then v0; R2 cannot see v3 then v2.
One small trick is to differentiate the logic sequence vs. the one global truth. That is, once a value is written, reader must see that does not necessarily mean program will generate that v1,v2 sequence, because you cannot control reader is scheduled to be run exactly after writer’s writing. So when we see one example, it’s a good idea to work with these invariant to generate test cases; walk though and see if some result is possible.
Here is two good examples to guide through the understanding of these three ordering levels.
#include <atomic>
#include <cassert>
#include <thread>
#if (defined) SEQ_ORDER
#define mem_order_read std::memory_order_seq_cst
#define mem_order_write std::memory_order_seq_cst
#elif (defined) ACQUIRE_RELASE
#define mem_order_read std::memory_order_acquire
#define mem_order_write std::memory_order_release
#else // RELAXED
#define mem_order_read std::memory_order_relaxed
#define mem_order_write std::memory_order_relaxed
#endif
std::atomic<bool> x, y;
std::atomic<int> val;
/**
* Example-1
* - Sequential: assertion not fire
* - Acquire-Release: assertion can fire
* - Relaxed: assertion can fire
*/
void write_x() { x.store(true, mem_order_write); }
void write_y() { x.store(true, mem_order_write); }
void read_x_then_y() {
while (!x.load(mem_order_read))
;if (y.load(mem_order_read)) {
val++;
}
}
void read_y_then_x() {
while (!y.load(mem_order_read))
;if (x.load(mem_order_read)) {
val++;
}
}
int main() {
false;
x = y = 0;
val = std::thread t1(write_x);
std::thread t2(write_y);
std::thread t3(read_x_then_y);
std::thread t4(read_y_then_x);
t1.join(); t2.join(); t3.join(); t4.join();assert(val.load() != 0); // fire or not?
}
/**
* Example-2
* sequential: will not fire
* Acquire-Release: will not fire
* Relaxed: may fire
*/
void write_x_then_y() {
true, mem_order_write);
x.store(true, mem_order_write);
y.store(
}
void read_y_then_x() {
while(!y.load(mem_order_read));
if(x.load(mem_order_read)) {
val++;
}
}
int main() {
false;
x = y = 0;
val = std::thread t1(write_x_then_y);
std::thread t2(read_y_then_x);
t1.join(); t2.join();assert(val.load() != 0); // fire or not?
return 0;
}
Sequential Consistent (total ordering)
- Sequential consistency implies a total order.
- The behavior of the program is consistent with a single sequential view of the world.
- If all operations on instances of atomic types are sequentially consistent, the behavior of a multithreaded program is as if all these operations were performed in some particular sequence by a single thread.
- ALL threads must see the same order of operations.
- If your code has one atomic operation before another in one thread, that ordering must be seen by other threads. (even though to different variables)
- It is the default ordering guarantee of those atomic variables’ overloaded member functions.
- From the programming level, it can be explicitly stated as
std::memory_order_seq_cst
.
Acquire-release Ordering
- A release operation synchronize-with an acquire operation that reads the value written.
- A pair of order:
std::memory_order_acquire
,std::memory_order_release
. - A release operation synchronize-with an acquire operation that reads the value written.
- In order to see the benefit of acquire-release ordering, you need to consider two stores from the same thread (exmple-2).
- We compare the above two by drawing the happen-before diagram, and the difference is rather obvious.
- Legend - arrow: happen-before; gray text: return value;
- Note the red dashed arrow is the key.
Relaxed Ordering
- Operations on atomic types performed with relaxed ordering don’t participate in synchronize-with relationships.
- It is
std::memory_order_relaxed
. - Operations on the same variable within a single thread still obey happens-before relationships, but there’s almost no requirement on ordering relative to other threads.
- The only requirement is that accesses to a single atomic variable from the same thread can’t be reordered.
- once a given thread has seen a particular value of an atomic variable, a subsequent read by that thread can’t retrieve an ealier value of an atomic variable.
Finally, thanks @lyf for pushing this to be public. :)
Happy new year!