In which C++ scenarios must you use the seq_cst memory order?

Synchronization strength and execution cost increase in the same order (weakest to strongest):

relaxed < consume < acquire/release < acq_rel < seq_cst

But each memory order actually has its own characteristics:

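Memory Order	Type	Synchronization guarantee	Reordering restriction	Performance cost	Typical use
relaxed	Weakest	Atomicity only	Almost none	Lowest	Counters, status flags that guard no other data
acquire	One-way synchronization	Read barrier	Later reads and writes cannot be reordered before this operation	Medium	Reading shared data, acquiring a lock
release	One-way synchronization	Write barrier	Earlier reads and writes cannot be reordered after this operation	Medium	Writing shared data, releasing a lock
acq_rel	Two-way synchronization	Read-write barrier	Reads and writes on either side cannot cross this operation	Higher	Read-modify-write (RMW) operations
seq_cst	Strongest	Total order	All seq_cst operations appear in a single total order	Highest	Scenarios that need a strict global order
consume	Data dependency	Dependency ordering	Applies only to data-dependent operations	Low	Passing pointers/references (not recommended)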
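As a rough illustration of the "relaxed" row, here is a minimal sketch of a shared counter. The function name `count_with_relaxed` and its parameters are made up for this example. It only shows that relaxed ordering still guarantees atomicity of each increment. It says nothing about the ordering of other memory accesses.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: several threads bump one counter with relaxed ordering.
int count_with_relaxed(int num_threads, int increments_per_thread) {
    assert(num_threads >= 0 && increments_per_thread >= 0);
    std::atomic<int> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                // Atomic increment; no ordering is imposed on other memory.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    // join() synchronizes with each thread's completion, so the final
    // read below sees every increment.
    for (auto& w : workers) w.join();
    return counter.load(std::memory_order_relaxed);
}
```

No increment is lost even though no ordering is requested, which is why relaxed is usually enough for pure statistics counters.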

This answer covers only the parts related to seq_cst; I can add more if needed.

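In short:
Whenever one thread depends on data written by another thread, use a memory order stronger than relaxed, for example acquire/release (acq_rel) or seq_cst.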

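On some architectures (such as x86), acquire/release ordering is almost free: a release store compiles to a plain store, whereas a seq_cst store typically needs a full barrier (for example xchg, or mov followed by mfence).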
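The rule of thumb above, where one thread consumes data that another thread produced, can be sketched with a release store paired with an acquire load. The names `payload`, `ready` and `run_message_passing` are hypothetical and exist only for this illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names for illustration only.
int payload = 0;                  // plain, non-atomic data
std::atomic<bool> ready{false};   // flag that publishes payload

int run_message_passing() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;

    std::thread producer([] {
        payload = 42;                                  // 1. write the data
        ready.store(true, std::memory_order_release);  // 2. publish it
    });
    std::thread consumer([&seen] {
        // 3. spin until the flag is observed
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        // 4. the acquire load synchronizes with the release store,
        //    so the write to payload is visible here
        seen = payload;
        assert(seen == 42);
    });

    producer.join();
    consumer.join();
    return seen;
}
```

With relaxed on both sides, the read of `payload` would be a data race. Release/acquire is enough here because only one pair of threads and one flag are involved, so no global order across several variables is needed.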

The classic example below shows the key difference between acquire/release ordering (memory_order_acq_rel) and memory_order_seq_cst:

// Where seq_cst fits: multithreaded operations that need a strict global order
#include <atomic>

std::atomic<bool> x{false}, y{false};
std::atomic<int> z{0};

void thread1() {
    x.store(true, std::memory_order_seq_cst);
    if (y.load(std::memory_order_seq_cst)) {
        z.fetch_add(1, std::memory_order_seq_cst);
    }
}

void thread2() {
    y.store(true, std::memory_order_seq_cst);
    if (x.load(std::memory_order_seq_cst)) {
        z.fetch_add(1, std::memory_order_seq_cst);
    }
}

// Thread 1             // Thread 2
x.store(true)          y.store(true)
y.load()               x.load()

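Some possible seq_cst total orders:
1. x.store -> y.store -> y.load -> x.load  (both threads see true, z = 2)
2. y.store -> x.store -> y.load -> x.load  (both threads see true, z = 2)
3. x.store -> y.load -> y.store -> x.load  (Thread 1 sees y=false, Thread 2 sees x=true, z = 1)
4. y.store -> x.load -> x.store -> y.load  (Thread 2 sees x=false, Thread 1 sees y=true, z = 1)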

But these two outcomes can never occur together:
Thread 1: sees y=false
Thread 2: sees x=false

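Under seq_cst, z is therefore incremented at least once: the first operation in the total order must be one of the stores, and the other thread's load comes later in that order and sees it. With acquire/release ordering there is no such total order (memory_order_acq_rel is only valid on read-modify-write operations, so for plain stores and loads this means release stores and acquire loads). Each store may then be reordered after the following load of the other variable, so both threads can miss each other's write and z can stay at 0.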
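To try the example above, a small harness can run the two threads many times and record the smallest z observed. This is only a sketch: the helper names `run_store_buffer_once` and `min_z_over` are invented here, and the atomics are made local so every run starts from a clean state.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: runs the two-thread test once and returns z.
int run_store_buffer_once() {
    std::atomic<bool> x{false}, y{false};
    std::atomic<int> z{0};

    std::thread t1([&] {
        x.store(true, std::memory_order_seq_cst);
        if (y.load(std::memory_order_seq_cst)) {
            z.fetch_add(1, std::memory_order_seq_cst);
        }
    });
    std::thread t2([&] {
        y.store(true, std::memory_order_seq_cst);
        if (x.load(std::memory_order_seq_cst)) {
            z.fetch_add(1, std::memory_order_seq_cst);
        }
    });
    t1.join();
    t2.join();
    return z.load(std::memory_order_seq_cst);
}

// Hypothetical helper: repeats the test and returns the smallest z seen.
int min_z_over(int iterations) {
    assert(iterations > 0);
    int min_z = 2;  // z can never exceed 2
    for (int i = 0; i < iterations; ++i) {
        int z = run_store_buffer_once();
        if (z < min_z) min_z = z;
    }
    return min_z;
}
```

With seq_cst the result of `min_z_over` is never 0. If the stores are changed to memory_order_release and the loads to memory_order_acquire, 0 becomes a permitted result, although whether it actually shows up depends on the hardware, the compiler and timing.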