This page describes atomic operations implemented with a CPU's load-exclusive and store-exclusive instructions.
...
One implementation approach is to use the strictest memory ordering everywhere. This is the most likely to behave correctly, but it sacrifices performance. The test-and-set example above illustrates this approach: it calls the *_explicit functions with the memory_order_seq_cst memory order.
The following snippet from stdatomic.h shows the memory orderings that can be selected, depending on how the atomic is used.
...
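As a rough illustration of choosing a weaker ordering than memory_order_seq_cst, the sketch below builds a simple spinlock from the C11 atomic_flag functions, using memory_order_acquire when taking the lock and memory_order_release when dropping it. The demo_spinlock type, the DEMO_SPINLOCK_INIT macro, and the demo_spinlock_* function names are hypothetical and do not come from this page. Whether acquire/release ordering is sufficient depends on the application, so treat this as a starting point under that assumption rather than a recommendation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical spinlock type for illustration only (not from this page). */
typedef struct {
    atomic_flag flag;
} demo_spinlock;

#define DEMO_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once to take the lock. Returns true if this caller now holds it. */
static inline bool demo_spinlock_try_acquire(demo_spinlock *lock)
{
    assert(lock != NULL);
    /* test_and_set returns the previous value: false means the flag was
     * clear and we set it. Acquire ordering keeps accesses in the critical
     * section from being reordered before the lock is taken. */
    return !atomic_flag_test_and_set_explicit(&lock->flag,
                                              memory_order_acquire);
}

/* Spin until the lock is taken. */
static inline void demo_spinlock_acquire(demo_spinlock *lock)
{
    while (!demo_spinlock_try_acquire(lock)) {
        /* busy-wait; a real system might yield or use WFE here */
    }
}

/* Drop the lock. Release ordering makes writes done inside the critical
 * section visible to the next caller that acquires the lock. */
static inline void demo_spinlock_release(demo_spinlock *lock)
{
    assert(lock != NULL);
    atomic_flag_clear_explicit(&lock->flag, memory_order_release);
}
```

A caller would declare `demo_spinlock lock = DEMO_SPINLOCK_INIT;`, then wrap the critical section between `demo_spinlock_acquire(&lock)` and `demo_spinlock_release(&lock)`. Replacing both orderings with memory_order_seq_cst gives the stricter variant described above.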
Atomic operations inherently require more instructions than plain memory accesses, so atomic variables are slower than normal variables. The following disassembly shows the code generated to declare and initialize an atomic variable and then increment it. Note that by default the code includes memory barriers (the dmb instructions), which affect performance.
...
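The barriers come from the default sequentially consistent ordering. Where an atomic is only used as a counter and does not publish other data, a relaxed ordering may be enough. The sketch below contrasts the two forms. The function names are hypothetical, and the exact instructions emitted depend on the compiler and target; on ARMv7, compilers typically omit the dmb instructions around the exclusive load/store loop for the relaxed form, which can be confirmed by disassembling the result.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical helpers for illustration only (not from this page).
 * Each returns the counter value from before the increment. */

/* Default form: atomic_fetch_add implies memory_order_seq_cst, so the
 * compiler must also order this access against surrounding accesses. */
static inline unsigned count_event_seq_cst(atomic_uint *counter)
{
    assert(counter != NULL);
    return atomic_fetch_add(counter, 1u);
}

/* Relaxed form: the read-modify-write is still atomic, but no ordering
 * is imposed relative to other memory accesses. */
static inline unsigned count_event_relaxed(atomic_uint *counter)
{
    assert(counter != NULL);
    return atomic_fetch_add_explicit(counter, 1u, memory_order_relaxed);
}

/* Read the current count without imposing any ordering. */
static inline unsigned read_event_count(atomic_uint *counter)
{
    assert(counter != NULL);
    return atomic_load_explicit(counter, memory_order_relaxed);
}
```

The relaxed form is only appropriate when no other data is synchronized through the counter. If a reader uses the counter value to decide that other memory is ready to be read, a stronger ordering is needed.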
Zynq 7K implements the exclusive access monitors in the DDR controller, on each port of the controller. This likely means that a MicroBlaze in the PL and a Cortex-A9 in the PS cannot use a shared DDR memory location for exclusive locking, presumably because their accesses would arrive on different ports and so be tracked by different monitors. This conclusion is based only on the documentation and has not been prototyped.
Alternatives
The Xilinx Mutex or Mailbox IP might be an alternative when atomics cannot be used with the required memory.