In this session, we'll be looking at the synchronization primitives available in ARMv8-M. By the end of this session, you should be able to explain the rationale for including exclusive access instructions in the ARM architecture and determine how to use those instructions. You'll be able to analyze the need for a global exclusive monitor in a system design, and also be able to determine when you need to add memory barriers when using exclusive access instructions.

We'll start by looking at the problem we're trying to solve by including these new memory access instructions. The instructions an ARMv8-M processor will execute can be categorized as either memory accesses, that is loads and stores, or data processing instructions. Typically, we combine these two types of instructions to create a read-modify-write sequence. If we have other threads or other devices accessing the memory, then this sequence needs to be atomic for the algorithm to work correctly. If the memory is accessed in the middle of our read-modify-write sequence, our update might not be correct and it might not be safe. Here we can see an example where the section which needs to be made atomic has been highlighted.

If our system only uses a single thread, then atomicity is implicit, assuming we discount interrupt handlers. But as soon as we introduce a multithreaded situation, any data which is shared between threads is vulnerable. Once one thread has read the data, it will perform an action based on the data it has read, maybe including writing back to modify that data. If another thread can modify the data before we do that, then that can corrupt the results of the operation. It effectively creates a race condition between the two threads. This race can violate the atomicity we require and can also lead to unpredictable execution of our program. In a single-core system, we can probably solve this problem just by disabling interrupts.
That will prevent a new thread from being scheduled, although it may not be possible because of requirements to meet certain interrupt latencies. If we have a multi-core system, this solution won't work anyway. We're going to move on to look at alternative solutions to this problem.

If you're familiar with higher-level programming, you've probably come across the concepts of mutexes or semaphores, or other access control structures. These allow a programmer to define a critical section in code, wrapped by claiming one of these access control tokens. In the example on the slide, you can see we're claiming a mutex before we begin accessing the data, before we begin the read-modify-write pattern. We will only attempt the access if we can successfully claim the mutex, and the fact that we own the mutex, which is a token that gives us mutually exclusive ownership of that memory location, will prevent other threads from accessing that memory. When we've finished our operation, we release the mutex.

This should be a pattern which is familiar to most people from high-level languages. But what we're going to look at is, well, what does that mutex do in the ARM architecture? How do we implement the mutex? It's quite an interesting problem. Let's look at a simple implementation of that mutex function. The mutex itself is a software token, so we can represent it as a variable or a value in memory. To claim the mutex, we have to read the mutex value and check it's not currently locked. If it isn't currently locked, we can claim it, and we do this by updating the value of our mutex variable in memory. Unfortunately, you can see we've devolved the problem into another read-modify-write pattern, which is vulnerable to the same problems we've just been looking at. If somebody else tries to claim the mutex at the same time as our code, and the accesses interleave in just the right way, it could lead to a situation where both pieces of code think they own the mutex.
That is clearly unsafe. There are a number of ways we can solve this problem. It's possible to solve it entirely in software. This is a solved problem in computer science. You may want to go away and look up something like a ticket lock or Lamport's bakery algorithm. We can solve this entirely in software, usually at the expense of extra code or by consuming more memory.

But the common solution that many computer architectures take is to implement special memory access instructions to solve this problem. One potential solution is to have an atomic read-modify-write instruction. It's a solution many architectures take, but this isn't available to us in ARMv8-M. But we do have special memory access instructions which, rather than providing full atomicity, can detect if the memory has been written to since we last read it. This allows us to safely perform the final update, because we can abandon it if the memory has been updated since we read the initial value that our decision is based upon.