In this post I will explain some principles and methods for achieving mutual exclusion. I wrote it mainly for myself, out of my need to properly understand the different approaches to achieving mutual exclusion.
First we need to define some key words:
Race condition – A situation in which two or more processes read/write the same shared data and the final result depends on exactly when each of them runs
Critical region / or / Critical section – The part of the program where shared resources are accessed; this is where race conditions can occur
Mutual exclusion – A state in which you make sure that while one process is using a shared resource (is inside its critical region), no other process is permitted to access the same resource
Ways of achieving mutual exclusion …
1| Disabling interrupts
Process switching usually happens when the CPU receives an interrupt (for example, from the clock). If a process temporarily disables all interrupts, it can get all of its “critical region job” done, because it won’t be switched out by the interrupt handler.
After the process finishes its job in the critical region, it enables interrupts again.
Downsides
- No effect on multi-CPU platforms
Disabling interrupts works only on the CPU that executes the disable instruction. If you run on a multiprocessor system, interrupts are disabled only for the current CPU, and you are still unprotected against a different process (running on a different CPU) entering your critical region.
- Cannot be trusted to a user process
This is too much power for a user-controlled process, because there is always the risk that it never enables interrupts again.
2| Busy waiting by using Peterson’s algorithm
This is an entirely software-based solution, which is explained here:
Downsides
- Uses busy waiting, wasting CPU cycles
- Priority Inversion problem
3| Busy waiting by using spinlocks
Spinlocks are fast and appropriate when the expected wait is short; otherwise they waste a lot of CPU time spinning, which leads to inefficient work.
Downsides
- Uses busy waiting, wasting CPU cycles
- Priority Inversion problem
4| Using the TSL (Test and Set Lock) or XCHG instruction
TSL usage is most easily explained with the following example, where we define two routines, “enter_critical_region” and “leave_critical_region”, in assembly language:

```
enter_critical_region:            ; function entry point
    tsl reg, flag                 ; Test and Set Lock: flag is the shared
                                  ; variable; it is copied into the register
                                  ; reg and flag is then atomically set to 1
    cmp reg, #0                   ; was flag zero on entry?
    jnz enter_critical_region     ; if reg is non-zero, flag was already set:
                                  ; the lock is held, so retry
    ret                           ; flag was zero on entry; tsl has now set it
                                  ; non-zero, so we have claimed the resource
                                  ; associated with flag

leave_critical_region:
    move flag, #0                 ; store 0 in flag, releasing the lock
    ret                           ; return to caller
```
While TSL is executing, the memory bus is locked, so no other CPU can access the memory. This way our shared lock variable is guaranteed to be read and updated by only one CPU at a time.
The x86 counterpart of TSL is the XCHG instruction. XCHG atomically exchanges the values of a register and a memory location. XCHG is used for low-level synchronization on all Intel x86 CPUs.
5| Semaphores
Semaphores were proposed by E. W. Dijkstra (1965) as a variable that counts the number of pending wakeups. He proposed two operations on a semaphore – up and down (generalizations of wakeup and sleep).
Down operation – Checks whether the semaphore value is above 0; if so, it decrements it by 1 and continues. If the value is 0, the process is put to sleep without completing its decrement.
Up operation – Increments the semaphore value by 1. If there are sleeping processes that haven’t finished their down operations, the kernel wakes one of them, and it completes its down, decrementing the just-incremented value back to 0. No process ever blocks doing an up.
Checking the value, changing it, and possibly going to sleep are all done as a single atomic operation. This atomicity is crucial for being able to solve synchronization problems.
Semaphore implementations typically use TSL (or interrupt disabling) to protect the update of the semaphore value itself.
There is a term “binary semaphore”, which is a semaphore that stays strictly in the 0..1 value range. This type of semaphore is much simpler and can be used to achieve mutual exclusion.
6| Mutex
A mutex is a shared variable that can be in one of two states: locked or unlocked. It is used to gain access to a shared resource and to make sure you are the only one inside.
If you try to acquire a mutex you have the following cases:
- The mutex is not locked – You get the lock and continue
- The mutex is already locked – You block until the mutex is unlocked
If multiple threads are waiting on a locked mutex, then whenever the mutex gets unlocked, the CPU scheduler decides which thread gets access.
It can be used entirely with user-level threads, without the need for any kernel calls.
Mutexes behave well under heavy contention because waiting threads sleep instead of spinning, but each block/wake cycle means switching into the kernel, which is expensive.
7| Futex in Linux
Futex stands for “fast user-space mutex”. This is a Linux feature which, much like a mutex, provides basic locking, but it avoids entering the kernel unless it has to.
It consists of the following two parts:
- Kernel service – Manages a wait queue, when one is needed
- User-space library – Talks to the kernel only if there is a need
Whenever there is contention and a process needs to wait for the resource, the library calls the kernel service, which adds the process to the futex’s wait queue.
Some historic info about futexes: