Building Working Spin Locks with Test-And-Set
In this lesson, you will learn the correct implementation of a lock, the test-and-set method.
We'll cover the following
Because disabling interrupts does not work on multiple processors, and because simple approaches using loads and stores (as you saw in the previous lesson) don’t work, system designers started to invent hardware support for locking. The earliest multiprocessor systems, such as
Test and set
The simplest bit of hardware support to understand is known as a test-and-set (or
int TestAndSet(int *old_ptr, int new) {int old = *old_ptr; // fetch old value at old_ptr*old_ptr = new; // store ’new’ into old_ptrreturn old; // return the old value}
ASIDE: DEKKER’S AND PETERSON’S ALGORITHMS
In the 1960s, Dijkstra posed the concurrency problem to his friends, and one of them, a mathematician named Theodorus Jozef Dekker, came up with
. Unlike the solutions discussed here, which use special hardware instructions and even OS support, Dekker’s algorithm uses just loads and stores (assuming they are atomic with respect to each other, which was true on early hardware). a solution “Cooperating sequential processes” by Edsger W. Dijkstra. 1968. Available online here: http://www.cs.utexas.edu/users/EWD/ewd01xx/EWD123.PDF. One of the early semi- nal papers. Discusses how Dijkstra posed the original concurrency problem, and Dekker’s solution. Dekker’s approach was later refined by
. Once again, just loads and stores are used, and the idea is to ensure that two threads never enter a critical section at the same time. Here is Peterson’s algorithm (for two threads); see if you can understand the code. What are the Peterson “Myths About the Mutual Exclusion Problem” by G.L. Peterson. Information Processing Letters, 12(3), pages 115–116, 1981. Peterson’s algorithm introduced here. flag
andturn
variables used for?int flag[2]; int turn; void init() { // indicate you intend to hold the lock w/ ’flag’ flag[0] = flag[1] = 0; // whose turn is it? (thread 0 or 1) turn = 0; } void lock() { // ’self’ is the thread ID of caller flag[self] = 1; // make it other thread’s turn turn = 1 - self; while ((flag[1-self] == 1) && (turn == 1 - self)) ; // spin-wait while it’s not your turn } void unlock() { // simply undo your intent flag[self] = 0; }
For some reason, developing locks that work without special hardware support became all the rage for a while, giving theory-types a lot of problems to work on. Of course, this line of work became quite useless when people realized it is much easier to assume a little hardware support (and indeed that support had been around from the earliest days of multiprocessing). Further, algorithms like the ones above don’t work on modern hardware (due to relaxed memory consistency models), thus making them even less useful than they were before. Yet more research relegated to the dustbin of history.
A spin lock
What the test-and-set instruction does is as follows. It returns the old value pointed to by the old_ptr
, and simultaneously updates said value to new
. The key, of course, is that this sequence of operations is performed atomically. The reason it is called “test and set” is that it enables you to “test” the old value (which is what is returned) while simultaneously “setting” the memory location to a new value; as it turns out, this slightly more powerful instruction is enough to build a simple spin lock, as we now examine in the code snippet below. Or better yet: figure it out first yourself!
typedef struct __lock_t {int flag;} lock_t;void init(lock_t *lock) {// 0: lock is available, 1: lock is heldlock->flag = 0;}void lock(lock_t *lock) {while (TestAndSet(&lock->flag, 1) == 1); // spin-wait (do nothing)}void unlock(lock_t *lock) {lock->flag = 0;}
Explanation
Let’s make sure you understand why this lock works. Imagine first the case where a thread calls lock()
and no other thread currently holds the lock; thus, flag
should be 0. When the thread calls TestAndSet(flag, 1)
, the routine will return the old value of flag
, which is 0; thus, the calling thread, which is testing the value of the flag, will not get caught spinning in the while loop and will acquire the lock. The thread will also atomically set the value to 1, thus indicating that the lock is now held. When the thread is finished with its critical section, it calls unlock()
to set the flag back to zero.
The second case you can imagine arises when one thread already has the lock held (i.e., flag
is 1). In this case, this thread will call lock()
and then call TestAndSet(flag, 1)
as well. This time, TestAndSet()
will return the old value at flag, which is 1 (because the lock is held), while simultaneously setting it to 1 again. As long as the lock is held by another thread, TestAndSet()
will repeatedly return 1, and thus this thread will spin and spin until the lock is finally released. When the flag is finally set to 0 by some other thread, this thread will call TestAndSet()
again, which will now return 0 while atomically setting the value to 1 and thus acquire the lock and enter the critical section.
By making both the test (of the old lock value) and set (of the new value) a single atomic operation, it is ensured that only one thread acquires the lock. And that’s how to build a working mutual exclusion primitive!
TIP: THINK ABOUT CONCURRENCY AS A MALICIOUS SCHEDULER
From this example, you might get a sense of the approach you need to take to understand concurrent execution. What you should try to do is to pretend you are a malicious scheduler, one that interrupts threads at the most inopportune of times in order to foil their feeble attempts at building synchronization primitives. What a mean scheduler you are! Although the exact sequence of interrupts may be improbable, it is possible, and that is all you need to demonstrate that a particular approach does not work. It can be useful to think maliciously! (at least, sometimes)
You may also now understand why this type of lock is usually referred to as a spin lock. It is the simplest type of lock to build, and simply spins, using CPU cycles, until the lock becomes available. To work correctly on a single processor, it requires a preemptive scheduler (i.e., one that will interrupt a thread via a timer, in order to run a different thread, from time to time). Without preemption, spin locks don’t make much sense on a single CPU, as a thread spinning on a CPU will never relinquish it.
Get hands-on with 1400+ tech skills courses.