Deadlock Bugs
This lesson introduces deadlocks with examples and then presents conditions for a deadlock to occur.
We'll cover the following
Beyond the concurrency bugs mentioned in the previous lesson, a classic problem that arises in many concurrent systems with complex locking protocols is known as deadlock. Deadlock occurs, for example, when a thread (say Thread 1) is holding a lock (L1
) and waiting for another one (L2
); unfortunately, the thread (Thread 2) that holds lock L2
is waiting for L1
to be released. Here is a code snippet that demonstrates such a potential deadlock:
Thread 1: Thread 2:pthread_mutex_lock(L1); pthread_mutex_lock(L2);pthread_mutex_lock(L2); pthread_mutex_lock(L1);
Note that if this code runs, deadlock does not necessarily occur; rather, it may occur if, for example, Thread 1 grabs lock L1
and then a context switch occurs to Thread 2. At that point, Thread 2 grabs L2
, and tries to acquire L1
. Thus we have a deadlock, as each thread is waiting for the other and neither can run. See the figure below for a graphical depiction; the presence of a cycle in the graph is indicative of the deadlock.
The figure should make the problem clear. How should programmers write code so as to handle deadlock in some way?
CRUX: HOW TO DEAL WITH DEADLOCK
How should we build systems to prevent, avoid, or at least detect and recover from deadlock? Is this a real problem in systems today?
Why do deadlocks occur?
As you may be thinking, simple deadlocks such as the one above seem readily avoidable. For example, if Thread 1 and 2 both made sure to grab locks in the same order, the deadlock would never arise. So why do deadlocks happen?
One reason is that in large codebases, complex dependencies arise between components. Take the operating system, for example. The virtual memory system might need to access the file system in order to page in a block from disk; the file system might subsequently require a page of memory to read the block into and thus contact the virtual memory system. Thus, the design of locking strategies in large systems must be carefully done to avoid deadlock in the case of circular dependencies that may occur naturally in the code.
Another reason is due to the nature of encapsulation. As software developers are taught to hide details of implementations and thus make software easier to build in a modular way. Unfortunately, such modularity does not mesh well with locking. As Vector
class and the method AddAll()
. This routine would be called as follows:
Vector v1, v2;v1.AddAll(v2);
Internally, because the method needs to be multi-thread safe, locks for both the vector being added to (v1
) and the parameter (v2
) need to be acquired. The routine acquires said locks in some arbitrary order (say v1
then v2
) in order to add the contents of v2
to v1
. Suppose some other thread calls v2.AddAll(v1)
at nearly the same time, we have the potential for deadlock, all in a way that is quite hidden from the calling application.
Conditions for deadlock
-
Mutual exclusion: Threads claim exclusive control of resources that they require (e.g., a thread grabs a lock).
-
Hold-and-wait: Threads hold resources allocated to them(e.g., locks that they have already acquired) while waiting for additional resources (e.g., locks that they wish to acquire).
-
No preemption: Resources (e.g., locks) cannot be forcibly removed from threads that are holding them.
-
Circular wait: There exists a circular chain of threads such that each thread holds one or more resources (e.g., locks) that are being requested by the next thread in the chain.
If any of these four conditions are not met, deadlock cannot occur. Thus, we first explore techniques to prevent deadlock; each of these strategies seeks to prevent one of the above conditions from arising and thus is one approach to handling the deadlock problem.
Get hands-on with 1400+ tech skills courses.