COP 4610: OPERATING SYSTEMS & CONCURRENT PROGRAMMING

Deadlock

 

Topics


Deadlock


Example

Dining Philosophers Problem

We have separately discussed the Dining Philosophers problem, demonstrated that it can result in deadlock, and shown two ways to avoid deadlock in solutions to that problem. We will now look at some more examples, analyze more carefully what deadlock means, and look at a broader range of solutions.


Traffic Deadlock Example from Stallings' Text


The figure, which shows a traffic deadlock, makes the nature of deadlock very clear. None of the cars can pass unless at least one of them backs up or is removed.


Types of Resources


Deadlock comes from contention between processes (or threads) for resources. So, to understand deadlock we need to understand resources.

One author of an operating systems textbook has proposed that a resource be defined as "something a process can wait for".

A first step in understanding resources is to recognize that they can be divided into two kinds, according to whether they can be reused or not.


Reusable Resources


Reusable resources are the ones about which the most is known. Deadlocks involving them are comparatively easy to analyze, because each resource has a known, finite number of units.

You can think of reusable resources as collections, each of which contains one or more units. At a high level of abstraction, all the different modes of access can be modeled in terms of exclusive access to different numbers of units of a resource.

In the simplest case, a single-unit resource, the collection has just one unit. An example of a single-unit resource is a mutex. Only one thread may hold a mutex at a time. The only mode of access is exclusive.
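
As a concrete illustration (a minimal sketch of our own, not part of the original example), the POSIX mutex API exhibits exactly this single-unit, exclusive-access pattern:

#include <pthread.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;  /* the single "unit" */

void use_resource(void)
{
    pthread_mutex_lock(&m);    /* request; blocks until the unit is free */
    /* ... critical section: this thread has exclusive access ... */
    pthread_mutex_unlock(&m);  /* release the unit */
}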

We have previously seen the readers-writers synchronization problem. Reader-writer access to a resource can be modeled by pretending there are "tickets" for the resource, sufficiently many that every reader can have a ticket. These tickets are the logical units of the resource. For a reader to access the resource, it needs only one ticket. For a writer to access the resource, it needs all the tickets. Assuming the tickets themselves are accessed in exclusive mode, this model gives the correct behavior for reader-writer locks. That is, any number of concurrent readers can obtain tickets to access the resource, but a writer cannot succeed in getting all the tickets so long as any reader or any other writer is holding one.
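
One way to realize the ticket model directly is with a counting semaphore, as in the following sketch. The NTICKETS bound, the writer_turn mutex, and the function names are assumptions of this illustration, not part of the model itself; the mutex keeps two writers from each hoarding part of the ticket pool.

#include <pthread.h>
#include <semaphore.h>

#define NTICKETS 100   /* assumed upper bound on concurrent readers */

static sem_t tickets;                          /* the pool of tickets */
static pthread_mutex_t writer_turn = PTHREAD_MUTEX_INITIALIZER;

void rw_init(void) { sem_init(&tickets, 0, NTICKETS); }

void reader(void)
{
    sem_wait(&tickets);                /* a reader needs just one ticket */
    /* ... read the shared data ... */
    sem_post(&tickets);
}

void writer(void)
{
    pthread_mutex_lock(&writer_turn);  /* one writer at a time collects tickets */
    for (int i = 0; i < NTICKETS; i++)
        sem_wait(&tickets);            /* a writer needs all the tickets */
    /* ... write the shared data ... */
    for (int i = 0; i < NTICKETS; i++)
        sem_post(&tickets);
    pthread_mutex_unlock(&writer_turn);
}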

Unrestricted sharing can also be modeled, by pretending there is a very large number of tickets for the resource, and only a single ticket is required to access it. However, unrestricted sharing means no waiting, and so is not a mode we have to consider when we study deadlock.

The steps by which a process (or thread) acquires and releases a resource involve an alternation of control between the process and the operating system (or resource manager). The first step is for the process to ask for (request) the resource. This is at the initiative of the process. The second step is for the operating system (or resource manager) to grant the resource. This is the only step that is under the control of the operating system, and the only aspect the operating system can control is when (if ever) to grant the request. After the resource has been granted, the process is said to have acquired it. The third step is for the process to release the resource. This is again under the control of the process.

This alternation can be viewed like a game, in which the players alternate moves. In the deadlock game the operating system loses if a deadlock develops.

Questions:

What are examples of reusable resources?

Processors, I/O channels, main and secondary memory, files, databases, and semaphores


Consumable Resources


Deadlocks involving consumable resources are harder to analyze than those involving only reusable resources. The main problem is that the conditions under which a consumable resource may be produced can depend in an arbitrarily complex way on the internal logic of processes. Given that the operating system has to deal with an unknown and arbitrary collection of user processes, cannot see into the processes, and could not predict their actions even if it could see into them (remember the Halting Problem), there is no general way for the operating system to prevent, avoid, or even recognize the existence of a deadlock involving consumable resources. For this reason, dealing with consumable resource deadlocks is outside the scope of our present study of operating systems.

By the way, this difficulty with deadlock is arguably one of the disadvantages of message-based synchronization. By comparison, if we can solve a problem using simple mutual exclusion alone (that is, single-unit reusable resources) there are simple design techniques that guarantee there is no possibility of deadlock.

Questions:

What are examples of consumable resources?

Hardware interrupts, Unix signals, pthread_cond_signal calls, messages, and information in I/O buffers.


Deadlock Example with Reusable Resources

Process P1:
request(R1); ... request(R2); ... release(R2); ... release(R1);
  R1 critical section
  R2 critical section  

Process P2:
request(R2); ... request(R1); ... release(R1); ... release(R2);
  R1 critical section  
  R2 critical section


The pseudocode above shows two processes, each of which has nested critical sections for the same pair of resources. They request the resources in different orders, so there is a possibility of deadlock. Suppose P1 requests R1, and that is granted. Then P2 requests R2, and that is granted. Then P1 requests R2, and that cannot be granted because R2 is held by P2, so P1 blocks. Then P2 requests R1, and that cannot be granted because R1 is held by P1, so P2 blocks. The two processes are now deadlocked.
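
The same scenario can be reproduced with POSIX threads. The sketch below is ours (the names, and the sleep calls added only to make the bad interleaving likely, are assumptions of the illustration); it usually hangs when run, with each thread holding one mutex and waiting for the other.

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t r1 = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t r2 = PTHREAD_MUTEX_INITIALIZER;

static void *p1(void *arg)
{
    pthread_mutex_lock(&r1);    /* P1 acquires R1 */
    sleep(1);                   /* give P2 time to acquire R2 */
    pthread_mutex_lock(&r2);    /* blocks: R2 is held by P2 */
    pthread_mutex_unlock(&r2);
    pthread_mutex_unlock(&r1);
    return NULL;
}

static void *p2(void *arg)
{
    pthread_mutex_lock(&r2);    /* P2 acquires R2 */
    sleep(1);                   /* give P1 time to acquire R1 */
    pthread_mutex_lock(&r1);    /* blocks: R1 is held by P1 */
    pthread_mutex_unlock(&r1);
    pthread_mutex_unlock(&r2);
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, p1, NULL);
    pthread_create(&t2, NULL, p2, NULL);
    pthread_join(t1, NULL);     /* never returns if the deadlock occurs */
    pthread_join(t2, NULL);
    printf("no deadlock this time\n");
    return 0;
}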


Resource Allocation Graph of a Deadlocked System


The figure shows the deadlock that is possible for the two pseudocode processes above, modeled by a (single-unit) resource allocation graph. This is the simplest of several graph models for resource allocation we will consider below. (The others involve multiunit and consumable resources.) There are two types of nodes: process nodes (shown as squares) and resource nodes (shown as circles). There are two types of edges: held-by edges (shown as solid arrows) go from a resource to the process that is holding it; wait-for edges (shown as dashed arrows) go from a process to the resource it is waiting for. Notice that there is a cycle in the graph. A cycle in the resource allocation graph is a necessary condition for deadlock (in every model). For single-unit resources (but not in the more general models) it is also a sufficient condition.


Deadlock Trajectory Diagram


The figure shows a trajectory diagram for the same pair of processes P1 and P2 as the pseudocode and resource allocation graph shown above.

A trajectory diagram is useful for visualizing how a system can get into a deadlock. We can draw such a diagram if we have just two processes and the execution of each process can be viewed as a simple sequence of steps (i.e., we don't have to consider conditional control flow or looping). The combined state of the two processes corresponds to a point (X, Y) in the plane, where X is the state of one process and Y is the state of the other. In our example, the progress of process P1 is plotted on the Y (vertical) axis and the progress of process P2 is plotted on the X (horizontal) axis.

The pink shaded regions correspond to states that are inside the critical region for resource R1. The darker pink region covers the states in which both processes would need to hold R1, which would violate mutual exclusion, and so that region is forbidden.

The grey shaded regions correspond to states that are inside the critical region for resource R2. The darker grey region covers the states in which both processes would need to hold R2, which would violate mutual exclusion, and that region is forbidden.

The white dots correspond to states whose entry is under the control of the OS, and black dots represent states whose entry is under the control of the application. The system start state is shown at the lower left corner. In the start state, neither P1 nor P2 has started execution, so neither is holding any resources.

The green lines show two possible system trajectories. The solid green line shows a trajectory that leads to deadlock. It is deadlock because there is no way the system can make progress without entering one of the forbidden regions. Here, progress in process P1 means movement upward and progress in process P2 means movement toward the right.

Notice that the deadlock state is in the corner of a concave pocket that is formed by the intersection of the two forbidden regions. This pocket is a trap. Once we get into it we cannot get out without backing up. Since preemption is not allowed, the system cannot force a process to back up.

The problem here can be viewed as a board game in which the processes are playing against the operating system. The goal of the operating system is to not get into a deadlock state. The goal of the processes is to try to force the operating system into a deadlock. The moves of the processes are resource requests and resource releases. The moves of the operating system are resource grants. The only choice of the operating system is how long it postpones granting a request.

The dashed green line shows an alternate trajectory, which enables both processes to complete without deadlock. The critical difference between the two trajectories is at the point where P1 requests R1. If the system grants the request immediately (which it can, because P2 is not using R1 at the time), ultimate deadlock is unavoidable. If the system postpones granting P1's request for R1, P2 is given time to release R2, and there will be no deadlock. A good resource manager can force the system along the safe trajectory, since the decision when to allocate a resource is up to the resource manager.


Example of Deadlock with Consumable Resources

P1                            P2
...                           ...
receivefrom(P2, &M2);         receivefrom(P1, &M1);
...                           ...
sendto(P2, M1);               sendto(P1, M2);

Deadlock occurs if the receivefrom operation is blocking.


The pseudocode above shows how deadlock is possible with message passing. Processes P1 and P2 each wait for a message from the other, and while they are waiting neither is able to send the message that the other is waiting for. This kind of deadlock is difficult to prevent unless one imposes some constraints (protocols) on the sending and receiving of messages.
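
Here is a small sketch of the same situation using two pipes and blocking reads (the use of pipes, and all the names, are assumptions of this illustration; the original pseudocode does not specify the message mechanism). Each thread performs its receive before its send, so both block forever.

#include <pthread.h>
#include <unistd.h>

static int to_p1[2], to_p2[2];   /* pipe ends: [0] = read, [1] = write */

static void *p1(void *arg)
{
    char m2;
    read(to_p1[0], &m2, 1);      /* receivefrom(P2, &M2): blocks forever */
    write(to_p2[1], "1", 1);     /* sendto(P2, M1): never reached */
    return NULL;
}

static void *p2(void *arg)
{
    char m1;
    read(to_p2[0], &m1, 1);      /* receivefrom(P1, &M1): blocks forever */
    write(to_p1[1], "2", 1);     /* sendto(P1, M2): never reached */
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pipe(to_p1);
    pipe(to_p2);
    pthread_create(&t1, NULL, p1, NULL);
    pthread_create(&t2, NULL, p2, NULL);
    pthread_join(t1, NULL);      /* never returns: the threads are deadlocked */
    pthread_join(t2, NULL);
    return 0;
}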


Natural Questions about Deadlock

What are the fundamental causes of deadlock?

How can we deal with deadlock?


Now that we have looked at examples of some specific cases of deadlock, it is time to look at deadlock from a general point of view.


The 4 Necessary Conditions for Deadlock

  1. Exclusive access (mutual exclusion)
  2. Wait while holding (hold-and-wait)
  3. No preemption
  4. Circular wait


In order for there to be a deadlock, all of the above conditions must be true. You can observe that they are true in the examples we considered. Richard Holt, in a PhD dissertation published in the 1970's, showed that these conditions must apply in any deadlock. You can verify that informally, by looking at each condition and convincing yourself that if the condition is not true, there is no deadlock.


The 3 Approaches/Strategies for Dealing with Deadlock


Questions:

How can we deal with deadlock?

What is the difference between prevention and avoidance?

What are examples of strategies for prevention?

There are only three approaches to dealing with deadlock. We can allow it to happen. In that case we need to be able to detect it when it has happened, and then deal with it. Alternatively, we can try to make sure deadlock does not happen. There are two general approaches to that. One is called prevention and the other is called avoidance. Prevention means always restricting what we do in such a way that there is no possibility of deadlock. Avoidance means waiting to impose restrictions until we are close to a deadlock, and then "steering around" the deadlock.

Experience has shown that some people have difficulty distinguishing prevention from avoidance. The technical way to understand the difference is in terms of safe and unsafe states, which we will get to a bit later. A more intuitive way is by analogy to other situations in which a person needs to deal with a potential hazard.

Analogies:

Suppose there is an open hole in the middle of a street. One way to prevent people from falling into the hole is to cover it. Another is to put up a barricade around it. A third prevention strategy would be to never use that street. A way to avoid falling into the hole is to look where you are going and steer around it.

Suppose a person does not want to get a dangerous communicable disease, such as AIDS. A prevention strategy would be to not engage in any of the activities by which the disease can be transmitted. An avoidance strategy would be to engage in the risky activities, but try not to engage in them with anyone who is a carrier of the disease.

In general, hazard prevention is safer and simpler than avoidance, but it is more limiting. In designing software systems, it is a good idea to design in a way that prevents hazards, to the extent the system requirements allow you to do so.

With deadlock, the prevention approach is to impose some restriction on how resources are used that assures there can never be a deadlock. The avoidance approach is to impose no restrictions on resource usage until there is a danger the system will deadlock, and then to guide the system around the deadlock by postponing some allocations.


Deadlock Prevention Approaches

All of the four conditions are necessary for deadlock to occur.
Hence, by preventing any one of them we prevent deadlock.

  1. Exclusive access (mutual exclusion)
  2. Wait while holding (hold-and-wait)
  3. No preemption
  4. Circular wait

If we look at each of the four necessary conditions for deadlock, we can see how deadlock might be prevented by denying that condition. Whether one of these prevention strategies can apply to a given resource depends on the nature of the resource and how it is used. For example, ordered allocation works well for mutexes, but preemption of a mutex is not acceptable.

Questions:

What happens to a process if resources it is holding are preempted?

Is it safe for the process to just go on executing?

Short of killing the process that is preempted, how might this problem be addressed?


Cycle Implies Different Allocation Orders


For a hold-and-wait cycle to exist, one process must request a pair of resources in a different order than another. In the diagram, P1 requested R1 (which it is holding) and then requested R2, but P2 requested R2 (which it is holding) and then requested R1.

In general, if we number the resource nodes and walk around a cycle in the resource allocation graph, there must come a point where we go from a higher-numbered resource to a lower-numbered resource. It follows that if we never request a resource while holding a higher-numbered resource, there can be no cycle.


Ordered Allocation

Process P1:
request(R1); ... request(R2); ... release(R2); ... release(R1);
  R1 critical section
  R2 critical section  

Process P2:
request(R1); ... request(R2); ... release(R2); ... release(R1);
  R1 critical section
  R2 critical section  


If all processes always request resources in a fixed order, there can be no hold-and-wait cycle.

This is usually not a burdensome restriction. It still allows us to have critical sections that involve more than one resource, and use simple lock and unlock operations on individual resources to implement them.
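
Applied to the earlier pthread sketch, ordered allocation just means that every thread locks r1 before r2. The following is a minimal sketch of the discipline (the function and mutex names, and the particular numbering, are assumptions of the illustration):

#include <pthread.h>

/* Ordered allocation discipline: number the mutexes (r1 < r2) and
 * always lock the lower-numbered one first, in every thread. */
void p1_body(pthread_mutex_t *r1, pthread_mutex_t *r2)
{
    pthread_mutex_lock(r1);   /* lower-numbered resource first */
    pthread_mutex_lock(r2);
    /* ... nested critical sections for R1 and R2 ... */
    pthread_mutex_unlock(r2);
    pthread_mutex_unlock(r1);
}

/* p2_body is identical: with a single global locking order, no thread can
 * hold a higher-numbered resource while waiting for a lower-numbered one,
 * so no hold-and-wait cycle can form. */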


Trajectory Diagram for Ordered Allocation


The trajectory diagram shows that there are no "concave" regions formed by the intersections of the forbidden regions, so there are no accessible deadlock states.


Release Before Request

Process P1:
request(R1); ... release (R1); request(R2); ... release(R2);
 R1 critical section  R2 critical section

Process P2:
request(R2); ... release (R2); request(R1); ... release(R1);
 R2 critical section  R1 critical section


If processes always release one resource before requesting another, there can be no hold-and-wait.

This is a tougher restriction. It rules out critical sections that involve more than one resource.


Request All at Once

Process P1:
request(R1, R2); ... release(R1, R2);
 R1 critical section 
 R2 critical section 

Process P2:
request(R1, R2); ... release(R1, R2);
 R1 critical section 
 R2 critical section 


If processes always request all the resources they will need at once, there can be no hold-and-wait.

This allows critical sections that involve more than one resource, but requires the operating system (or resource manager) to provide a way to atomically request several resources at once. Some batch and real-time operating systems support this, but it is not supported, for example, by the POSIX thread mutex API.
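
The POSIX mutex API has no atomic "lock several mutexes" call, but the effect can be approximated with trylock and back-off, as in this sketch (lock_both and unlock_both are hypothetical helpers of our own, not standard functions). The thread never waits while holding anything, so the hold-and-wait condition is denied.

#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: acquire two mutexes "at once" by releasing
 * everything and retrying whenever the second is not immediately free. */
void lock_both(pthread_mutex_t *a, pthread_mutex_t *b)
{
    for (;;) {
        pthread_mutex_lock(a);
        if (pthread_mutex_trylock(b) == 0)
            return;                  /* got both */
        pthread_mutex_unlock(a);     /* back off: hold nothing while waiting */
        sched_yield();               /* let the holder of b make progress */
    }
}

void unlock_both(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_unlock(b);
    pthread_mutex_unlock(a);
}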


Deadlock Avoidance


Deadlock avoidance requires foreknowledge of the worst-case resource requests a process may make.

The two logical points at which a deadlock avoidance decision can be made are the point of process admission and the point of an incremental resource request.


Requirements for Deadlock Avoidance


Maximum resource requirement information may not be available, except for certain kinds of resources and certain restricted types of processes. The availability of this kind of information was typical of traditional batch processing systems, and is true for some real-time operating systems.

The kinds of resources this method may be applied to are also limited. It has been used for allocating tape drives to jobs in a batch-oriented operating system. Another example of an application is to the allocation of signal processors to a main-CPU process in a real-time signal processing system.


Banker's Algorithm

An example of a deadlock avoidance algorithm.

Please try out the FSU applet for the Banker's Algorithm (by Shishir Tiwari).


The banker's algorithm postpones deadlock avoidance decisions until the point of each incremental resource request, and postpones granting any request that, if granted, would take the system into an unsafe state.

The following slides contain an example from Stallings' text, which we walk through in class.

It is helpful to also go through some other examples, to make certain you understand the algorithm. One way to do this is using the applet to which a link is given above.
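
The heart of the algorithm is the safety check. The following sketch is ours (the constants NPROC and NRES are illustrative, matching the four-process, three-resource example from Stallings; claim and alloc follow his notation). It returns 1 iff there is some order in which all processes can run to completion from the given state.

#include <string.h>

#define NPROC 4
#define NRES  3

/* claim[i][j] = maximum demand of process i for resource j
 * alloc[i][j] = units of resource j currently held by process i
 * avail[j]    = units of resource j currently unallocated        */
int is_safe(int claim[NPROC][NRES], int alloc[NPROC][NRES], int avail[NRES])
{
    int work[NRES], done[NPROC] = {0}, progress;
    memcpy(work, avail, sizeof work);

    do {
        progress = 0;
        for (int i = 0; i < NPROC; i++) {
            if (done[i]) continue;
            int can_finish = 1;
            for (int j = 0; j < NRES; j++)         /* remaining need <= work? */
                if (claim[i][j] - alloc[i][j] > work[j]) { can_finish = 0; break; }
            if (can_finish) {                      /* pretend i runs to completion */
                for (int j = 0; j < NRES; j++)
                    work[j] += alloc[i][j];        /* and releases what it holds */
                done[i] = progress = 1;
            }
        }
    } while (progress);

    for (int i = 0; i < NPROC; i++)
        if (!done[i]) return 0;                    /* someone may never finish: unsafe */
    return 1;                                      /* safe */
}

A request is then granted only if it fits within the requesting process's claim and the available vector, and the state obtained by tentatively making the allocation still passes is_safe; otherwise the grant is postponed.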


System with Potential for Deadlock

Process P1:
request(R1); ... request(R2); ... release(R2); ... release(R1);
  R1 critical section
  R2 critical section  

Process P2:
request(R2); ... request(R1); ... release(R1); ... release(R2);
  R1 critical section  
  R2 critical section


We will review the two-process two-resource deadlock example given in pseudocode above, to see how deadlock could be avoided.


Trajectory to Unsafe State


This trajectory diagram is identical to the one given for the same example further above, except that one region of the diagram has been colored gold. That region contains the unsafe states of the system. Notice that the unsafe states are in a concave pocket, formed by the intersection of the two forbidden regions (the dark pink and dark grey regions). Once the system enters this pocket, it cannot get out without either backing out (violating nonpreemption) or entering a forbidden zone (violating mutual exclusion).

Notice that there is only one deadlocked state, in the upper right corner of the unsafe region. The other states in this region are unsafe, but not deadlocked.

The green trajectory in the diagram above corresponds to the following, which is one of many possible sequences of moves:

P2 requests R2
The OS grants the request of P2
P1 requests R1
The OS grants the request of P1
P2 requests R1
The OS cannot grant the request of P2, because R1 is locked by P1.
P1 requests R2
The OS cannot grant the request of P1, because R2 is locked by P2.
No further progress is possible, i.e. we have a deadlock.

The system made a fatal error when it granted P1's request for R1 while P2 was holding R2. This took the system into an unsafe state, from which the processes P1 and P2 were finally able to drive the system into deadlock.

Questions:

What is the difference between a safe state and a deadlock-free state?


Resource Allocation State Transition Diagram


Another way of looking at safety is in terms of a state transition diagram. The figure shows the states and transitions of the example system considered further above. The red state is the deadlock state. The shaded yellow states are unsafe states. The black and green states are safe states. The black transitions and states are under control of the processes, and the green transitions and states are under control of the operating system (or resource manager).

The property that makes the unsafe region unsafe is that there are no transitions out of the unsafe region that are controlled by the operating system.

In this diagram, the system will eventually reach deadlock, once it enters an unsafe state, since all the transitions lead toward the deadlock state.

In this regard the example gives the false impression that there can be no transitions at all out of an unsafe region. That is because the example has no conditional branching in the code. In a more general example there might be transitions out of the unsafe region that are under the control of the processes (but none under the control of the operating system).


How Deadlock Is Avoided


Deadlock avoidance algorithms keep track of the system state, and avoid making transitions that lead from safe states to unsafe states.

The green path on the diagram above shows how the system can steer itself around unsafe states, for the same sequence of requests from P1 and P2 as in the deadlock example considered previously. The difference is that the system does not grant every request immediately, just because the resource is available. Instead, the system blocks P1 at its request for R1 until after P2 releases R2.

P2 requests R2
The OS grants the request of P2
P1 requests R1
The OS postpones granting the request of P1, because it would lead to an unsafe state.
P2 requests R1
The OS grants the request of P2
P2 releases R1
The OS grants the request of P1 for R1
P1 requests R2
The OS cannot grant the request of P1, because R2 is still locked by P2.
P2 releases R2
The OS grants the request of P1 for R2
P1 releases R2
P1 releases R1


How do we know a state is safe?


Representing State as Set of Tables


The banker's algorithm represents the state of the system by a set of tables, which are shown above. This example comes from Stallings' text. In class, we walk through the example, step by step, explaining each step. Stallings gives the same explanation in his text, so it is not repeated here.


Determination of a Safe State: initial state


Determination of a Safe State: P2 runs to completion


Determination of a Safe State: P1 runs to completion


Determination of a Safe State: P3 runs to completion


An Unsafe State: initial state


An Unsafe State: P1 requests one unit each of R1 and R3


How to Detect Deadlock?


It would be a good idea here to jump ahead momentarily to look at the example of graph reduction much further below. Some people may find it easier to grasp quickly than the example with tables.


Deadlock Detection with Tables

     R1  R2  R3  R4  R5
P1    0   1   0   0   1
P2    0   0   1   0   1
P3    0   0   0   0   1
P4    1   0   1   0   1
Request Matrix Q

     R1  R2  R3  R4  R5
P1    1   0   1   1   0
P2    1   1   0   0   0
P3    0   0   0   1   0
P4    0   0   0   0   0
Allocation Matrix A

     R1  R2  R3  R4  R5
      2   1   1   2   1
Resource Vector

     R1  R2  R3  R4  R5
      0   0   0   0   1
Available Vector

Mark P4.
Select P3 because it is unblocked.


Deadlock Detection with Tables (Step 2)

     R1  R2  R3  R4  R5
P1    0   1   0   0   1
P2    0   0   1   0   1
P3    0   0   0   0   1
P4    1   0   1   0   1
Request Matrix Q

     R1  R2  R3  R4  R5
P1    1   0   1   1   0
P2    1   1   0   0   0
P3    0   0   0   0   0
P4    0   0   0   0   0
Allocation Matrix A

     R1  R2  R3  R4  R5
      2   1   1   2   1
Resource Vector

     R1  R2  R3  R4  R5
      0   0   0   1   1
Available Vector

Complete and mark P3.
No other process is unblocked, so we have detected a deadlock: P1 and P2 are deadlocked.
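
The detection algorithm behind these tables can be sketched as follows (the variable names are ours; the data initializes it to the example above). Running it marks P4 and then P3, and reports P1 and P2 as deadlocked.

#include <stdio.h>

#define NPROC 4
#define NRES  5

int Q[NPROC][NRES] = { {0,1,0,0,1}, {0,0,1,0,1}, {0,0,0,0,1}, {1,0,1,0,1} };
int A[NPROC][NRES] = { {1,0,1,1,0}, {1,1,0,0,0}, {0,0,0,1,0}, {0,0,0,0,0} };
int avail[NRES]    = { 0,0,0,0,1 };

int main(void)
{
    int marked[NPROC] = {0}, progress;

    /* Mark every process that holds nothing (it cannot be deadlocked). */
    for (int i = 0; i < NPROC; i++) {
        int holds_nothing = 1;
        for (int j = 0; j < NRES; j++) if (A[i][j]) holds_nothing = 0;
        if (holds_nothing) marked[i] = 1;
    }

    /* Repeatedly mark any unmarked process whose request can be satisfied
     * from the available vector, returning its allocation to that vector. */
    do {
        progress = 0;
        for (int i = 0; i < NPROC; i++) {
            if (marked[i]) continue;
            int ok = 1;
            for (int j = 0; j < NRES; j++) if (Q[i][j] > avail[j]) ok = 0;
            if (ok) {
                for (int j = 0; j < NRES; j++) avail[j] += A[i][j];
                marked[i] = progress = 1;
            }
        }
    } while (progress);

    for (int i = 0; i < NPROC; i++)
        if (!marked[i]) printf("P%d is deadlocked\n", i + 1);
    return 0;
}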


What to do when Deadlock is Detected?


How to Choose which Process to Abort?


Graph Models of System Resource Allocation States


Wait-For Graphs (WFG)

Nodes correspond to processes (only).


There is an edge from process P1 to process P2 iff P1 is blocked waiting for P2 to release some resource.

This type of graph models the wait-for relationships among processes, at a high level of abstraction. A cycle in this graph is a necessary condition for deadlock. Whether existence of a cycle is sufficient for deadlock depends on the kinds of wait-for relationships among processes we allow.

The following forms of graphs provide more detail, showing enough about the specifics of the wait-for relation that we can talk about deadlock detection.


Single-Unit Resource Allocation Graphs


In this model there are two types of nodes, corresponding to processes and resources, respectively. Each resource is a single-unit reusable resource.

There is a request edge from process P to resource R iff P is blocked waiting for an allocation of R.

There is an assignment edge from resource R to process P iff P is holding an allocation of R.

In this model, the existence of a cycle is a necessary and sufficient condition for deadlock.
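
Because a cycle means deadlock in this model, detection reduces to a depth-first search for a cycle over the combined set of process and resource nodes. The sketch below is ours (the adjacency-matrix representation and node numbering are assumptions of the illustration); it encodes the earlier two-process example and reports the deadlock.

#include <stdio.h>

#define NNODES 4                    /* 0=P1, 1=P2, 2=R1, 3=R2 */

/* adj[u][v] = 1 iff there is an edge u -> v: a request edge (process ->
 * resource) or an assignment edge (resource -> process).               */
int adj[NNODES][NNODES];
int color[NNODES];                  /* 0 = unvisited, 1 = on stack, 2 = done */

int has_cycle(int u)
{
    color[u] = 1;
    for (int v = 0; v < NNODES; v++)
        if (adj[u][v]) {
            if (color[v] == 1) return 1;          /* back edge: cycle found */
            if (color[v] == 0 && has_cycle(v)) return 1;
        }
    color[u] = 2;
    return 0;
}

int main(void)
{
    adj[2][0] = 1;   /* R1 held by P1 */
    adj[0][3] = 1;   /* P1 waits for R2 */
    adj[3][1] = 1;   /* R2 held by P2 */
    adj[1][2] = 1;   /* P2 waits for R1 */

    int deadlock = 0;
    for (int u = 0; u < NNODES; u++)
        if (color[u] == 0 && has_cycle(u)) deadlock = 1;
    printf(deadlock ? "deadlock\n" : "no deadlock\n");
    return 0;
}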

How do we model multi-unit resources?


Multiunit Resource Allocation Graphs


In this form of graph the resource nodes are complex. There is an outer-level node, which represents a collection of resources, and one or more nodes within it, which represent units of that resource. A request is represented by an edge from a process to a resource collection. An allocation is represented by an edge from a unit of a resource to a process.

The graph shown in the figure is a multi-unit reusable resource graph. There are three processes (P1, P2, and P3) and three collections of resources (R1, R2, and R3). Resource collections R2 and R3 each have only a single unit. Resource collection R1 has two units. Process P1 is holding one unit of resource R1 and one unit of resource R3 and is waiting on a request for the second unit of resource R1. Process P2 is holding one unit of resource R1 and is waiting on a request for R2. Process P3 is holding one unit of resource R2 and is waiting on a request for a unit of R1. This is a deadlock, since every process is blocked on a resource request, and none of the requests can be satisfied.

The example can be modified slightly and used to show that the existence of a cycle in a multiunit resource graph is not a sufficient condition for deadlock. Just remove the edge that goes from P1 to R1, i.e., suppose process P1 has not (yet) made its request for a second unit of R1. There would still be a cycle in the graph, but since P1 would not be blocked, there would be no deadlock.


Advanced Topics

We do not expect to go beyond this point in class, and will not cover anything beyond this point on the examination.


General Resource System

A system state consists of:


The general resource graph adds the capability of modeling consumable resources, such as messages. This allows us to model the kinds of deadlock that arise in message-passing systems.


General Resource Graph


Bipartite directed graph

Questions:

What is a bipartite graph?

Can you give examples of graphs, operations on graphs, deadlock, and safe/unsafe states, in terms of graphs?


Request Models


Questions:

What is a sufficient condition for deadlock in each of these models?


Deadlock Conditions in WFG

Cycle is always a necessary condition for deadlock, but it is not sufficient in all models.


Insufficiency of Cycle with Multi-unit Resources

Pj is a producer of Ri

The notation #(Ri,Pj) denotes the number of units of resource Ri that are currently held by process Pj, and #(Ri, *) denotes the total number of units of resource Ri that are currently allocated to all the processes in the system.

Questions:

What is the relationship between a system state and a GRG?

Can you describe an algorithm for detecting a deadlock on a GRG?


Operations on General Resource Graph


Graph Reduction

Models a most favorable possible (optimistic) continuation of execution from the current system state.

while there is an unblocked process Pi:

  1. For each request edge (Pi, Rj), remove the request edge and decrement the corresponding count rj. This corresponds to satisfying the request.
  2. For each allocation edge (Rj, Pi), remove the allocation edge and increment the corresponding count rj. This corresponds to releasing the allocation.
  3. For each consumable resource production edge (Pi, Rj), remove the producer edge from the graph, and set rj to infinity. This corresponds to assuming Pi produces as many units of Rj as may be required by other processes.

There is a deadlock iff the remaining graph is non-null, i.e., the graph is not completely reducible.


Example: the state in question

P1 is not blocked.


Example: after completion of P1

P3 is not blocked.


Example: after completion of P3

P2 is not blocked.


Example: after completion of P2

All processes have completed.


For systems with consumable resources, the final outcome depends on the sequence in which the processes have been reduced. Hence, we must try all possible sequences of reductions. That could be prohibitively expensive. However, for systems with only reusable resources one can prove that if a graph can be reduced completely, the order in which the reductions are performed does not matter.


Practicality of GRG Reduction

Note: We are just talking about graphs that have consumable resources here. Graph reduction is practical for reusable resource graphs.


Expedient States

A state is expedient iff all processes having outstanding requests are blocked, i.e. all grantable requests have been granted.

Note: A system must eventually get to one of these states if it is deadlocked.

Theorem: In an expedient GRG, the existence of a cycle is a necessary condition for deadlock, and the existence of a knot is a sufficient condition for deadlock.

This still leaves a gap. That is, the existence of a cycle is necessary but not sufficient, and the existence of a knot is sufficient but not necessary.


© 2002 T. P. Baker & Florida State University. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means without written permission. (Last updated by $Author: cop4610 $ on $Date: 2002/09/26 20:06:06 $.)