# Topics for today

• Review
• The Birman-Schiper-Stephenson protocol for causal ordering of messages
• Chandy-Lamport global state recording algorithm
• Huang's termination algorithm

# Review

• How is the happened-before relation defined?
• What are the rules for maintaining Lamport's logical clock?
• How do we maintain vector clocks? What do they offer?
• What is the causal ordering of messages?

# Global State Problem

How can we record a coherent (consistent) snapshot of the state of an entire distributed system?

One application of this problem is in implementing a breakpoint for debugging a distributed application. That is, suppose we want to suspend execution of all the processes in a way that we can examine what each of them is doing, and later resume them.

# Banking Example

For consistency, we need to take into account the messages that are in transit.

Let n be the number of messages sent by A along a channel before A's state is recorded, and let n' be the number of messages sent by A along that channel before the channel's state is recorded. A consistent global state requires n = n'.

In the global state, we want to view as in-transit all the messages that were sent along a channel before the sender's state was recorded but had not yet been received when the receiver's state was recorded.

A global state includes snapshots of the states of all the channels along with the states of all the sites. Since the channels are passive, the snapshots of the channels must be computed by the sites to which they are connected.
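As a rough illustration of why in-transit messages matter, the sketch below totals the money in a two-account snapshot with and without the channel state. The balances (\$500 and \$200) and the \$50 transfer are assumed values chosen for the example, and `snapshot_total` is a hypothetical helper.

```python
def snapshot_total(a_balance, b_balance, in_transit):
    """Total money visible in a snapshot: both accounts plus recorded channel contents."""
    return a_balance + b_balance + sum(in_transit)

# Assumed scenario: A starts with 500, B with 200, and A sends 50 to B.
# A's state is recorded after the send (450); B's is recorded before the receive (200).
a_recorded, b_recorded = 450, 200

without_channel = snapshot_total(a_recorded, b_recorded, [])    # the 50 seems to vanish
with_channel = snapshot_total(a_recorded, b_recorded, [50])     # the 50 is accounted for

print(without_channel, with_channel)
```

Only the snapshot that includes the channel state matches the true total of 700.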

# Global State: Notation

For a site Si, its local state, LSi, at a given time is defined by the local context of the application.

send(mi,j) is the event of Si sending message mi,j to Sj

rec(mi,j) is the event of Sj receiving message mi,j from Si

time(x) is the time at which state x was recorded

time(send(m)) is the time at which message m was sent

send(mi,j) ∈ LSi iff time(send(mi,j)) < time(LSi)

rec(mi,j) ∈ LSj iff time(rec(mi,j)) < time(LSj)

GS = {LS1, LS2, …, LSn}

# Global State: Definitions

transit(LSi, LSj) = {mi,j | send(mi,j) ∈ LSi and rec(mi,j) ∉ LSj}

inconsistent(LSi, LSj) = {mi,j | send(mi,j) ∉ LSi and rec(mi,j) ∈ LSj}

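GS is consistent iff inconsistent(LSi, LSj) = ∅ for every pair of sites Si, Sj.
GS is strongly consistent iff it is consistent and transitless, that is, transit(LSi, LSj) = ∅ for every pair of sites as well.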

In a consistent global state, causes are recorded if the corresponding effects are recorded.

In a strongly consistent global state, causes are recorded iff the corresponding effects are recorded.
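The definitions above can be sketched directly in code. This is a minimal illustration, assuming a global notion of time as in the notation; the `Message` class, its field names, and the sample times are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Message:
    name: str
    send_time: float   # time(send(m)) at the sender Si
    rec_time: float    # time(rec(m)) at the receiver Sj

def transit(msgs, t_i, t_j):
    """Messages whose send is in LSi (recorded at t_i) but whose receive is not in LSj."""
    return {m.name for m in msgs if m.send_time < t_i and not (m.rec_time < t_j)}

def inconsistent(msgs, t_i, t_j):
    """Messages whose receive is in LSj (recorded at t_j) but whose send is not in LSi."""
    return {m.name for m in msgs if not (m.send_time < t_i) and m.rec_time < t_j}

# Assumed messages from Si to Sj: m1 sent at 1 and received at 3, m2 sent at 4 and received at 6.
msgs = [Message("m1", 1, 3), Message("m2", 4, 6)]

early = (transit(msgs, 2, 2), inconsistent(msgs, 2, 2))    # consistent, m1 in transit
skewed = (transit(msgs, 2, 7), inconsistent(msgs, 2, 7))   # m2 received but never sent
late = (transit(msgs, 5, 7), inconsistent(msgs, 5, 7))     # strongly consistent

print(early, skewed, late)
```

For this pair of sites, a recording is consistent when the second set is empty and strongly consistent when both sets are empty.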

# Example

{LS12, LS23, LS33} is consistent.

{LS11, LS22, LS32} is inconsistent.

{LS11, LS21, LS31} is strongly consistent.

# Chandy-Lamport GS Recording Algorithm

Uses a marker message to initiate taking the snapshot, and to separate messages within each channel.

# Chandy-Lamport GS Recording Alg: Sending Rule

• P records its local state
• For each outgoing channel C from P on which a marker has not already been sent, P sends a marker along C before P sends further messages along C

# Chandy-Lamport Receiving Rule

When process Q receives a marker on channel C:
• if Q has not recorded its state then
  • record the state of C as an empty sequence
  • follow the Marker Sending Rule
• else
  • record the state of C as the sequence of messages received along C after Q's state was recorded and before Q received the marker along C
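The two rules can be exercised in a small single-threaded simulation. This is only a sketch under stated assumptions: channels are modeled as FIFO deques, messages are integer transfers between two bank accounts, and `Process`, `connect`, and `deliver` are hypothetical names introduced for the example.

```python
from collections import deque

MARKER = "MARKER"

class Process:
    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self.out = {}            # destination name -> FIFO channel (deque)
        self.incoming = set()    # names of processes with a channel into this one
        self.recorded = None     # recorded local state; None until recorded
        self.recording = set()   # incoming channels whose state is still being recorded
        self.chan_state = {}     # source name -> recorded channel state

    def send(self, dst, amount):
        self.balance -= amount
        self.out[dst].append(amount)

    def record_and_send_markers(self):
        # Marker Sending Rule: record local state, then send a marker on each
        # outgoing channel before any further message on that channel.
        self.recorded = self.balance
        self.recording = set(self.incoming)
        for channel in self.out.values():
            channel.append(MARKER)

    def receive(self, src, msg):
        if msg == MARKER:
            # Marker Receiving Rule
            if self.recorded is None:
                self.record_and_send_markers()
            self.chan_state.setdefault(src, [])   # empty if nothing arrived in between
            self.recording.discard(src)
        else:
            self.balance += msg
            if src in self.recording:             # after recording, before the marker
                self.chan_state.setdefault(src, []).append(msg)

def connect(p, q):
    channel = deque()
    p.out[q.name] = channel
    q.incoming.add(p.name)
    return channel

def deliver(channel, src, dst):
    dst.receive(src.name, channel.popleft())

a, b = Process("A", 500), Process("B", 200)
ab, ba = connect(a, b), connect(b, a)

a.send("B", 50)                # the transfer is now in transit on A -> B
b.record_and_send_markers()    # B initiates the snapshot and records 200
deliver(ab, a, b)              # the 50 arrives after B recorded, before A's marker
deliver(ba, b, a)              # A gets the marker, records 450, sends its own marker
deliver(ab, a, b)              # B gets A's marker; channel A -> B is closed out

total = a.recorded + b.recorded + sum(
    sum(state) for p in (a, b) for state in p.chan_state.values())
print(a.recorded, b.recorded, b.chan_state["A"], total)
```

In this run the in-flight transfer ends up in the recorded state of channel A → B, so the snapshot total equals the real total of 700 even though no single instant of the execution matches the recorded states.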

# Chandy-Lamport Example

Suppose site S0 sends markers to sites S1 and S2. Site S2, which holds account B, receives the marker first and checkpoints the value of B in its local snapshot. The request message "[B+=\$50]" arrives later on channel C1, before the marker on that channel, and so is recorded as part of the state of C1.

How does this algorithm get the data back to the process that requested the snapshot? How does the algorithm terminate?

# Usefulness of Recorded Global State

• limited to detecting stable properties of a system, that is, properties that remain true once they become true
  Why?
• examples of stable properties:
  • termination of a computation

# Termination Detection

• Needed in many distributed algorithms: deadlock detection, deadlock resolution.
• An example of using the consistent global view.

## System model:

• every process is either active or idle
• only active processes send messages
• an active process may become idle at any time
• an idle process can only become active by receiving a computation message
• a computation has terminated iff
  • all processes are idle
  • there are no messages in transit
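The termination condition in this model can be written as a simple predicate. This is a sketch that assumes an observer who can see every process and channel at once, which a real system can only approximate through a consistent snapshot; `has_terminated` and its argument shapes are hypothetical.

```python
def has_terminated(active, in_transit):
    """active: process name -> bool; in_transit: channel -> list of pending messages."""
    all_idle = not any(active.values())
    channels_empty = all(len(msgs) == 0 for msgs in in_transit.values())
    return all_idle and channels_empty

# All processes idle, but a computation message is still in flight: not terminated,
# because the message will reactivate P2 on arrival.
idle_but_pending = has_terminated({"P1": False, "P2": False}, {("P1", "P2"): ["work"]})

# All processes idle and every channel empty: terminated.
truly_done = has_terminated({"P1": False, "P2": False}, {("P1", "P2"): []})

print(idle_but_pending, truly_done)
```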

# Huang's Termination Algorithm

• one process is the controlling agent
• computation involves exchanges of messages
• each process has a weight between 0 and 1
• when a process sends a message, it splits its weight between itself and the message
• B(DW) = computation messages with weight DW
• C(DW) = control message with weight DW
• invariant: the sum of the weights held by all processes and all in-transit messages is 1
• initially all processes are idle, controlling agent has weight 1, and others have weight 0
• sending a message splits weight between sender and receiver
• computation starts when controlling agent sends a message
• computation terminates when controlling agent weight = 1 again

This algorithm views termination as a flow analysis problem.

# Details

• process with weight W sends computation message to P
  • split W = W1 + W2
  • set W to W1 and send B(W2) to P
• process P with weight W receives B(DW)
  • set W to W + DW
  • if P was idle, P becomes active
• when a process becomes idle
  • send C(W) to the controlling agent
  • set W to 0
• when the controlling agent receives C(DW)
  • set W to W + DW
  • if W = 1 the computation has terminated
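The four rules above can be traced with a short simulation. This is a minimal sketch, not a full implementation: message delivery is modeled by passing the weight directly between function calls, the split always halves the sender's weight (the rules allow any split), and `Proc` and the function names are hypothetical.

```python
from fractions import Fraction

class Proc:
    def __init__(self, name, weight=Fraction(0)):
        self.name = name
        self.weight = weight
        self.active = False

def send_computation(sender):
    """Split W = W1 + W2, keep W1, and return DW = W2 carried by B(DW)."""
    dw = sender.weight / 2      # assumed split: half each
    sender.weight -= dw
    return dw

def receive_computation(p, dw):
    """P receives B(DW): add the weight and become active."""
    p.weight += dw
    p.active = True

def become_idle(p):
    """P becomes idle: return the weight carried by C(W) and set W to 0."""
    p.active = False
    dw, p.weight = p.weight, Fraction(0)
    return dw

def receive_control(controller, dw):
    """Controlling agent receives C(DW); returns True once its weight is back to 1."""
    controller.weight += dw
    return controller.weight == 1

controller = Proc("controller", Fraction(1))
p1, p2 = Proc("P1"), Proc("P2")

receive_computation(p1, send_computation(controller))   # controller 1/2, P1 1/2
receive_computation(p2, send_computation(p1))           # P1 1/4, P2 1/4

done_after_p1 = receive_control(controller, become_idle(p1))   # controller at 3/4
done_after_p2 = receive_control(controller, become_idle(p2))   # controller at 1

print(done_after_p1, done_after_p2, controller.weight)
```

`Fraction` is used instead of floating point so that repeated halving stays exact and the comparison with 1 is reliable.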

# Things for you to do

• Review Chapter 5 and understand the Schiper-Eggli-Sandoz protocol. A quiz will be given in the next class.