Solving the Distributed Snapshot Problem

Let's explore a seminal algorithm used for capturing distributed snapshots.

We'll cover the following

Chandy-Lamport algorithm

Idea
Working

Chandy-Lamport algorithm

The Chandy-Lamport algorithm solves the consistent snapshot problem in a distributed system.

Idea

The algorithm is based on the following main idea: a special marker message is sent between nodes over the existing communication channels, and it instructs the receiving node to record a snapshot of its current state.

Working

The algorithm works as follows:

The node that initiates the protocol records its own state and then sends a marker message on each of its outbound channels.

Importantly, the marker is sent after the node records its state and before it sends any further messages on those channels.

When a node receives a marker message on a channel $c$, its behaviour depends on whether the node has already recorded its state (which it does when it first sends out markers) or not.
- If the node has not recorded its state, it records its state and then records the state of channel $c$ as an empty sequence. It then sends the marker on all of its outbound channels.
- If the node has already recorded its state, it records the state of channel $c$ as the sequence of messages received on $c$ after the node’s state was recorded and before the marker arrived on $c$.
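
The rules above can be sketched as a small single-process simulation. This is only an illustrative sketch, not a reference implementation: it assumes reliable FIFO channels (modelled as `deque` objects), and the names `Node`, `MARKER`, `connect`, and `demo`, as well as the account-balance state, are hypothetical choices made for the example.

```python
from collections import deque

MARKER = "MARKER"  # hypothetical sentinel standing in for the marker message


class Node:
    def __init__(self, name, balance):
        self.name = name
        self.balance = balance        # application state (an account balance here)
        self.out = {}                 # peer name -> outbound FIFO channel
        self.inbox = {}               # peer name -> inbound FIFO channel
        self.recorded_state = None    # local snapshot; None until recorded
        self.channel_state = {}       # peer name -> messages recorded for that inbound channel
        self.recording = set()        # inbound channels whose marker has not arrived yet

    def send(self, peer, amount):
        self.balance -= amount
        self.out[peer].append(amount)

    def record_and_send_markers(self):
        # Record local state first, then emit a marker on every outbound
        # channel before any further application message is sent.
        self.recorded_state = self.balance
        self.recording = set(self.inbox)
        self.channel_state = {peer: [] for peer in self.inbox}
        for channel in self.out.values():
            channel.append(MARKER)

    def deliver_next(self, sender):
        message = self.inbox[sender].popleft()
        if message == MARKER:
            if self.recorded_state is None:
                # First marker seen: record state; the channel from sender stays empty.
                self.record_and_send_markers()
            # Either way, the channel from sender is now fully recorded.
            self.recording.discard(sender)
        else:
            if sender in self.recording:
                # Arrived after our state was recorded but before the marker.
                self.channel_state[sender].append(message)
            self.balance += message


def connect(a, b):
    # One FIFO channel per direction, shared between sender and receiver.
    a.out[b.name] = b.inbox[a.name] = deque()
    b.out[a.name] = a.inbox[b.name] = deque()


def demo():
    a, b = Node("A", 100), Node("B", 50)
    connect(a, b)
    a.send("B", 10)               # 10 is in flight when the snapshot starts
    b.record_and_send_markers()   # B initiates the snapshot
    b.deliver_next("A")           # B receives 10 while recording channel A -> B
    a.deliver_next("B")           # A receives the marker, records, sends its own marker
    a.send("B", 5)                # sent after A's marker, so outside the snapshot
    b.deliver_next("A")           # B receives A's marker and stops recording
    b.deliver_next("A")           # B receives 5; it is not recorded
    return {
        n.name: {"state": n.recorded_state, "channels": n.channel_state}
        for n in (a, b)
    }


snapshot = demo()
print(snapshot)
```

In this run, the recorded node states (90 and 50) plus the single message recorded in flight on the channel from A to B (10) add up to the 150 units present in the system, while the transfer of 5 that A sends after its marker is left out of the snapshot.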
