Spanner's Architecture
Let's look into the architecture of Spanner in detail.
Components of Spanner
A Spanner deployment is called a universe.
Universe
A universe consists of a set of zones, which are the units of administrative deployment, physical isolation, and replication (e.g., datacenters).
Each zone has a zone manager and hundreds to several thousand spanservers.
Zone manager
The zone manager assigns data to spanservers.
Spanservers
Spanservers serve read/write requests from clients and store data.
Location proxies
The per-zone location proxies are used by clients to locate the spanservers that serve a specific portion of data.
Universe manager
The universe manager displays status information about all the zones for troubleshooting.
Placement driver
The placement driver handles the automated movement of data across zones, e.g., for load balancing reasons.
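To make the relationships between these components concrete, here is a rough Go sketch of the deployment hierarchy. All type and field names are illustrative only; they are assumptions for this example and do not correspond to any real Spanner API.

```go
package spanner

// Illustrative types only; real Spanner does not expose this structure.

// A Universe is a full Spanner deployment.
type Universe struct {
	UniverseManager *UniverseManager // displays status information about all zones
	PlacementDriver *PlacementDriver // moves data between zones automatically
	Zones           []*Zone
}

// A Zone is the unit of administrative deployment, physical
// isolation, and replication (e.g., a datacenter).
type Zone struct {
	Manager       *ZoneManager   // assigns data to spanservers
	LocationProxy *LocationProxy // used by clients to find the right spanserver
	Spanservers   []*Spanserver  // hundreds to several thousand per zone
}

// A Spanserver serves client reads/writes and stores data as splits.
type Spanserver struct {
	Splits []*Split
}

// Placeholder component types for the sketch.
type (
	UniverseManager struct{}
	PlacementDriver struct{}
	ZoneManager     struct{}
	LocationProxy   struct{}
	Split           struct{}
)
```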
Note: In the original research paper describing Spanner, the authors use “zone master” and “universe master.” We will use the terms “zone manager” and “universe manager” to refer to the same things.
The following illustration shows the architecture of a Spanner deployment:
Explanation
Each spanserver can manage multiple splits, and each split is replicated across multiple zones for availability, durability, and performance.
Note: Each split is stored in a distributed file system, called Colossus, the successor of GFS, which already provides byte-level replication. However, Spanner adds another replication level to provide the additional benefits of data availability and geographic locality.
All the replicas of a split form a Paxos group as shown in the above illustration.
Leader and followers
One of the replicas is elected as the leader and is responsible for receiving incoming write requests and replicating them to the other replicas of the group via a Paxos round.
The rest of the replicas are followers and can serve some kinds of read requests. This is shown in the following illustration:
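Complementing the illustration, the following Go sketch shows the shape of a replica group's write and read paths, assuming a generic quorum-based replication step. It is a simplification for this lesson, not Spanner's actual Paxos implementation.

```go
package spanner

import "errors"

// Replica is one copy of a split, hosted in some zone.
type Replica struct {
	Zone     string
	IsLeader bool
	data     map[string]string
}

func NewReplica(zone string, leader bool) *Replica {
	return &Replica{Zone: zone, IsLeader: leader, data: map[string]string{}}
}

// PaxosGroup holds all replicas of a single split.
type PaxosGroup struct {
	Replicas []*Replica
}

// Write must be sent to the leader, which replicates the mutation to the
// group and commits once a majority acknowledges. The loop below is a
// stand-in for a real Paxos round (proposal numbers, acceptors, log slots).
func (g *PaxosGroup) Write(leader *Replica, key, value string) error {
	if !leader.IsLeader {
		return errors.New("writes must go through the leader")
	}
	acks := 0
	for _, r := range g.Replicas {
		r.data[key] = value
		acks++
	}
	if acks <= len(g.Replicas)/2 {
		return errors.New("no majority of replicas acknowledged the write")
	}
	return nil
}

// Read can be served by follower replicas for some kinds of reads,
// e.g., snapshot reads at a sufficiently old timestamp.
func (r *Replica) Read(key string) (string, bool) {
	v, ok := r.data[key]
	return v, ok
}
```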
Benefits of Spanner
Spanner provides these benefits by using the following mechanisms:
- long-lived leaders with time-based leader leases, which are renewed by default every 10 seconds
- pessimistic concurrency control, specifically two-phase locking, to ensure proper isolation between concurrent transactions.
The leader of each replica group maintains a lock table that maps ranges of keys to lock states for this purpose.
Note that in practice, these locks are also replicated across the replicas of the group to protect against failures of the leader.
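A lock table of this kind could look roughly like the following Go sketch. The range representation and lock modes are simplifications chosen for this example; in particular, this version only matches identical key ranges, whereas a real lock table would detect overlapping ranges.

```go
package spanner

import "sync"

// LockMode distinguishes shared (read) from exclusive (write) locks,
// as in classic two-phase locking.
type LockMode int

const (
	Shared LockMode = iota
	Exclusive
)

// KeyRange is a half-open range of keys [Start, End).
type KeyRange struct {
	Start, End string
}

// LockTable maps key ranges to their current lock state. The leader of
// each replica group would maintain one of these for its splits.
type LockTable struct {
	mu    sync.Mutex
	locks map[KeyRange]lockState
}

type lockState struct {
	mode    LockMode
	holders int // number of transactions currently holding the lock
}

func NewLockTable() *LockTable {
	return &LockTable{locks: map[KeyRange]lockState{}}
}

// Acquire tries to take a lock on a key range. Under two-phase locking,
// a transaction acquires all of its locks before releasing any of them.
func (t *LockTable) Acquire(r KeyRange, mode LockMode) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	st, held := t.locks[r]
	switch {
	case !held:
		t.locks[r] = lockState{mode: mode, holders: 1}
		return true
	case st.mode == Shared && mode == Shared:
		st.holders++
		t.locks[r] = st
		return true
	default:
		return false // conflicting lock; the transaction must wait
	}
}

// Release drops a lock; under two-phase locking this happens only in the
// "shrinking" phase, after the transaction has acquired all of its locks.
func (t *LockTable) Release(r KeyRange) {
	t.mu.Lock()
	defer t.mu.Unlock()
	st, held := t.locks[r]
	if !held {
		return
	}
	st.holders--
	if st.holders <= 0 {
		delete(t.locks, r)
	} else {
		t.locks[r] = st
	}
}
```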
Distributed transaction support
Spanner also provides support for distributed transactions that involve multiple splits, potentially belonging to different replica groups.
This is achieved via a two-phase commit across the involved replica groups. As a result, each group leader also implements a transaction manager to take part in the two-phase commit. The leaders of these groups are called participant leaders, and the follower replicas of each of those groups are referred to as participant followers. Additionally, one of these groups is chosen as the coordinator for the two-phase commit protocol. The replicas of this group are referred to as the coordinator leader and coordinator followers, respectively.
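The following Go sketch shows the shape of this two-phase commit flow. It is a deliberately minimal illustration: the type names and the Prepare/Commit/Abort methods are assumptions made for this example, not Spanner's real interfaces.

```go
package spanner

import "errors"

// Participant is the leader of one replica group involved in a
// distributed transaction; it acts as a transaction manager.
type Participant struct {
	Name string
}

// Prepare stands in for the participant acquiring its locks and logging a
// prepare record through Paxos to its followers. Returning an error
// means the participant votes to abort.
func (p *Participant) Prepare() error {
	// ... acquire write locks, replicate a prepare record ...
	return nil
}

// Commit and Abort stand in for logging the outcome through Paxos and
// releasing locks on the participant's group.
func (p *Participant) Commit() {}
func (p *Participant) Abort()  {}

// Coordinator represents the group chosen to drive the protocol; its
// leader is the coordinator leader, its replicas the coordinator followers.
type Coordinator struct {
	Participants []*Participant
}

// RunTwoPhaseCommit is a bare-bones version of the protocol: phase one
// collects prepare votes, phase two broadcasts the final decision.
func (c *Coordinator) RunTwoPhaseCommit() error {
	// Phase 1: every participant must vote to commit.
	for _, p := range c.Participants {
		if err := p.Prepare(); err != nil {
			// Any abort vote aborts the whole transaction.
			for _, q := range c.Participants {
				q.Abort()
			}
			return errors.New("transaction aborted: " + p.Name + " failed to prepare")
		}
	}
	// Phase 2: the decision is final; all participants commit.
	for _, p := range c.Participants {
		p.Commit()
	}
	return nil
}
```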