Vector Clocks

Let's inspect what vector clocks are and how they work and satisfy strong clock condition. Let's also look at the main limitation of Lamport clocks.

Limitation of Lamport clocks

The main limitation of Lamport clocks is that they do not satisfy the strong clock condition. This means they cannot be used to infer causal relationships between events.

The underlying reason for this is that both the local and the global logical clocks for each node are flattened into a single number, which does not provide all the necessary information to track causal relationships.

So we need to maintain a set of all events that causally precede each event. This is known as a causal historyS. Reinhard and M. Friedemann, “Detecting Causal Relationships in Distributed Computations: In Search of the Holy Grail,” Distributed Computing, Volume 7 Issue 3, March 1994, 1994.. For instance, the following illustration shows the causal history of E7E_7 is {E1,E2,E3,E4,E5}\lbrace E_1, E_2, E_3, E_4, E_5\rbrace. We also need to store the causal history of each event as efficiently as possible by using a compact data structure. According to Colin J.F. Colin J., “Timestamps in Message-Passing Systems That Preserve the Partial Ordering,” Proceedings of the 11th Australian Computer Science Conference (ACSC’88), pp. 56–66, 1988. and FriedemannM. Friedemann, “Virtual Time and Global States of Distributed Systems,” Parallel and Distributed Algorithms, 1988., a vector clock is an example of such a data structure.

Defining vector clock

A vector clock is another type of logical clock, where the clock data structure for each node is a vector of NN counters [c0,c1,..,cN][c^0, c^1, .., c^N], where NN is the number of nodes in the system. For the clock of the ithi^{th} node [ci0,ci1,...,ciN][{c_{i}}^0, {c_{i}}^1, ..., {c_{i}}^N]:

  • The ithi^{th} element of the clock cii{c_{i}}^i represents the local logical clock of the node
  • The remaining elements of the clock [ci0,...,cii1,cii+1,...,ciN][{c_{i}}^0, ..., {c_{i}}^{i-1}, {c_{i}}^{i+1},..., {c_{i}}^N] together form the global logical clock of the node

Rules of the protocol

The rules of the protocol are the following:

  • (R1) Before executing an event (send, receive, or local), a node increments the counter of its logical clock by one: Cii=Cii+1{C_{i}}^i = {C_{i}}^i + 1.
  • (R2) Every sent message piggybacks the clock value of its sender at sending time. When the ithi^{th} node receives a message with a vector [cj0,...,cjN][{c_{j}}^0, ..., {c_{j}}^N] from the jthj^{th} node, it executes the following actions:
    • Executes R1
    • Updates each element in its vector by taking the maximum of the value in its own vector clock and the value in the vector in the received message: Cik=max(cik,cjk){C_{i}}^k = max({c_{i}}^k, {c_{j}}^k) for every kk in [0,N][0, N]
    • Delivers the message

Strong clock condition

Vector clock satisfies the strong clock condition. This means that if the relationship C​i<C​j​​ holds for two events EiE_i, EjE_j with timestamps CiC_i, then EiEjE_i\to E_j.’

The only missing thing is how comparing vector clocks with each other, which is done in the following way:

  • For two clocks Ci=[ci0,...,ciN]C_i = [{c_{i}}^0, ..., {c_{i}}^N] and Cj=[cj0,...,cjN]C_j = [{c_{j}}^0, ..., {c_{j}}^N] Ci<CjC_i < C_j iff all the elements of the clock CiC_i are less than or equal to all the corresponding elements of clock CjC_j (cikcjk({c_{i}}^k \leq {c_{j}}^k for all k)k) and there is at least one element of CiC_i that is strictly smaller than the corresponding element of CjC_j (cil<cjl({c_{i}}^l < {c_{j}}^l for at least one ll in [0,N])[0, N]).

Working of vector clocks

The following illustration contains the same distributed execution displayed in the Lamport clocks lesson’s illustration. This time, we are using vector clocks.

Each node in the system maintains a vector clock, where the first element corresponds to the time in node AA, the second to the time in node BB, and the third and last element to the time in node CC.

We can spend some time again verifying that the clock values have been assigned just by following the rules described above.

This time, we can see that the clock of A1A_1 is smaller than the clock of B1B_1, while also A1B1A_1 \to B_1.

What’s more important, though, is that we can detect events that are not causally related. For instance, B2C2B_2 \parallel C_2 and we can see that the timestamp of B2B_2 is neither smaller nor larger than the timestamp of C2C_2. This means that we can consider these events concurrent, and they could have happened in any order.

Usage of vector clocks

Vector clocks can be used in cases that benefit from the capability to detect whether two events are causally related or concurrent. They also allow the different nodes of the system to make progress independently and efficiently without synchronization and coordination bottlenecks.

Points to be noted

We mentioned previously that in a distributed system of nn nodes, each vector clock is composed of nn elements. It’s important to clarify that every process that consists of a source of concurrency needs to be considered as a node of the system in the context of the vector clocks. This means an entry for this process should be dedicated in each clock.

For example, if our application consists of three servers and two clients, each vector clock should contain five entries. Otherwise, we risk not being able to identify concurrent and conflicting operations, and treating them as causally related instead.

See why vector clocks are hard.

It has been formally proven by Charron-Bost et al.B. Charron-Bost, “Concerning the Size of Logical Clocks in Distributed Systems,” Information Processing Letters, Volume 39 Issue 1, July 12, 1991, 1991. that “the size of a vector clock must be at least nn for a system consisting of nn nodes to fully capture causality”. This means that vector clocks require a significant amount of storage in cases where the number of all the participating nodes is large. This can be the case for some types of systems nowadays, such as web applications where every browser is considered a client of the system.

Get hands-on with 1400+ tech skills courses.