Version Vectors

Let's find the similarities and differences between version vectors and vector clocks. We will also look into the details of version vectors and their limitations.

Comparing version vectors with vector clocks

Version vectors (D. S. Parker et al., “Detection of mutual inconsistency in distributed systems,” IEEE Transactions on Software Engineering, Volume 9, Issue 3, pages 240-247, 1983) are a mechanism very similar to vector clocks: the data structure and the associated update rules are almost identical. However, version vectors are used for slightly different purposes.

As explained previously, vector clocks are used to maintain a logical form of time, which can then be used to identify when events are happening, especially in comparison to other events.

On the other hand, version vectors are better suited for applications that store data, where every data item is tagged with a version vector. In this way, data can potentially be updated in multiple parts of the system concurrently (e.g., when there is a network partition). So, the version vectors from the resulting data items can help us identify those items that can be reconciled automatically and those that require conflict resolution.

Version vectors maintain a state identical to that in a vector clock, and contain one integer entry per node in the system.

Rules for the protocol

The update rules for version vectors are slightly different: nodes can either experience local update events (e.g., a write applied at a server) or synchronize with another node (e.g., when recovering from a network partition).

  • Initially, all vectors have all their elements set to zero.
  • Each time a node experiences a local update event, it increments its own counter in the vector by one.
  • Each time two nodes a and b synchronize, they both set each element of their vectors to the maximum of the corresponding elements across both vectors: V_a[x] = V_b[x] = max(V_a[x], V_b[x]). After synchronization, both nodes have the same vector.

Furthermore, depending on whether the initial vectors were causally related or not, either one of the associated items will supersede the other, or some conflict resolution logic will be executed to keep a single item associated with the new vector.
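To make these rules concrete, here is a minimal Python sketch of them. Each version vector is represented as a dictionary from node identifiers to counters; this representation and the function names are chosen here for illustration, the protocol itself does not prescribe them.

```python
# Version vector update rules, sketched with plain dictionaries.
# A missing key is treated as a zero counter.

def local_update(vector: dict[str, int], node: str) -> dict[str, int]:
    """A node increments its own entry on every local update event."""
    updated = dict(vector)
    updated[node] = updated.get(node, 0) + 1
    return updated

def synchronize(va: dict[str, int], vb: dict[str, int]) -> dict[str, int]:
    """On synchronization, both nodes keep the element-wise maximum."""
    return {node: max(va.get(node, 0), vb.get(node, 0))
            for node in va.keys() | vb.keys()}

def descends(va: dict[str, int], vb: dict[str, int]) -> bool:
    """True if va is causally equal to or newer than vb (va supersedes vb)."""
    return all(va.get(node, 0) >= counter for node, counter in vb.items())

def concurrent(va: dict[str, int], vb: dict[str, int]) -> bool:
    """Neither vector descends from the other: conflict resolution is needed."""
    return not descends(va, vb) and not descends(vb, va)

# Example: nodes a and b update independently, then synchronize.
a = local_update({}, "a")                        # {'a': 1}
b = local_update({}, "b")                        # {'b': 1}
print(concurrent(a, b))                          # True: concurrent updates
merged = synchronize(a, b)                       # {'a': 1, 'b': 1}
print(descends(merged, a), descends(merged, b))  # True True
```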

Working of version vectors

Version vectors are mostly beneficial in systems that act as datastores, so the nodes in the system belong to two basic categories: the server or replica nodes, which store the data and receive read/write operations, and the client nodes, which read data from the replica nodes and send update instructions.

In many cases, clients might not even be part of our systems, such as in scenarios where our system receives operations directly from customers’ web browsers. As a result, it is better to avoid a significant amount of logic and storage overheads in the client nodes.

The version vector mechanism allows this in the following way:

  • One entry is maintained for every node (both replica/server and client nodes)

However, the client nodes can be stateless, which means they do not store the version vectors. Instead, they receive a version vector as part of every read operation, and they provide this version vector when executing an update operation back to the corresponding replica node.

According to Preguiça et al. (N. M. Preguiça, C. Baquero, P. S. Almeida, V. Fonte, and R. Gonçalves, “Dotted Version Vectors: Logical Clocks for Optimistic Replication,” arXiv:1011.5808, 2010), this is only possible in an environment where client nodes are not supposed to experience any local events but only interact with server nodes via read/write operations. It also requires read-your-writes semantics (e.g., obtained via read/write quorums) so that each read returns the most recent update to a client.

This vector is referred to as “context”, and the replica node uses it to update its version vector accordingly.
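As a rough sketch of this interaction, the client below keeps no version vector of its own: it only carries the context returned by a read into the write that follows. The FakeReplica stub, its get/put methods, and the single-value handling are illustrative assumptions, not a specific datastore's API.

```python
# Illustrative sketch of a stateless client interacting with a replica node.
# The replica stub is simplified: it keeps a single value and does not
# handle concurrent siblings.

class FakeReplica:
    def __init__(self):
        self.vector: dict[str, int] = {}  # one entry per client node
        self.value = None

    def get(self):
        # A read returns the current value together with the version vector,
        # which the client treats as an opaque "context".
        return self.value, dict(self.vector)

    def put(self, context: dict[str, int], client_id: str, value) -> None:
        # The replica, not the client, advances the version vector.
        new_vector = dict(context)
        new_vector[client_id] = new_vector.get(client_id, 0) + 1
        self.vector, self.value = new_vector, value

def update_item(replica: FakeReplica, client_id: str, new_value) -> None:
    # The stateless client: read to obtain the context, then pass it back
    # unchanged along with the write.
    _current, context = replica.get()
    replica.put(context, client_id, new_value)

replica = FakeReplica()
update_item(replica, "C", "V")
update_item(replica, "D", "W")
print(replica.get())  # ('W', {'C': 1, 'D': 1})
```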

Version vectors with per-client entries

The following illustration contains an example execution in a distributed system using version vectors with one entry for each node.

For simplicity, the example assumes there is only one item, so all clients operate on the same item. It can easily be extended to cover cases with multiple items, where each item has a separate version vector and the clients have to provide an identifier for the item to be accessed.

Each read operation returns the current value with the corresponding version vector.

Each write operation has three arguments:

  • The version vector that is given as context
  • The identifier of the client node
  • The value to be written

The replica node is responsible for incrementing the appropriate entry in the vector (depending on the identifier of the client) and then persisting the new value.

Note that this new value can either be persisted alongside other values that were written concurrently, or overwrite values that causally precede it.

In the first case, multiple values will be returned in subsequent reads and the following write operations will reconcile them and persist a single value.

In the above example, if node D performed a read from node B (which will return both values V and W and their version vectors) and then attempted to write the value Z, then the update will be of the form PUT({(C,1), (D,1)}, D, Z). The replica node will calculate the new version vector {(C,1), (D,2)} and will identify that it supersedes both existing version vectors {(C,1)} and {(D,1)}. So, it will replace both values V and W with Z.
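A minimal sketch of how the replica node might handle this write, using the same dictionary representation as before; the stored list of (vector, value) pairs and the function names are illustrative assumptions.

```python
# Replica-side write handling with per-client entries, reproducing the
# example above. Vectors are dicts from node identifiers to counters.

def descends(va: dict[str, int], vb: dict[str, int]) -> bool:
    """True if va is causally equal to or newer than vb (va supersedes vb)."""
    return all(va.get(node, 0) >= counter for node, counter in vb.items())

def put(stored, context: dict[str, int], client_id: str, value):
    """stored: list of (version_vector, value) pairs kept for the item."""
    # Increment the entry of the client that issued the write.
    new_vector = dict(context)
    new_vector[client_id] = new_vector.get(client_id, 0) + 1

    # Values superseded by the new vector are overwritten; values written
    # concurrently would be kept alongside the new one.
    survivors = [(vec, val) for vec, val in stored
                 if not descends(new_vector, vec)]
    survivors.append((new_vector, value))
    return survivors

# Node B currently stores V and W, which were written concurrently.
stored = [({"C": 1}, "V"), ({"D": 1}, "W")]

# Node D read both values, so its context is the merge of their vectors,
# {(C,1), (D,1)}, and it now writes Z.
stored = put(stored, {"C": 1, "D": 1}, "D", "Z")
print(stored)  # [({'C': 1, 'D': 2}, 'Z')] -- Z replaces both V and W
```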

The approach of including entries in the version vector for all client nodes is safe. It can successfully identify when two different versions have been written concurrently, or when one of them causally precedes the other and can be discarded.

Limitation of version vectors with per-client entries

The main limitation of version vectors with per-client entries is that their size does not scale well.

In distributed systems that are used as distributed datastores, the number of clients tends to be larger than the number of server nodes by two or three orders of magnitude. For instance, in many cases, each item is replicated in three different servers, while thousands of clients access this item.

Note that even in cases where the clients of the system are a few application servers, a separate entry in the version vector needs to be maintained for every thread of these servers that executes operations concurrently.

As a result, this approach requires a significant amount of storage.

Coping with the limitation

Ideally, we want the size of the version vectors to scale with the number of server nodes instead of the number of clients.

Version vectors with per-server entries

Could we remove the client entries from the version vectors and let the servers increment their own entries when performing updates on behalf of the clients?

Unfortunately, no. If we did this, the system would not be able to detect that some operations were performed concurrently, and would discard values that should have been preserved.

Issues while using version vectors with per-server entries

The following illustration shows the issues with this approach.

As we can see, the first write operations performed by client nodes C and D are concurrent. However, server node B would not be able to identify that. Node B would consider the version vector of the second update, {(B,2)}, to supersede {(B,1)}, discarding the value V and replacing it with the value W.
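The same failure can be sketched in code: with only per-server entries, server B tags both writes with its own counter, so the second write appears to supersede the first even though the two were concurrent. The class and names below are illustrative.

```python
# Why per-server entries lose concurrency information: server B increments
# only its own entry, regardless of which client issued the write.

def descends(va: dict[str, int], vb: dict[str, int]) -> bool:
    return all(va.get(node, 0) >= counter for node, counter in vb.items())

class ReplicaPerServer:
    def __init__(self, name: str):
        self.name = name
        self.vector: dict[str, int] = {}
        self.stored: list[tuple[dict[str, int], str]] = []

    def put(self, context: dict[str, int], value: str) -> None:
        # Merge the client's context, then increment the server's own entry;
        # there is no per-client entry any more.
        merged = {k: max(self.vector.get(k, 0), context.get(k, 0))
                  for k in self.vector.keys() | context.keys()}
        merged[self.name] = merged.get(self.name, 0) + 1
        self.vector = merged
        self.stored = [(vec, val) for vec, val in self.stored
                       if not descends(merged, vec)]
        self.stored.append((merged, value))

b = ReplicaPerServer("B")
# Clients C and D both read the item while it was empty (empty context)
# and write concurrently, without seeing each other's update.
b.put({}, "V")    # tagged {(B,1)}
b.put({}, "W")    # tagged {(B,2)}, which appears to supersede {(B,1)}
print(b.stored)   # [({'B': 2}, 'W')] -- V was silently discarded
```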

There is a technique that makes it possible to successfully identify concurrent versions and also allows the version vectors to scale with the number of servers. This is called dotted version vectors, which we will learn about in the next lesson.
