Achieving Full Serializable Snapshot Isolation

Let's investigate how a serializable snapshot isolation algorithm helps to achieve full serializability preventing all anomalies.

Research in the field of concurrency control mechanism for achieving snapshot isolation level has resulted in an improved algorithm, called serializable snapshot isolation (SSI)M. J. Cahill, U. Rohm, and A. D. Fekete, “Serializable Isolation for Snapshot Databases,” Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, 2008., that can provide full serializability and has been integrated into commercial, widely used databases by Ports et al.D. R. K. Ports and K. Grittner, “Serializable Snapshot Isolation in PostgreSQL,” Proceedings of the VLDB Endowment, Volume 5 Issue 12, August 2012, 2012.

This algorithm is still optimistic and just adds some extensions on top of what we described in the previous lesson.

Serializable snapshot isolation (SSI)

The solution mechanics are based on a key principle of previous research that showed that all the non-serializable executions under snapshot isolation share a common characteristic.

Statement

SSI states this: in the precedence graph of any non-serializable execution, there are two rw-dependency edges that form consecutive edges in a cycle. These involve two transactions that have been active concurrently, as shown in the following illustration.

T0T_0, T1T_1 and TNT_N are three concurrent transactions. TNT_N could also be a different number (e.g., T3T_3). TNT_N implies that this circle/chain could be larger and involve more transactions between T1T_1 and TNT_N.

What’s important here is the path from TNT_N to T1T_1 that contains two consecutive transactions with a rw-dependency.

A rw-dependency is a data dependency between transactions T1T_1 and T0T_0, where T0T_0 reads a version of item x and T1T_1 produces a version of item x that is later in the version order than the version read by T0T_0.

Approach

The SSI approach detects these cases and breaks the cycle when they are about to happen. It prevents them from being formed by aborting one of the involved transactions. To do this, the SSI performs the following steps:

  • It keeps track of the incoming and outgoing rw-dependency edges of each transaction.

  • If there is a transaction that has both incoming and outgoing edges, the algorithm aborts one of the transactions and retries it later.

Note this can lead to aborts that are false positives, since the algorithm does not check whether there is a cycle. This is done intentionally to avoid the computational costs associated with tracking cycles.

So, it is sufficient to maintain two Boolean flags per transaction T.inConflict and T.outConflict, that denote whether there is an incoming and outgoing rw-dependency edge. These flags can be maintained in the following way:

  • When a transaction TT is performing a read, it is able to detect whether there is a version of the same item that was written after the transaction started, e.g., by another transaction UU. This would imply a rw-dependency edge, so the algorithm can update T.outConflict and U.inConflict to true.

However, this will not detect cases where the write happens after the read. The algorithm uses a different mechanism to detect these cases, too.

  • Every transaction creates a read lock, called SIREAD lock, when it performs a read. As a result, when a transaction performs a write, it can read the existing SIREAD locks and detect concurrent transactions that have previously read the same item. It thus updates the same Boolean flags accordingly.

Note that these are a softer form of locks, since they do not block other transactions from operating, but exist mainly to signal data dependencies between them. This means the algorithm preserves its optimistic nature.

Preventing write-skew anomaly using SSI

The following illustration shows how this approach would prevent the write skew anomaly we have seen in the previous lesson.

When transaction T2T_2 executes the write operation, it checks for existing SIREAD locks on item I1I_1. T1T_1 holds such a lock, so transaction T2T_2 updates its inConflict flag to true. If both the inConflict and the outConflict flags for T2T_2 are true at this moment, this transaction is aborted.

For brevity, this explanation omitted some details of SSI. For further information, we can have a look at the related papers by Cahill et al.M. J. Cahill, U. Rohm, and A. D. Fekete, “Serializable Isolation for Snapshot Databases,” Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, 2008. and Ports et al.D. R. K. Ports and K. Grittner, “Serializable Snapshot Isolation in PostgreSQL,” Proceedings of the VLDB Endowment, Volume 5 Issue 12, August 2012, 2012.

Get hands-on with 1400+ tech skills courses.