Distributed Transactions via Consensus

The introduction of this chapter mentioned that the consensus problem is very similar to the distributed transactions problem.

However, after studying the Paxos algorithm, one might think there seems to be a fundamental conflict between distributed transactions and the way Paxos solves the consensus problem.

The core characteristic of distributed transactions

The core characteristic of distributed transactions is atomicity. Either the relevant update has to be performed in all the nodes, or it should not be performed in any of them.

Difference between transaction problem and consensus problem

However, the Paxos algorithm relies on just a majority quorum to decide on a value. According to HadzilacosV. Hadzilacos, “On the Relationship between the Atomic Commitment and Consensus Problems,” Fault-Tolerant Distributed Computing, November 1990, pages 201–208, 1990., “Indeed, the problem of distributed transactions, known as atomic commit, and the consensus problem might be closely related. Still, they are not equivalent”.

  • The consensus problem mandates that every non-faulty node must reach the same decision, while the atomic commit problem requires that all the nodes (faulty or not) must reach the same decision.

  • The atomic commit problem imposes stricter relationships between votes or proposals and the final decision than the consensus problem.

  • In consensus, the only requirement is that the value that is agreed must have been proposed by at least one of the nodes. In atomic commit, a decision can be positive only if all the votes were positive. The decision is also required to be positive if all votes are positive and there are no failures.

As a result of this difference, one might think that the Paxos algorithm does not have anything to offer in the problem space of distributed transactions. This is not true, and in this lesson, we will try to illustrate what Paxos (and any other consensus algorithm) has to offer.

Biggest contribution of a consensus algorithm

One useful contribution of a consensus algorithm is that it communicates the resource managers’ results back to the transaction manager, which requires successful communication for all of them and not just a majority.

However, its true value is storing and transmitting the transaction’s result back to the resource managers in a fault-tolerant way so that the failure of a single node (the transaction manager) cannot block the system.

Achieving the contribution

Indeed, there is a very simple way to achieve this goal in the existing 2-phase commit (2PC) protocol, by leveraging a consensus algorithm.

Assuming we make use of Paxos as a consensus algorithm, we could have the transaction manager start a new Paxos instance, proposing a value for the result of the transaction, instead of just storing the result locally before sending it back to the resource managers.

The proposal value would be either commit or abort, depending on the previous results of each one of the resource managers. This adjustment on its own would make the 2-phase commit protocol resilient against failures of the transaction manager since another node could take the role of the transaction manager and complete the protocol. That node would have to read the result of the transaction from any existing Paxos instance. If there’s no decision, that node would be free to make an abort proposal.

Additional message round

The above method is simple and elegant, but it would require adding one more messaging round to the 2-phase commit protocol.

It’s actually possible to remove the additional message round, trading off some simplicity for increased performance.

Removing additional message round

We could remove additional message round by essentially “weaving” several instances of Paxos in the plain 2-phase commit protocol, practically obviating the need for a transaction manager completely.

More specifically, the resource managers would have to send their response to the first phase to a set of acceptors, instead of sending it to the transaction manager. This creates a separate Paxos instance for every resource manager involved in the transaction.

Similarly, the acceptors could propagate the chosen values to the resource managers directly, instead of doing so indirectly via the transaction manager.

The resource managers would be responsible for checking that all the Paxos instances from the other resource managers had a positive result (corresponding to the first phase of 2PC) to commit the transaction.

A paper titled Consensus on transaction commitJ. Gray and L. Lamport, “Consensus on Transaction Commit,” ACM Transactions on Database Systems (TODS), Volume 31 Issue 1, March 2006, 2006. examines this relationship between distributed transactions and consensus and explains this approach in much greater detail, referred to as Paxos commit. This paper also demonstrates why 2-phase commit is essentially a special case of Paxos commit with zero tolerance of node failures (f=0f = 0).

Get hands-on with 1400+ tech skills courses.