Quorum-Based Commit Protocol
Let's probe how the quorum-based protocol solves the problem of the 3-phase commit protocol.
The problem with 3-phase commit protocol
As we observed in the previous lesson, the main issue with the 3PC protocol occurs at the end of the second phase, where a potential network partition can bring the system to an inconsistent state.
This can happen when participants attempt to unblock the protocol by taking the lead without having a picture of the overall system, resulting in a split-brain situation.
Coping with the problem
Ideally, we would like to cope with this network partition without compromising the safety of the protocol. This can be done using a concept we have already learned in this course: a quorum.
This approach is followed by the
This protocol is significantly more complex when compared to the other two protocols we described previously, so we should study the original paper carefully to examine all the possible edge cases. However, we will attempt to give a high-level overview of the protocol in this section.
As we mentioned before, this protocol leverages the concept of a quorum to ensure that different sides of a partition do not arrive at conflicting results.
The protocol establishes the concept of a commit quorum and an abort quorum .
A node can proceed with committing only if a commit quorum has been formed, while a node can proceed with aborting only if an abort quorum has been formed.
The values of the abort and commit quorums have to be selected so that the following property holds:
is the total number of participants of the transaction.
Based on the fact that a node can be in only one of the two quorums, it’s impossible for both quorums to be formed at two different sides of the partition and lead to conflicting results.
Sub-protocols in quorum-based commit protocol
The quorum-based commit protocol comprises three different sub-protocols used in different cases:
- The commit protocol, which is used when a new transaction starts
- The termination protocol, which is used when there is a network partition
- The merge protocol, which is used when the system recovers from a network partition
Commit protocol
This protocol is very similar to the 3PC protocol. The only difference is that the coordinator waits for number of acknowledgments at the end of the third phase to proceed with committing the transaction.
If there is a network partition at this stage, the coordinator can be rendered unable to complete the transaction. In this case, the participants on each side of a partition will investigate whether they can complete the transaction, using the following protocol.
Termination protocol
Initially, a (surrogate) coordinator will be selected from amongst the participants with a leader election.
Note that it’s irrelevant which leader election algorithm is used. Even if multiple leaders are elected, the correctness of the protocol is not violated.
The elected coordinator queries the nodes of the partition for their status.
If there is at least one participant that has committed (or aborted), the coordinator commits (or aborts) the transaction, maintaining the atomicity property.
If there is at least one participant at the prepare-to-commit state, and at least participants waiting for the result of the votes, the coordinator sends prepare-to-commit to the participants and continues to the next step.
Alternatively, if there’s no participant at the prepare-to-commit state and at least participants waiting for the results vote, the coordinator sends a prepare-to-abort message.
Note that this message does not exist in the commit protocol, but only in the termination protocol.
The last phase of the termination protocol waits for acknowledgments and attempts to complete the transaction in a similar fashion to the commit protocol.
Merge protocol
The merge protocol is simple. It includes a leader election amongst the leaders of the two partitions that are merged, and then the execution of the termination protocol we described.
Example
Let’s examine what would happen in the network partition example from the previous lesson. We have shown the same illustration below.
In this case, we have three participants , and we assume that the protocol would use quorums of size and . As a result, during the network partition, the participant on the left side of the partition is unable to form a commit quorum. On the other hand, the participants on the right side of the partition are able to form an abort quorum, and proceed with aborting the transaction, and assume no more partitions happen.
Later on, when the network partition recovers, the merge protocol executes. The merge protocol ensures that the participant from the left side of the partition also aborts the transaction, since the new coordinator identifies at least one node that aborted the transaction.
The following illustration contains a visualization of this execution. An interesting property of the protocol is that we can tune the values of the quorums , , and effectively adjust the protocol’s tendency to complete a transaction via commit or abort in the presence of a partition.
Conclusion
The quorum-based commit protocol satisfies the safety property that ensures that all participants will always arrive at the same decision (atomicity). It does not satisfy the liveness property that ensures that it will always make progress, since there are always degenerate, extreme failure cases (e.g., multiple, continuous, and small partitions). However, the quorum-based commit protocol is much more resilient that 2PC and other protocols, and can make progress in the most common types of failures.
Create a free account to view this lesson.
Continue your learning journey with a 14-day free trial.
By signing up, you agree to Educative's Terms of Service and Privacy Policy