Kafka's Messaging Guarantees
Let's explore the messaging guarantees provided by Kafka.
Kafka can provide at-least-once, at-most-once and exactly-once messaging guarantees through different configurations. Let’s explore each one of them separately:
At-most-once semantics
At-most-once semantics is achieved on the producer side by disabling any retries. If a write fails (e.g., due to a TimeoutException), the producer will not retry the request, so the message might or might not be delivered, depending on whether it had reached the broker. However, this guarantees that the message cannot be delivered more than once.
In a similar vein, consumers commit message offsets before they process the messages. Consequently, each message is processed once in the happy path. However, if the consumer fails after committing the offset but before processing the message, the message will never be processed.
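The commit-before-process ordering can be illustrated with a small simulation. This is only a sketch in plain Python, not Kafka client code: the log is a list, the committed offset is an integer, and the names consume_at_most_once and crash_at are hypothetical helpers introduced for illustration.

```python
class CrashBeforeProcessing(Exception):
    """Simulated consumer failure (hypothetical, for illustration only)."""


def consume_at_most_once(log, committed_offset, crash_at=None):
    """Commit each offset BEFORE processing the message at that offset.

    Returns (new_committed_offset, processed_messages).
    """
    processed = []
    offset = committed_offset
    try:
        while offset < len(log):
            message = log[offset]
            offset += 1
            committed_offset = offset      # step 1: commit the next offset to read
            if crash_at == offset - 1:
                raise CrashBeforeProcessing()
            processed.append(message)      # step 2: process the message
    except CrashBeforeProcessing:
        pass  # the consumer dies; its committed offset survives
    return committed_offset, processed


log = ["m0", "m1", "m2"]
# The first run crashes after committing past m1 but before processing it.
committed, first_run = consume_at_most_once(log, 0, crash_at=1)
# The restarted consumer resumes from the committed offset, so m1 is skipped.
committed, second_run = consume_at_most_once(log, committed)
print(first_run + second_run)  # ['m0', 'm2']
```

No message is processed twice, but m1 is lost, which is exactly the trade-off of at-most-once delivery.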
At-least-once semantics
At-least-once semantics is achieved by enabling retries for producers. Since failed requests will now be retried, a message might be delivered more than once to the broker leading to duplicates. However, it’s guaranteed to be delivered at least once.
Note: We are assuming infinite retries here. In practice, retries are usually capped at a maximum threshold, in which case a message might not be delivered if this limit is exhausted.
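To make the effect of producer retries concrete, here is a minimal simulation in plain Python. It does not use a Kafka client: FlakyBroker, send, and the outcome labels are hypothetical names, and the broker is modeled as a list that either loses a request or loses the acknowledgment of a successful write.

```python
class FlakyBroker:
    """Toy broker that can lose a request or lose the ack of a successful write."""

    def __init__(self, outcomes):
        self.log = []
        # One scripted outcome per incoming request: "ok", "lost_request", "lost_ack".
        self.outcomes = list(outcomes)

    def handle(self, message):
        outcome = self.outcomes.pop(0) if self.outcomes else "ok"
        if outcome == "lost_request":
            raise TimeoutError("request never reached the broker")
        self.log.append(message)
        if outcome == "lost_ack":
            raise TimeoutError("write succeeded but the ack was lost")


def send(broker, message, max_retries):
    """Try once, then retry up to max_retries times. Returns True if acked."""
    for _ in range(max_retries + 1):
        try:
            broker.handle(message)
            return True
        except TimeoutError:
            continue
    return False


# Retries disabled (at-most-once): a lost request means a lost message.
no_retry = FlakyBroker(["lost_request"])
print(send(no_retry, "m", max_retries=0), no_retry.log)    # False []

# Retries enabled (at-least-once): a lost ack leads to a duplicate.
with_retry = FlakyBroker(["lost_ack"])
print(send(with_retry, "m", max_retries=3), with_retry.log)  # True ['m', 'm']

# Bounded retries: the message is dropped once the limit is exhausted.
exhausted = FlakyBroker(["lost_request", "lost_request"])
print(send(exhausted, "m", max_retries=1), exhausted.log)   # False []
```

The producer cannot tell a lost request from a lost acknowledgment, which is why retrying is safe against loss but not against duplication.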
On the consumer side, the message is processed first and the offset is committed afterward. This means that a message could be processed multiple times if the consumer fails after processing it but before committing the offset.
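The process-then-commit ordering can be sketched the same way. Again, this is a plain-Python simulation rather than Kafka client code, and consume_at_least_once and crash_after_processing are hypothetical names used only for illustration.

```python
def consume_at_least_once(log, committed_offset, crash_after_processing=None):
    """Process each message BEFORE committing its offset.

    Returns (new_committed_offset, processed_messages).
    """
    processed = []
    offset = committed_offset
    while offset < len(log):
        processed.append(log[offset])       # step 1: process the message
        if crash_after_processing == offset:
            # Simulated crash: the commit for this message never happens.
            return committed_offset, processed
        offset += 1
        committed_offset = offset           # step 2: commit the offset
    return committed_offset, processed


log = ["m0", "m1", "m2"]
# The first run crashes after processing m1 but before committing its offset.
committed, first_run = consume_at_least_once(log, 0, crash_after_processing=1)
# The restarted consumer resumes from the last committed offset and sees m1 again.
committed, second_run = consume_at_least_once(log, committed)
print(first_run + second_run)  # ['m0', 'm1', 'm1', 'm2']
```

Every message is processed, but m1 is processed twice, so the processing logic has to tolerate duplicates.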
Exactly-once semantics
Exactly-once semantics is achieved by using the idempotent producer provided by Kafka. This producer is assigned a unique identifier (PID) and tags every message with a sequence number. In this way, the broker can keep track of the largest sequence number it has seen per PID and reject any message whose sequence number it has already accepted, i.e., a duplicate caused by a retry.
The consumers can store the committed offsets in Kafka or in an external data store. If the offsets are stored in the same data store as the side-effects of the message processing, the offsets can be committed atomically with the side-effects, thus providing exactly-once guarantees.
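The broker-side duplicate check can be sketched as follows. This is a simplified model in plain Python, not Kafka's implementation: DedupBroker and its fields are hypothetical names, and the sketch only rejects sequence numbers that were already seen. A real broker tracks this state per partition as well and also rejects gaps in the sequence.

```python
class DedupBroker:
    """Toy broker that remembers the highest sequence number per producer ID."""

    def __init__(self):
        self.log = []
        self.last_seq = {}  # pid -> highest sequence number appended so far

    def append(self, pid, seq, payload):
        # A sequence number at or below the highest one seen is a retry duplicate.
        if seq <= self.last_seq.get(pid, -1):
            return "duplicate"
        self.log.append(payload)
        self.last_seq[pid] = seq
        return "appended"


broker = DedupBroker()
pid = 42  # hypothetical producer ID
results = [
    broker.append(pid, 0, "order-created"),
    broker.append(pid, 0, "order-created"),  # retry after a lost ack
    broker.append(pid, 1, "order-paid"),
]
print(results)     # ['appended', 'duplicate', 'appended']
print(broker.log)  # ['order-created', 'order-paid']
```

Because the retry carries the same sequence number as the original request, the producer can retry freely without writing the message twice.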
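As an illustration of committing offsets atomically with side-effects, the sketch below uses an in-memory SQLite database as the external data store. The table names, the run_consumer helper, and the running-total side-effect are all assumptions made for this example. The input log is a plain list standing in for a Kafka partition.

```python
import sqlite3


class SimulatedCrash(Exception):
    """Hypothetical failure injected between the side-effect and the offset update."""


def run_consumer(conn, log, crash_at=None):
    """Apply each message and advance the stored offset in ONE transaction."""
    offset = conn.execute(
        "SELECT next_offset FROM offsets WHERE partition_id = 0"
    ).fetchone()[0]
    while offset < len(log):
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE balance SET total = total + ? WHERE id = 1", (log[offset],)
            )
            if crash_at == offset:
                raise SimulatedCrash()
            conn.execute(
                "UPDATE offsets SET next_offset = ? WHERE partition_id = 0",
                (offset + 1,),
            )
        offset += 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE balance (id INTEGER PRIMARY KEY, total INTEGER)")
conn.execute("CREATE TABLE offsets (partition_id INTEGER PRIMARY KEY, next_offset INTEGER)")
conn.execute("INSERT INTO balance VALUES (1, 0)")
conn.execute("INSERT INTO offsets VALUES (0, 0)")
conn.commit()

log = [10, 20, 30]
try:
    run_consumer(conn, log, crash_at=1)  # fails midway through the second message
except SimulatedCrash:
    pass
run_consumer(conn, log)                  # restart resumes from the stored offset

total = conn.execute("SELECT total FROM balance WHERE id = 1").fetchone()[0]
next_offset = conn.execute("SELECT next_offset FROM offsets").fetchone()[0]
print(total, next_offset)  # 60 3
```

The crash rolls back the partial update for the second message, so after the restart each message contributes to the total exactly once.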
Differentiate between the at-most-once, at-least-once, and exactly-once semantics in Kafka. Focus on the key difference between the three types of semantics, and write your answer with proper reasoning.