Transactions, Storage Layout, and other Guarantees

Let’s take an overview of Kafka’s transactions, its physical storage layout, and the guarantees it provides.

Transactional client

Kafka provides a transactional client that allows producers to produce messages to multiple topic partitions atomically.

A transactional client also makes it possible to commit consumer offsets for a source topic in Kafka and produce messages to a destination topic in Kafka atomically. This makes it possible to provide exactly-once guarantees for an end-to-end pipeline. This is achieved through a two-phase commit protocol, where the brokers of the cluster play the role of the transaction coordinator in a highly available manner, using the same underlying mechanisms for partitioning, leader election, and fault-tolerant replication.
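
A minimal sketch of such a consume-transform-produce loop using Kafka’s Java client is shown below. The broker address, topic names, group id, and transactional id are placeholders, and a real application would need more careful error handling:

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.TopicPartition;

public class ExactlyOncePipeline {
    public static void main(String[] args) {
        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092"); // assumed address
        producerProps.put("transactional.id", "pipeline-1");      // enables transactions
        producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");
        consumerProps.put("group.id", "pipeline-group");          // hypothetical group
        consumerProps.put("enable.auto.commit", "false");         // offsets go through the transaction
        consumerProps.put("isolation.level", "read_committed");   // never re-read aborted data
        consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps);
             KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps)) {
            producer.initTransactions();                 // registers with the transaction coordinator
            consumer.subscribe(List.of("source-topic")); // hypothetical topic name

            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                if (records.isEmpty()) continue;

                producer.beginTransaction();
                try {
                    Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
                    for (ConsumerRecord<String, String> record : records) {
                        producer.send(new ProducerRecord<>("destination-topic", record.key(), record.value()));
                        offsets.put(new TopicPartition(record.topic(), record.partition()),
                                    new OffsetAndMetadata(record.offset() + 1));
                    }
                    // Commit the consumed offsets atomically with the produced messages.
                    producer.sendOffsetsToTransaction(offsets, consumer.groupMetadata());
                    producer.commitTransaction(); // the coordinator writes commit markers
                } catch (Exception e) {
                    producer.abortTransaction();  // read_committed consumers never see these writes
                    throw new RuntimeException(e);
                }
            }
        }
    }
}
```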

The coordinator stores the status of a transaction in a separate log. The messages contained in a transaction are stored in their own partitions as usual.

When a transaction is committed, the coordinator is responsible for writing a commit marker to the partitions containing the transaction’s messages and to the partitions storing the consumer offsets.

Consumers can also specify the isolation level they want to read under: read_committed or read_uncommitted. In the former case, messages that are part of a transaction become readable from a partition only after a commit marker has been produced for the associated transaction.
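
A minimal consumer configuration selecting the stricter isolation level might look like this (the broker address, group id, and topic name are placeholders):

```java
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ReadCommittedConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed address
        props.put("group.id", "readers");                 // hypothetical group
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // read_committed: poll() returns transactional records only once their
        // commit marker has been written; aborted records are filtered out.
        // The default, read_uncommitted, returns records immediately.
        props.put("isolation.level", "read_committed");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("destination-topic")); // hypothetical topic
            // ... poll loop as usual ...
        }
    }
}
```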

Physical storage of Kafka

The physical storage layout of Kafka is simple. Every log partition is implemented as a set of segment files of approximately the same size (e.g., 1 GB).
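
As an illustrative sketch (the topic name and offsets here are made up, but the naming convention is Kafka’s), the on-disk directory for one partition might look as follows, with each segment file named after the offset of the first message it contains:

```
/var/kafka-logs/orders-0/           <- partition 0 of a hypothetical topic "orders"
    00000000000000000000.log        <- first segment, starting at offset 0
    00000000000000000000.index      <- offset index for that segment
    00000000000001254876.log        <- next segment; its first message has offset 1254876
    00000000000001254876.index
```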

Every time a producer publishes a message to a partition, the broker appends the message to the last segment file. For better performance, segment files are flushed to disk only after a configurable number of messages have been published or a configurable amount of time has elapsed.

Note: This behavior is configurable through the values log.flush.interval.messages and log.flush.interval.ms. It is important to note that this behavior has implications for the durability guarantees discussed earlier, since some acknowledged records might temporarily reside only in the memory of all the in-sync replicas until they are flushed to disk.
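
For example, a broker’s server.properties could bound the amount of unflushed data like this (the property names are Kafka’s; the values are illustrative):

```properties
# Flush a partition's log to disk after this many accumulated messages...
log.flush.interval.messages=10000
# ...or after this much time has elapsed since the last flush, whichever comes first.
log.flush.interval.ms=1000
# By default, Kafka leaves flushing to the operating system and relies on
# replication for durability, so acknowledged records may live only in the
# page cache of the in-sync replicas until flushed.
```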

Each broker keeps in memory a sorted list of offsets, including the offset of the first message in every segment file.
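
A simplified model of the lookup this enables (not Kafka’s actual implementation) is a search over the sorted base offsets for the segment with the greatest base offset not exceeding the requested one, which Java’s TreeMap provides via floorEntry:

```java
import java.util.TreeMap;

public class SegmentIndex {
    // Maps each segment's base offset (the offset of its first message)
    // to the segment's file name. TreeMap keeps the keys sorted.
    private final TreeMap<Long, String> segments = new TreeMap<>();

    public void addSegment(long baseOffset, String fileName) {
        segments.put(baseOffset, fileName);
    }

    // The segment containing a given offset is the one with the greatest
    // base offset that is <= the requested offset.
    public String segmentFor(long offset) {
        var entry = segments.floorEntry(offset);
        if (entry == null) throw new IllegalArgumentException("offset before log start");
        return entry.getValue();
    }

    public static void main(String[] args) {
        SegmentIndex index = new SegmentIndex();
        index.addSegment(0L, "00000000000000000000.log");
        index.addSegment(1254876L, "00000000000001254876.log");
        System.out.println(index.segmentFor(37L));       // 00000000000000000000.log
        System.out.println(index.segmentFor(2000000L));  // 00000000000001254876.log
    }
}
```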

Kafka employs several other performance optimizations, such as using the sendfile API to send data to consumers, thus minimizing data copying and system calls.
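
On the JVM, sendfile is exposed through FileChannel.transferTo. A minimal sketch of the mechanism (not Kafka’s actual networking code; the method and parameter names are illustrative) looks like this:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroCopySend {
    // Transfer a slice of a segment file straight to a consumer's socket.
    // transferTo lets the kernel move bytes directly from the page cache to
    // the socket (sendfile), avoiding copies into user space.
    static void sendSlice(Path segment, SocketChannel socket,
                          long position, long count) throws IOException {
        try (FileChannel file = FileChannel.open(segment, StandardOpenOption.READ)) {
            long sent = 0;
            while (sent < count) {
                long n = file.transferTo(position + sent, count - sent, socket);
                if (n <= 0) break; // socket buffer full or end of file
                sent += n;
            }
        }
    }
}
```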

Guarantees provided by Kafka

Some of the guarantees provided by Kafka are the following:

  • Kafka appends the messages in the order they are sent by a producer to a particular topic’s partition. If a message M1 is sent by the same producer as a message M2, and M1 is sent first, then M1 will have a lower offset than M2 and appear earlier in the log.

Note: Ordering guarantees are provided only per partition. Users of Kafka can control partitioning, as described before, to leverage the ordering guarantees.

  • As explained earlier, Kafka can provide at-least-once, at-most-once, and exactly-once messaging semantics, depending on the configuration and the type of producers and consumers used.

  • The durability, availability, and consistency guarantees provided by Kafka depend on the specific configuration of the cluster. For example, a topic with a replication factor of N, min.insync.replicas of N/2 + 1, and acks=all guarantees zero data loss and availability of the cluster for up to N/2 failures, as sketched below.
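
A sketch of such a configuration using Kafka’s Admin API, with a hypothetical topic name and broker address:

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.NewTopic;

public class DurableTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed address

        try (Admin admin = Admin.create(props)) {
            // N = 3 replicas and min.insync.replicas = N/2 + 1 = 2: with acks=all,
            // every acknowledged write reaches at least 2 replicas, so one broker
            // can fail without data loss or loss of availability.
            NewTopic topic = new NewTopic("payments", 6, (short) 3) // hypothetical topic
                    .configs(Map.of("min.insync.replicas", "2"));
            admin.createTopics(List.of(topic)).all().get();
        }

        // The producer must also request acknowledgement from all in-sync replicas:
        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092");
        producerProps.put("acks", "all");
        producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // ... create a KafkaProducer with producerProps and send as usual ...
    }
}
```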
