Introduction to Kafka
Let's study the goal, concept, and architecture of Kafka in detail.
Apache Kafka is an open-source messaging system initially developed at LinkedIn and later donated to the Apache Software Foundation.
Goal of Kafka
The primary goals of Kafka are performance, scalability, durability, and availability.
Performance
The ability to exchange messages between systems with high throughput and low latency.
Scalability
The ability to incrementally scale to bigger volumes of data by adding more nodes to the system.
Durability & availability
The ability to provide durable and available data even in the presence of node failures.
Concept of Kafka
The central concept of Kafka is the topic.
Topic
A topic is an ordered collection of messages. For each topic, there can be multiple producers that write messages to it. There can also be multiple consumers that read messages from it.
Note that Kafka can support both the point-to-point and the publish-subscribe model depending on the number of consumers used.
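To make these concepts concrete, here is a minimal sketch using the standard Kafka Java client; the broker address, topic name, and group id are made-up examples. One producer appends a message to a topic, and one consumer subscribes to that topic and reads it back:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TopicExample {
    public static void main(String[] args) {
        // Producer: appends a message to the example "events" topic.
        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092");
        producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {
            producer.send(new ProducerRecord<>("events", "user-42", "signed-up"));
        }

        // Consumer: reads messages from the same topic as part of a consumer group.
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");
        consumerProps.put("group.id", "example-group");
        consumerProps.put("auto.offset.reset", "earliest");
        consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps)) {
            consumer.subscribe(Collections.singletonList("events"));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("offset=%d key=%s value=%s%n",
                        record.offset(), record.key(), record.value());
            }
        }
    }
}
```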
Achieving performance and scalability
To achieve performance and scalability, each topic is maintained as a partitioned log, stored across multiple nodes called brokers.
Each partition is an ordered, immutable sequence of messages, where each message is assigned a sequential id number called the offset, which uniquely identifies each message within the partition.
Messages by producers are always appended to the end of the log.
Consumers can consume records in any order they like by specifying the offset they want to read from, but usually, a consumer will advance its offset linearly as it reads records.
Note that Kafka provides some useful flexibility, allowing consumers to do things like replaying data starting from an older offset or skipping ahead and consuming only from the latest offset.
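As a sketch of this flexibility (the topic name, partition number, and offsets below are illustrative assumptions), the Java consumer API lets a consumer position itself explicitly within a partition:

```java
import java.util.Collections;
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.common.TopicPartition;

public class OffsetControl {
    // Illustrative helper: positions an already-configured consumer within
    // partition 0 of an example "events" topic before polling.
    static void position(Consumer<String, String> consumer, long offset) {
        TopicPartition partition = new TopicPartition("events", 0);
        consumer.assign(Collections.singletonList(partition));

        // Replay everything still retained, from the beginning of the partition...
        consumer.seekToBeginning(Collections.singletonList(partition));
        // ...or jump to a specific older offset...
        consumer.seek(partition, offset);
        // ...or skip ahead and consume only messages published from now on.
        consumer.seekToEnd(Collections.singletonList(partition));
    }
}
```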
The following illustration explains the structure of a Kafka topic:
The messages are stored durably by Kafka and retained for a configurable period of time, called the retention period, regardless of whether some client has consumed them.
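The retention period can be set per topic via the retention.ms configuration. A minimal sketch using the Kafka admin client, assuming an example topic and a seven-day retention:

```java
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTopicWithRetention {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // Example topic with 3 partitions, replication factor 1,
            // and a 7-day retention period (retention.ms is in milliseconds).
            NewTopic topic = new NewTopic("events", 3, (short) 1)
                    .configs(Map.of("retention.ms", "604800000"));
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}
```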
Architecture of Kafka
As previously explained, every log is partitioned across multiple servers in a Kafka cluster. Messages written by producers are distributed across these partitions.
Messages can be distributed in a round-robin fashion simply to balance the load. Alternatively, the producer can select the partition according to some semantic partitioning function (e.g., by hashing some attribute of the message) so that related messages are stored in the same partition.
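For illustration (the topic and keys are made up), the default Java producer hashes the record key to choose a partition, so all records with the same key end up in the same partition; a partition can also be specified explicitly:

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class PartitioningExample {
    // Assumes an already-configured KafkaProducer<String, String>.
    static void sendExamples(KafkaProducer<String, String> producer) {
        // No key: the producer spreads records across partitions to balance load.
        producer.send(new ProducerRecord<>("orders", null, "order created"));

        // With a key: the default partitioner hashes the key, so all records
        // for customer-17 land in the same partition, preserving their order.
        producer.send(new ProducerRecord<>("orders", "customer-17", "order shipped"));

        // Alternatively, the partition can be chosen explicitly (partition 2 here).
        producer.send(new ProducerRecord<>("orders", 2, "customer-17", "order delivered"));
    }
}
```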
Each consumer of a topic can consist of multiple consumer instances for increased performance, which are all identified by a common consumer group name.
Consumption is implemented in such a way that partitions in a log are divided over the consumer instances so that each instance is the exclusive consumer of an equal share of partitions. As a result, each message published to a topic is delivered to one consumer instance within each subscribing consumer group.
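A minimal sketch of a group member, with an illustrative group id: every instance started with the same group.id joins the same consumer group and receives an exclusive share of the topic's partitions, so each message is processed by only one of them. A process using a different group id would receive every message again, which is how the publish-subscribe model is obtained:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class GroupMember {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        // Every process started with this same group id joins the same consumer
        // group; Kafka divides the topic's partitions among the group's members.
        props.put("group.id", "order-processors");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("orders"));
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                    // Each record is seen by exactly one member of "order-processors".
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
            }
        }
    }
}
```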
Each partition is also replicated across a configurable number of nodes for fault tolerance.
Each partition has one node that acts as the leader and zero or more nodes that act as followers.
- The leader handles all read and write requests for the partition.
- The followers passively replicate the leader.
Note that if the leader fails, one of the followers will automatically detect that and become the new leader.
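To see this in practice, the admin client can report, for each partition of a topic, the current leader, the full replica set, and the in-sync replicas (the topic name is an example, and the exact admin API methods may vary slightly across client versions):

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.TopicDescription;
import org.apache.kafka.common.TopicPartitionInfo;

public class DescribePartitions {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            TopicDescription description =
                    admin.describeTopics(Collections.singletonList("orders")).all().get().get("orders");
            for (TopicPartitionInfo partition : description.partitions()) {
                // leader(): the broker serving reads and writes for this partition.
                // replicas(): all brokers holding a copy of the partition.
                // isr(): the replicas currently in sync with the leader.
                System.out.printf("partition=%d leader=%s replicas=%s isr=%s%n",
                        partition.partition(), partition.leader(),
                        partition.replicas(), partition.isr());
            }
        }
    }
}
```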
The following illustration explains Kafka's architecture:
Log replication and leader election
Kafka uses Zookeeper for various functions, such as leader election between the replica brokers and group membership of brokers and consumers.
Interestingly, log replication is separated from the key elements of the consensus protocol, such as leader election and membership changes. The latter are implemented via Zookeeper, while the former uses a primary-backup replication approach: the leader waits for followers to persist each message before acknowledging it to the client. For this purpose, Kafka has the concept of in-sync replicas (ISR).
In-sync replicas (ISR)
In-sync replicas (ISR) are replicas that have replicated committed records and are thus considered to be in-sync with the leader.
- In case a leader fails, only a replica in the ISR set is allowed to be elected as a leader. This guarantees zero data loss since any replica in the ISR set locally stores all the records acknowledged by the previous leader.
- If a follower in the ISR set is very slow and lags, the leader can evict that replica from the ISR set to make progress.
It is important that the ISR update is completed before proceeding, e.g., before acknowledging records that the new, smaller ISR set has persisted. Otherwise, there would be a risk of data loss if the leader failed after acknowledging these records but before updating the ISR set, since the slow follower could then be elected as the new leader even though it would be missing some acknowledged records.
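On the producer side, this guarantee is what acks=all relies on: the leader acknowledges a record only after the in-sync replicas have persisted it. A minimal sketch, with an illustrative topic; note that min.insync.replicas, which bounds how small the ISR set may shrink while still accepting writes, is a topic- or broker-level setting rather than a producer property:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class DurableProducer {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        // acks=all: the leader only acknowledges a record once all in-sync
        // replicas have persisted it. Combined with the topic-level setting
        // min.insync.replicas (e.g., 2), the write fails instead of silently
        // weakening durability when too few replicas are in sync.
        props.put("acks", "all");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            RecordMetadata metadata =
                    producer.send(new ProducerRecord<>("orders", "customer-17", "order paid")).get();
            System.out.printf("acknowledged at partition=%d offset=%d%n",
                    metadata.partition(), metadata.offset());
        }
    }
}
```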
Offsets
The leader maintains two offsets, the log end offset (LEO) and the high watermark (HW).
Log end offset (LEO)
The LEO indicates the last record stored locally at the leader, which may not have been replicated to the followers or acknowledged yet.
High watermark (HW)
HW indicates the last record that has been successfully replicated and can be acknowledged back to the client.
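From a consumer's point of view, only records up to the high watermark are visible. As a small sketch (the topic and partition are assumptions; with the default read_uncommitted isolation level, the end offset reported to consumers corresponds to the high watermark):

```java
import java.util.Collections;
import java.util.Map;
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.common.TopicPartition;

public class HighWatermarkCheck {
    // Illustrative helper: asks the broker how far the consumer could read.
    static long fetchReadableEnd(Consumer<String, String> consumer) {
        TopicPartition partition = new TopicPartition("orders", 0);
        // endOffsets() returns, per partition, the offset just past the last record
        // the consumer is allowed to read; records beyond the high watermark
        // (stored on the leader but not yet fully replicated) are not visible.
        Map<TopicPartition, Long> endOffsets =
                consumer.endOffsets(Collections.singletonList(partition));
        return endOffsets.get(partition);
    }
}
```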