Apache Kafka Client Libraries

Learn about the popular clients available for Kafka.

Kafka client libraries

Kafka clients communicate with brokers over a binary protocol on top of TCP that defines all client interactions as request-response messages. Rather than using this protocol directly, client applications rely on client libraries, available in multiple programming languages, that abstract away the protocol details and provide a high-level API for interacting with the Kafka cluster. This functionality is exposed through the Producer, Consumer, and Admin APIs. The Producer and Consumer APIs allow client applications to write data to and read data from Kafka topics, respectively, while the Admin API allows client applications to manage topics, brokers, and other Kafka cluster resources.
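To make the request-response framing concrete, here is a minimal sketch of what a client library does under the hood when it encodes a request. It builds an ApiVersions request (API key 18, version 0, which has no body fields) using the published Kafka request header layout. The function names, the correlation ID, and the client ID "demo" are illustrative assumptions, and nothing is sent over the network.

```python
import struct

API_VERSIONS_KEY = 18  # api_key of the ApiVersions request in the Kafka protocol


def encode_api_versions_request(correlation_id: int, client_id: str) -> bytes:
    """Encode an ApiVersions v0 request: size prefix + request header, empty body."""
    client_id_bytes = client_id.encode("utf-8")
    # Request header: api_key (int16), api_version (int16),
    # correlation_id (int32), client_id (int16 length + UTF-8 bytes).
    header = struct.pack(
        ">hhih", API_VERSIONS_KEY, 0, correlation_id, len(client_id_bytes)
    )
    payload = header + client_id_bytes  # ApiVersions v0 carries no body fields
    # Every request is framed by a 4-byte big-endian length prefix.
    return struct.pack(">i", len(payload)) + payload


def decode_response_header(frame: bytes) -> tuple:
    """Return (size, correlation_id) from the first 8 bytes of a response frame."""
    size, correlation_id = struct.unpack(">ii", frame[:8])
    return size, correlation_id


# Illustrative values only: correlation ID 7 and client ID "demo" are assumptions.
request = encode_api_versions_request(correlation_id=7, client_id="demo")
print(request.hex())
```

The broker echoes the correlation ID in its response, which is how a client matches responses to outstanding requests. A real client library layers connection management, retries, and version negotiation on top of this framing.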

Only the Java client is part of the core Kafka project. Clients for other languages are maintained separately by vendors and the open-source community. Some of the most popular ones are for C/C++, Python, Go, Node.js, and .NET.

Apache Kafka client library ecosystem

C/C++ client

librdkafka is a C/C++ client library for Kafka. This client is particularly important because several client libraries for other languages, such as .NET, Go, and Python, are built on top of it. Since these libraries are wrappers around librdkafka, they benefit from its performance and stability. Support for changes to the protocol or the Kafka cluster is implemented once in librdkafka and then becomes available to all the clients that use it.

This brings us to the concept of native and wrapper clients.

Native vs. wrapper clients

Native clients implement the Kafka protocol directly in the language they are intended for, without relying on shared libraries written in another language (for example, the Java client). Wrapper clients expose an API in the target language but delegate the protocol work to an underlying library written in a different language. For example, confluent-kafka-python offers a Python API and wraps around the librdkafka C library.

In the Kafka ecosystem, it’s common to see a mix of native and wrapper clients, even for the same language, and each kind has trade-offs. Wrapper clients carry an added dependency on librdkafka, which can make them harder to install and use, and every call pays some overhead for crossing into librdkafka. In return, they inherit its optimized C implementation and tend to pick up protocol changes and newer features sooner. Native clients are easier to install and use and avoid that call overhead, but their performance depends on the language and the quality of the implementation, and new features take longer to arrive because each one must be implemented again.

Here is a table highlighting these differences:

Native vs. Wrapper Client Libraries

| Attribute | Native library | Wrapper library |
|---|---|---|
| Ease of use | Easier to install and use | Can be harder to install and use because of the added librdkafka dependency |
| Latest features | Takes more time, since each new client-side Kafka feature needs its own native implementation | More likely to support newer features, since they are implemented once in librdkafka |
| Performance | No overhead from calling into another library; speed depends on the language and the quality of the implementation | Benefits from librdkafka's optimized C code, but each call carries the overhead of invoking librdkafka |

Java

The Java client is part of the core Kafka project. It will be covered in detail throughout the course.

Python

The Python ecosystem has the following two popular clients for Kafka.

  • kafka-python: This is a native Python client for Kafka. It is a pure Python implementation that does not depend on librdkafka. It is a mature client with a large community and is actively maintained. Its design mirrors that of the Java client, with added Pythonic interfaces such as consumer iterators.

  • confluent-kafka-python: This is a wrapper around librdkafka with high-level Producer, Consumer, and AdminClient API implementations. It is supported and actively maintained by Confluent.
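The difference in API style between the two clients can be sketched as follows. This is an illustrative comparison rather than a complete application: the helper name `consumer_settings` is hypothetical, the broker address, topic, and group names are whatever the caller passes in, and each function imports its library lazily so that only the client you actually call needs to be installed. Both functions loop until interrupted.

```python
def consumer_settings(bootstrap: str, group: str) -> dict:
    """Hypothetical helper: librdkafka-style settings with dotted keys."""
    return {
        "bootstrap.servers": bootstrap,
        "group.id": group,
        "auto.offset.reset": "earliest",
    }


def consume_with_kafka_python(topic: str, bootstrap: str, group: str) -> None:
    """Native client: configured with keyword arguments, consumed by iteration."""
    from kafka import KafkaConsumer  # pip install kafka-python

    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=bootstrap,
        group_id=group,
        auto_offset_reset="earliest",
    )
    # The consumer is an iterator: each step yields the next record.
    for record in consumer:
        print(record.partition, record.offset, record.value)


def consume_with_confluent_kafka(topic: str, bootstrap: str, group: str) -> None:
    """Wrapper client: configured with a librdkafka settings dict, consumed by polling."""
    from confluent_kafka import Consumer  # pip install confluent-kafka

    consumer = Consumer(consumer_settings(bootstrap, group))
    consumer.subscribe([topic])
    try:
        while True:
            msg = consumer.poll(1.0)  # wait up to one second for a message
            if msg is None:
                continue  # nothing arrived within the timeout
            if msg.error():
                print("consumer error:", msg.error())
                continue
            print(msg.partition(), msg.offset(), msg.value())
    finally:
        consumer.close()  # leave the consumer group cleanly
```

For example, `consume_with_kafka_python("demo-topic", "localhost:9092", "demo-group")` would read from a broker assumed to be running locally. The wrapper's dotted configuration keys are passed straight through to librdkafka, which is one visible sign of the native versus wrapper distinction described earlier.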

Go

Go is a popular language for working with Kafka, and several Kafka client libraries are available for it, including:

  • sarama: This is a native Go implementation of the Kafka protocol. It is a mature client with a large community and is actively maintained.

  • confluent-kafka-go: This is a lightweight wrapper around librdkafka and is supported and actively maintained by Confluent.

  • segmentio/kafka-go: This is a native client that provides both low-level and high-level APIs for interacting with Kafka. It mirrors concepts and implements interfaces from the Go standard library, which makes it easy to use and to integrate with existing software.

  • franz-go: This is another native client. It supports transactions, regex topic consuming, the latest partitioning strategies, data loss detection, closest replica fetching, and more. An interesting aspect of franz-go is that it uses code generation for its protocol implementation, which makes it easier to support Kafka protocol-level additions or modifications as they appear.

.NET

  • confluent-kafka-dotnet: This is a wrapper around librdkafka and is supported and actively maintained by Confluent. The confluent-kafka-dotnet wrapper is derived from the rdkafka-dotnet library. This client provides the following five packages (available via the NuGet package manager):

    • Confluent.Kafka: This is the core client.

    • Confluent.SchemaRegistry.Serdes.Avro: This provides a serializer and deserializer for working with the Avro serialized data and Confluent Schema Registry integration.

    • Confluent.SchemaRegistry.Serdes.Protobuf: This provides a serializer and deserializer for working with the Protobuf serialized data and Confluent Schema Registry integration.

    • Confluent.SchemaRegistry.Serdes.Json: This provides a serializer and deserializer for working with JSON serialized data and Confluent Schema Registry integration.

    • Confluent.SchemaRegistry: This is a Confluent Schema Registry client (a dependency of the Confluent.SchemaRegistry.Serdes packages).

Node.js

There are a couple of clients for Node.js as well: kafka-node and kafkajs. The kafka-node client is outdated and no longer maintained. The kafkajs client, on the other hand, is a widely used and actively maintained native client. It supports producers, consumer groups (with pause, resume, and seek), and transactions, as well as AWS IAM-based authentication.

Conclusion

In this lesson, we explored the different types of Kafka clients (native and wrapper) and some widely used libraries across popular programming languages.