Hello everyone! My name is Maxim, and I’m a QA at Maxilect.
Recently, my colleagues asked me to explain the basics of Kafka that can be useful when testing microservices communicating with each other or interacting with external resources. This article covers the main ideas of my explanation.
For those who have already worked on projects involving Kafka, this is unlikely to contain new information. I’ve tried to highlight the most common scenarios for beginners.
Disclaimer:
Here, we won’t discuss setting up Kafka. Typically, it’s set up and configured by developers on a project. All parameters are defined through configuration files, which testers usually don’t have access to. Instead, we work with what’s already in place — pre-created consumer groups, partitions, and so on.
If you need basic information about Kafka from a developer’s perspective, check out one of our previous articles.
Kafka Tools
The primary tool is the Command Line Interface (CLI), which comes with Kafka and can be downloaded from the official website (https://kafka.apache.org/downloads). Essentially, it’s a set of shell scripts, with the most commonly used ones being for producers, consumers, and viewing topics.
Nowadays, UI wrappers for these scripts with various features are widely used. In some ways, they make interaction more convenient, but it all depends on how Kafka is deployed in a specific project. Sometimes, testers have no influence over whether a UI will be deployed (or which one). For example, on my current project, one of the simplest tools is deployed — it has far fewer features than the scripts. There are also projects where, due to strict data consistency requirements, testers don’t have access to producers or consumers at all.
To some extent, this is justified. Imagine if we accidentally send an incorrect message and another service reads it. The whole team might end up searching for issues in other services. Thus, a multifunctional UI for Kafka is often unnecessary.
IDE Plugins
There are also many plugins available for IDEs. The official JetBrains plugin for IntelliJ IDEA is said to be convenient, but I haven’t used it myself. It seems to be available only in the Ultimate version of IDEA.
Main Scenarios for Using Kafka in Testing
In reality, access to Kafka is rarely needed in work. If logs are implemented in a project, many errors can be identified from them. However, there are situations where access to Kafka significantly speeds up testing or simplifies issue analysis.
A simple example is testing the backend of a service that receives data from external systems via Kafka once a day on a cron schedule. You extend the model and add support for additional attributes, but waiting another day to test the changes isn’t practical. It’s easier to send a test message with the required data yourself.
Key Kafka Testing Scenarios
Below, we’ll examine the main scenarios for working with Kafka during testing.
Viewing the List of Topics
The name of this case speaks for itself. This scenario is useful when analyzing problems — for example, when the service under test is not receiving any messages from Kafka. In such situations, it’s worth checking whether the topics have even been created in the cluster and whether they are named correctly. It’s also generally helpful for understanding the cluster structure.
To view the list of topics, you can use a shell script. For instance, here’s how you can view the list of topics on a local Kafka instance:
./kafka-topics.sh --list --bootstrap-server localhost:9092
This script can accept many arguments, but the basic ones are --list and --bootstrap-server, the server to which we are connecting.
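The same script can also show partition and replication details for a single topic, which helps when checking whether a topic is configured as expected. A minimal sketch, assuming a local broker and an example topic named quickstart-events (substitute your own):

```shell
# Show partition count, replication factor, partition leaders, and in-sync
# replicas (ISR) for one topic.
./kafka-topics.sh --describe --topic quickstart-events \
  --bootstrap-server localhost:9092
```
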
Writing a Message to a Topic
For testing purposes, it’s sometimes necessary to place messages into a topic. For example, if a consumer is already implemented in the project but the producer is either not implemented yet, not working correctly, or unavailable for some reason.
Writing messages to a topic may also be required when testing negative cases that the implemented producer cannot reproduce. For instance, you may need to send an invalid JSON with test data that the producer simply cannot generate.
To do this, you can use the following script:
./kafka-console-producer.sh --topic quickstart-events --bootstrap-server localhost:9092
This script opens an interactive prompt reading from stdin: each line you type is sent to the topic as a separate message, until you terminate the process (Ctrl+C).
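If the consumer under test keys off the message key (for example, for partitioning), the console producer can send keyed messages via its standard parse.key and key.separator properties. A sketch, again with quickstart-events as an assumed topic name:

```shell
# Send keyed messages: everything before ':' on a line is the key,
# everything after it is the value.
./kafka-console-producer.sh --topic quickstart-events \
  --bootstrap-server localhost:9092 \
  --property parse.key=true --property key.separator=:
# Then type lines such as (hypothetical payload):
#   user-42:{"name":"test","active":true}
```
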
Reading Messages from Topics
The reverse situation occurs when a producer is implemented and writes messages to a topic, but the consumer is either not implemented, not working correctly, or unavailable. For example, if you don’t have direct access to logs or need to read live messages directly from Kafka for some reason.
In this case, you can use the following script:
./kafka-console-consumer.sh --topic quickstart-events --from-beginning --bootstrap-server localhost:9092
This command will display all the messages present in the topic. The --from-beginning flag makes it read from the start of the topic; without it, only messages arriving after launch are shown. In real-world production systems, this script may produce a large volume of data, so it’s often wise to redirect the command output to a file and process it with filtering tools like grep.
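The dump-and-filter workflow might look like this sketch, where quickstart-events, the file name topic_dump.txt, and the search pattern are all assumed example values:

```shell
# Dump a bounded number of messages to a file, then filter locally.
# --max-messages makes the consumer exit on its own instead of waiting forever.
./kafka-console-consumer.sh --topic quickstart-events --from-beginning \
  --max-messages 1000 --bootstrap-server localhost:9092 > topic_dump.txt
# Keep only the messages that match a pattern of interest:
grep '"status": "ERROR"' topic_dump.txt
```
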
Pro Tip: Before running the command, consult with developers on how to properly read your specific topics implemented in Kafka. This reduces the likelihood of disrupting the application’s functionality.
Viewing the List of Consumers
Sometimes, it’s necessary to see the consumer groups that are currently reading messages from topics. For instance, you might need detailed information about partitions or the current offsets for each consumer. This can be useful when analyzing message delivery issues, such as identifying configuration errors where messages are being consumed by a different instance.
For example, the team has created a new version of an application, and during testing, you realize that messages are no longer arriving. Viewing the consumers can help you identify if an older version of the application is still active and consuming messages from your topic, updating the offsets in the process.
To list consumer groups, use the following command:
./kafka-consumer-groups.sh --list --bootstrap-server localhost:9092
To display more detailed information, you can use additional arguments such as --all-groups or --describe.
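For the stuck-consumer scenario above, the --describe output is particularly handy: it lists, per partition, the group’s current offset, the log-end offset, and the LAG column (the difference between the two), plus which consumer instance holds each partition. A sketch, assuming a group named test:

```shell
# Per-partition current offset, log-end offset, lag, and assigned consumer
# for one consumer group ('test' is an assumed group name).
./kafka-consumer-groups.sh --describe --group test \
  --bootstrap-server localhost:9092
```
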
Resetting Offsets
Resetting offsets is a relatively rare use case. For example:
- Reprocessing all messages: You need to re-read all messages from a topic.
- Skipping invalid data: You want to skip over invalid messages and jump to the most recent ones.
If invalid data is present and the service is configured not to commit the offset until a message is processed successfully (so the consumer gets “stuck” on the bad message), you can reset the offset with the --to-latest option to skip past the invalid messages and continue testing.
Another scenario might involve someone consuming messages from a topic, but something goes wrong (e.g., a table is accidentally deleted and needs to be repopulated). In this case, you would reset the offset to the earliest messages using the --to-earliest option and process them again.
Here’s how you can reset offsets:
./kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group test --topic TopicName --reset-offsets --to-earliest --execute
Use --to-latest instead of --to-earliest to jump to the newest messages. Note that without the --execute flag the command only performs a dry run: it prints the offsets it would set, but applies nothing.
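Besides --to-earliest and --to-latest, the script supports finer-grained reset targets. Keep in mind that offsets can only actually be reset while no consumer in the group is running. A sketch, with test and TopicName as assumed names:

```shell
# Preview (dry run) a reset two messages back, without applying anything:
./kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --group test --topic TopicName --reset-offsets --shift-by -2 --dry-run
# Other useful targets: --to-offset <n>, --to-datetime <ISO-8601 timestamp>
```
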
Final Notes
Of course, these are just a few of the most common use cases for Kafka in testing. If you think other scenarios should be included, feel free to share your suggestions in the comments!