Apache Kafka is an open-source stream-processing platform and message broker. In this article, you will learn how to install it on a CentOS / RHEL 8 server.
Table of Contents:
- What is Apache Kafka?
- Environment Specification
- Updating Software Packages in Linux Server
- Installing Java Development Kit (JDK) on Linux Server
- Installing Apache Kafka Server on CentOS 8
- Create a Topic in Apache Kafka Server
- Conclusion
What is Apache Kafka?
Kafka is an open-source stream-processing software platform developed by the Apache Software Foundation, written in Scala and Java. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds.
Kafka can connect to external systems (for data import/export) via Kafka Connect and provides Kafka Streams, a Java stream processing library.
Kafka uses a binary TCP-based protocol that is optimized for efficiency and relies on a "message set" abstraction that naturally groups messages together to reduce the overhead of the network roundtrip. This "leads to larger network packets, larger sequential disk operations, contiguous memory blocks which allows Kafka to turn a bursty stream of random message writes into linear writes." (Source: Wikipedia)
Environment Specification:
We are using a minimal CentOS 8 KVM machine with the following specifications.
- CPU - 3.4 GHz (2 cores)
- Memory - 2 GB
- Storage - 20 GB
- Operating System - CentOS 8.2
- Hostname - kafka-01.centlinux.com
- IP Address - 192.168.116.234/24
Update Linux Software Packages:
Connect to kafka-01.centlinux.com as the root user with an SSH client.
Update the installed software packages on your Linux operating system. We are using CentOS Linux in this installation guide, so you can use the dnf command for this purpose.
# dnf update -y
Check the kernel and operating system versions. The output below shows the versions used in this installation guide.
# uname -r
4.18.0-193.28.1.el8_2.x86_64
# cat /etc/redhat-release
CentOS Linux release 8.2.2004 (Core)
Installing Java Development Kit (JDK) on Linux Server:
Apache Kafka is written in Scala and Java and runs on the Java Virtual Machine, so it requires Java 8 or later.
OpenJDK 11 is available in the standard CentOS 8 repositories, so you can install it by executing the following command.
# dnf install -y java-11-openjdk
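Before continuing, it can be worth confirming that the installed runtime meets the Java 8 minimum mentioned above. The following is a minimal sketch rather than an official check: the helper name java_major is hypothetical, and it assumes the version string has the usual "1.8.0_272" or "11.0.9" shape.

```shell
# Hypothetical helper: extract the major Java version from a version string.
# Legacy strings such as "1.8.0_272" carry the major version in the second field;
# newer strings such as "11.0.9" carry it in the first field.
java_major() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;
        *)   echo "$1" | cut -d. -f1 ;;
    esac
}

# Example usage (requires java on the PATH; "java -version" prints to stderr):
#   ver=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p')
#   [ "$(java_major "$ver")" -ge 8 ] && echo "Java $ver is new enough for Kafka"
```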
Installing Apache Kafka Server on CentOS / RHEL 8:
Kafka is distributed under the Apache License 2.0, so you can download it freely from the project's official website.
Copy the URL of the Apache Kafka version you need from the Kafka downloads page.
Use the copied URL with the wget command to download Apache Kafka directly from the Linux command line.
# cd /tmp
# wget https://downloads.apache.org/kafka/2.6.0/kafka_2.13-2.6.0.tgz
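It is good practice to verify the tarball before extracting it. The sketch below compares the file's SHA-512 digest with a published one. The helper name verify_sha512 is hypothetical, and both the checksum URL in the usage comment and the layout of the checksum file are assumptions that you should confirm on the downloads page.

```shell
# Hypothetical helper: succeed only if FILE matches the digest stored in SUMFILE.
# Assumption: SUMFILE is either plain "sha512sum" output or the "gpg --print-md"
# layout (file name, colon, upper-case hex groups spread over several lines).
verify_sha512() {
    vs_file=$1
    vs_sumfile=$2
    # Strip an optional "name:" prefix and all whitespace, lower-case the hex,
    # and keep the first 128 characters (the length of a SHA-512 digest).
    vs_expected=$(sed 's/^[^:]*://' "$vs_sumfile" | tr -d '[:space:]' | tr 'A-F' 'a-f' | cut -c1-128)
    vs_actual=$(sha512sum "$vs_file" | cut -d' ' -f1)
    [ "$vs_expected" = "$vs_actual" ]
}

# Example usage (the .sha512 URL is assumed to sit next to the tarball):
#   cd /tmp
#   wget https://downloads.apache.org/kafka/2.6.0/kafka_2.13-2.6.0.tgz.sha512
#   verify_sha512 kafka_2.13-2.6.0.tgz kafka_2.13-2.6.0.tgz.sha512 && echo "checksum OK"
```

If the function returns a non-zero status, download the tarball again instead of extracting it.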
Extract the downloaded tarball with the tar command.
# tar xzf kafka_2.13-2.6.0.tgz
Now, move the extracted files to the /opt/kafka directory.
# mv kafka_2.13-2.6.0 /opt/kafka
Current versions of the Kafka server require the ZooKeeper service for distributed configuration. However, the Kafka documentation mentions that
"Soon, ZooKeeper will no longer be required by Apache Kafka."
For now, however, you have to set up the Apache ZooKeeper service before the Kafka server.
The ZooKeeper scripts are bundled with the Kafka distribution, and you can use them to run and configure the ZooKeeper server.
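Before wiring ZooKeeper into systemd, it can help to look at the settings in the bundled configuration files. The sketch below uses a hypothetical helper named prop_get to read a single key from a Java-style properties file. In the stock files the data directories typically point to locations under /tmp, which is an assumption worth checking on your own system, because such directories may not survive a reboot.

```shell
# Hypothetical helper: print the value of KEY from a Java-style .properties file.
# Commented-out lines are ignored; if the key appears twice, the last entry wins.
prop_get() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

# Example usage against the files shipped with Kafka:
#   prop_get /opt/kafka/config/zookeeper.properties clientPort
#   prop_get /opt/kafka/config/zookeeper.properties dataDir
#   prop_get /opt/kafka/config/server.properties log.dirs
```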
Create a systemd service unit for Apache ZooKeeper.
# cd /opt/kafka/
# vi /etc/systemd/system/zookeeper.service
Add the following directives to this file.
[Unit]
Description=Apache Zookeeper server
Documentation=http://zookeeper.apache.org
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
ExecStart=/usr/bin/bash /opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper.properties
ExecStop=/usr/bin/bash /opt/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
Similarly, create a systemd service unit for the Kafka server.
# vi /etc/systemd/system/kafka.service
Add the following directives to it.
[Unit]
Description=Apache Kafka Server
Documentation=http://kafka.apache.org/documentation.html
Requires=zookeeper.service

[Service]
Type=simple
Environment="JAVA_HOME=/usr/lib/jvm/jre-11-openjdk"
ExecStart=/usr/bin/bash /opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
ExecStop=/usr/bin/bash /opt/kafka/bin/kafka-server-stop.sh

[Install]
WantedBy=multi-user.target
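One detail about the Kafka unit is worth knowing: Requires= pulls in zookeeper.service but does not by itself order the two units, because systemd handles ordering through After= and Before=. If you would like Kafka to start only after ZooKeeper has been started, an optional drop-in can add that ordering. The sketch below is one way to do it. The helper name and the drop-in file name are hypothetical.

```shell
# Hypothetical helper: write a drop-in that orders kafka.service after zookeeper.service.
# The optional first argument is the systemd unit directory (default: /etc/systemd/system).
write_kafka_ordering_dropin() {
    dropin_dir="${1:-/etc/systemd/system}/kafka.service.d"
    mkdir -p "$dropin_dir" &&
    cat > "$dropin_dir/10-after-zookeeper.conf" <<'EOF'
[Unit]
After=zookeeper.service
EOF
}

# Example usage (as root), followed by a reload so that systemd reads the drop-in:
#   write_kafka_ordering_dropin && systemctl daemon-reload
```

A drop-in leaves the main unit file untouched, so the extra ordering can be removed later by deleting that one file.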
Enable and start Apache Zookeeper and Kafka services.
# systemctl daemon-reload
# systemctl enable --now zookeeper.service
Created symlink /etc/systemd/system/multi-user.target.wants/zookeeper.service → /etc/systemd/system/zookeeper.service.
# systemctl enable --now kafka.service
Created symlink /etc/systemd/system/multi-user.target.wants/kafka.service → /etc/systemd/system/kafka.service.
Verify the status of Apache Kafka service.
# systemctl status kafka.service
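The status command shows whether the process is running, but not whether the broker is already accepting connections. A minimal sketch of a port check follows. The helper names are hypothetical, and port 2181 in the usage comment assumes the default clientPort from the bundled zookeeper.properties file.

```shell
# Hypothetical helper: succeed if a TCP connection to HOST:PORT can be opened.
# It relies on bash's /dev/tcp redirection, so bash must be installed.
port_open() {
    timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Poll HOST:PORT once per second until it opens or the attempts (default 30) run out.
wait_for_port() {
    tries=${3:-30}
    while [ "$tries" -gt 0 ]; do
        port_open "$1" "$2" && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Example usage:
#   wait_for_port localhost 2181 && wait_for_port localhost 9092 && echo "Kafka is reachable"
```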
Create a Topic in Apache Kafka Server:
Create a topic in your Apache Kafka server.
# /opt/kafka/bin/kafka-topics.sh --create --topic centlinux --bootstrap-server localhost:9092
Created topic centlinux.
To view the details of the topic, run the following script at the Linux command line.
# /opt/kafka/bin/kafka-topics.sh --describe --topic centlinux --bootstrap-server localhost:9092
Topic: centlinux PartitionCount: 1 ReplicationFactor: 1 Configs: segment.bytes=1073741824
Topic: centlinux Partition: 0 Leader: 0 Replicas: 0 Isr: 0
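The topic has one partition and a replication factor of 1 because the create command passed neither --partitions nor --replication-factor, so the broker defaults apply. If you want to read these values in a script, the sketch below parses the describe output. The helper name topic_field is hypothetical, and the parsing assumes the "Label: value" layout shown above.

```shell
# Hypothetical helper: read "kafka-topics.sh --describe" output on stdin and
# print the value that follows LABEL on the first line that contains it.
topic_field() {
    sed -n "s/.*$1:[[:space:]]*\([^[:space:]]*\).*/\1/p" | head -n 1
}

# Example usage:
#   /opt/kafka/bin/kafka-topics.sh --describe --topic centlinux \
#       --bootstrap-server localhost:9092 | topic_field PartitionCount
```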
Add some sample events to your topic.
# /opt/kafka/bin/kafka-console-producer.sh --topic centlinux --bootstrap-server localhost:9092
>This is the First event.
>This is the Second event.
>This is the Third event.
>^C#
To view all the events that have been written to a topic, execute the following script at the Linux command line.
# /opt/kafka/bin/kafka-console-consumer.sh --topic centlinux --from-beginning --bootstrap-server localhost:9092
This is the First event.
This is the Second event.
This is the Third event.
^CProcessed a total of 3 messages
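The producer and consumer sessions above are interactive and end with Ctrl+C. For scripted smoke tests, the same tools can be driven without a terminal. The sketch below is one possible approach, not the only one: the function names and the KAFKA_HOME and BOOTSTRAP variables are hypothetical, and it relies on the console consumer's --max-messages option to exit after a fixed number of events.

```shell
# Assumed locations; adjust them if your installation differs.
KAFKA_HOME="${KAFKA_HOME:-/opt/kafka}"
BOOTSTRAP="${BOOTSTRAP:-localhost:9092}"

# Hypothetical helper: send every line read from stdin to TOPIC as one event.
produce_lines() {
    bash "$KAFKA_HOME/bin/kafka-console-producer.sh" \
        --topic "$1" --bootstrap-server "$BOOTSTRAP"
}

# Hypothetical helper: print COUNT events from the beginning of TOPIC, then exit.
consume_n() {
    bash "$KAFKA_HOME/bin/kafka-console-consumer.sh" \
        --topic "$1" --from-beginning --max-messages "$2" \
        --bootstrap-server "$BOOTSTRAP"
}

# Example usage:
#   printf 'This is the Fourth event.\n' | produce_lines centlinux
#   consume_n centlinux 4
```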
Apache Kafka is now installed on CentOS / RHEL 8, and the broker is listening on port 9092.
Conclusion:
In this article, you have learned how to install the Apache Kafka server on CentOS / RHEL 8. To deepen your knowledge of this area, we recommend reading Kafka: The Definitive Guide: Real-Time Data and Stream Processing at Scale (1st Edition), published by O'Reilly Media.