How to install Cassandra on CentOS 8

In this guide, you will learn how to install Cassandra on CentOS 8 and configure initial security.

What is Apache Cassandra?

Apache Cassandra is a free and open-source, distributed, wide column store, NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Cassandra offers robust support for clusters spanning multiple datacenters, with asynchronous masterless replication allowing low latency operations for all clients. (Courtesy: Wikipedia)

Cassandra was initially developed in Java at Facebook to power its Inbox search feature. Facebook released Cassandra as an open-source project on Google Code in July 2008. In March 2009, it became an Apache Incubator project, and on February 17, 2010, it graduated to a top-level project.

Cassandra is now maintained by the Apache Software Foundation and distributed under the Apache License 2.0.

Features in Apache Cassandra:

The main features of this NoSQL database management system are:

  • Distributed – Every node in the cluster has the same role
  • Supports replication, including multi-datacenter replication
  • Highly Scalable
  • Fault-tolerant – Data is automatically replicated to multiple nodes for fault-tolerance
  • MapReduce support – Cassandra has Hadoop integration, with MapReduce support
  • Query language – Cassandra introduced the Cassandra Query Language (CQL)

Environment Specification:

We are using a KVM-based CentOS 8 virtual machine with the following specifications.

  • CPU – 3.4 GHz (2 cores)
  • Memory – 2 GB
  • Storage – 20 GB
  • Operating System – CentOS Linux 8.2
  • Hostname – cassandra-01.centlinux.com
  • IP Address – 192.168.116.206/24

Update Linux Software Packages:

Connect to cassandra-01.centlinux.com as the root user by using an SSH client.

As a best practice, update the existing software packages on your Linux operating system.

# dnf update -y

Verify the version of the active Linux kernel by using the uname command.

# uname -r
4.18.0-193.6.3.el8_2.x86_64

Verify the version of your Linux operating system.

# cat /etc/redhat-release
CentOS Linux release 8.2.2004 (Core)

Install Cassandra Yum Repository:

The Apache Software Foundation provides an official yum repository for each release series of Cassandra.

Add the yum repository as described on the Cassandra download page.

Create a repo file by using the vi text editor.

# vi /etc/yum.repos.d/cassandra.repo

Add the following directives to this file.

[cassandra]
name=Apache Cassandra
baseurl=https://downloads.apache.org/cassandra/redhat/311x/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://downloads.apache.org/cassandra/KEYS

Here, 311x refers to the Apache Cassandra release series, i.e. 3.11.

We are using it because it is the latest version at the time of this writing. If you want to install another version of Apache Cassandra, update the version number in the repo file accordingly.
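If you prefer to create the repo file without opening an editor, the same directives can be written by a small shell function. This is only a sketch: write_cassandra_repo is a hypothetical helper name, and the second argument (the target path) is an assumption added so that you can try it on a scratch file first.

```shell
# Sketch: generate the Cassandra repo file for a given release series.
# write_cassandra_repo is a hypothetical helper, not part of any package.
# Usage: write_cassandra_repo <series> [target_file]
write_cassandra_repo() {
    series="$1"                                      # e.g. 311x for Cassandra 3.11
    target="${2:-/etc/yum.repos.d/cassandra.repo}"   # default path used in this guide
    cat > "$target" <<EOF
[cassandra]
name=Apache Cassandra
baseurl=https://downloads.apache.org/cassandra/redhat/${series}/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://downloads.apache.org/cassandra/KEYS
EOF
}

# Example (run as root):
#   write_cassandra_repo 311x
```

Passing a different series name only changes the last component of baseurl; whether that directory actually exists on the download server is something you need to check yourself.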

Build the yum cache for the newly added repository. Accept the GPG keys if asked to do so.

# dnf makecache
CentOS-8 - AppStream                            7.3 kB/s | 4.3 kB     00:00
CentOS-8 - Base                                 5.0 kB/s | 3.9 kB     00:00
CentOS-8 - Extras                               162  B/s | 1.5 kB     00:09
Apache Cassandra                                2.1 kB/s | 3.6 kB     00:01
Metadata cache created.

The Apache Cassandra 3.11 yum repository has been added to your Linux server.

Install Cassandra on CentOS 8:

Apache Cassandra requires a JVM (Java Virtual Machine) to run. Although you can install Java explicitly on your Linux server, installing Cassandra with the dnf command automatically installs all required dependencies, including Java.

Therefore, install Cassandra directly on your CentOS 8 server by using the dnf command.

# dnf install -y cassandra

cqlsh (Cassandra Query Language Shell) requires Python to run. Therefore, you also need to install Python.

The cqlsh shipped with Apache Cassandra 3.11 is only compatible with Python 2.7. Therefore, install that version on your Linux server.

# dnf install -y python2

The Cassandra service is based on a SysV init script. Therefore, use the legacy commands to start it and enable it at boot.

# service cassandra start
Starting cassandra (via systemctl):                        [  OK  ]
# chkconfig cassandra on

Verify the status of cassandra.service.

# systemctl status cassandra.service
● cassandra.service - LSB: distributed storage system for structured data
   Loaded: loaded (/etc/rc.d/init.d/cassandra; generated)
   Active: active (running) since Sat 2020-08-01 11:18:50 PKT; 51s ago
     Docs: man:systemd-sysv-generator(8)
 Main PID: 48050 (java)
    Tasks: 50 (limit: 12331)
   Memory: 1.1G
   CGroup: /system.slice/cassandra.service
           └─48050 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.262.b10-0.el8_2.x86_64>

Aug 01 11:18:46 cassandra-01.centlinux.com systemd[1]: Starting LSB: distribute>
Aug 01 11:18:46 cassandra-01.centlinux.com runuser[47978]: pam_unix(runuser:ses>
Aug 01 11:18:50 cassandra-01.centlinux.com cassandra[47966]: Starting Cassandra>
Aug 01 11:18:50 cassandra-01.centlinux.com systemd[1]: Started LSB: distributed>

Use the nodetool command to verify the status of your Cassandra cluster.

# nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens       Owns (effective)  Host ID                               Rack
UN  127.0.0.1  70.01 KiB  256          100.0%            7d916cdb-8065-42d0-97c0-c88c68b68aa3  rack1
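In the output above, the UN prefix means the node is Up and in the Normal state. If you want to check this from a script, a minimal sketch is to count the lines whose first column is UN. The helper name count_un_nodes is hypothetical; it only reads nodetool status output from standard input.

```shell
# Sketch: count nodes that nodetool reports as Up/Normal (UN).
# count_un_nodes is a hypothetical helper; it reads "nodetool status"
# output on stdin and prints the number of lines starting with UN.
count_un_nodes() {
    awk '$1 == "UN" { n++ } END { print n + 0 }'
}

# Example:
#   nodetool status | count_un_nodes
```

For the single-node installation in this guide, the expected count is 1.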

Apache Cassandra has been installed on your Linux server.

Configure Apache Cassandra Security:

The configuration files for Apache Cassandra are located in the /etc/cassandra/conf directory.

It is good practice to back up the original configuration file before modifying it. Therefore, create a copy of the original cassandra.yaml configuration file as follows.

# cd /etc/cassandra/conf/
# cp cassandra.yaml cassandra.yaml.bkp

Now, edit this file by using the vi text editor.

# vi /etc/cassandra/conf/cassandra.yaml

Locate the following parameters in this file.

authenticator: AllowAllAuthenticator
authorizer: AllowAllAuthorizer
roles_validity_in_ms: 2000
permissions_validity_in_ms: 2000

And update their values as follows.

authenticator: org.apache.cassandra.auth.PasswordAuthenticator
authorizer: org.apache.cassandra.auth.CassandraAuthorizer
roles_validity_in_ms: 0
permissions_validity_in_ms: 0
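The same four changes can also be applied with sed instead of an editor. The following is a sketch under a few assumptions: enable_cassandra_auth is a hypothetical helper name, each of the four parameters starts at the beginning of its line (as in the listing above), and the backup is stored under the same .bkp name used earlier in this guide.

```shell
# Sketch: switch cassandra.yaml to password authentication with sed.
# enable_cassandra_auth is a hypothetical helper; pass the path to cassandra.yaml.
enable_cassandra_auth() {
    conf="$1"
    # Keep an existing backup untouched so that the original file is never lost.
    [ -e "$conf.bkp" ] || cp "$conf" "$conf.bkp" || return 1
    # Read from the backup and rewrite the live file in place.
    sed \
        -e 's|^authenticator:.*|authenticator: org.apache.cassandra.auth.PasswordAuthenticator|' \
        -e 's|^authorizer:.*|authorizer: org.apache.cassandra.auth.CassandraAuthorizer|' \
        -e 's|^roles_validity_in_ms:.*|roles_validity_in_ms: 0|' \
        -e 's|^permissions_validity_in_ms:.*|permissions_validity_in_ms: 0|' \
        "$conf.bkp" > "$conf"
}

# Example (run as root):
#   enable_cassandra_auth /etc/cassandra/conf/cassandra.yaml
```

Because the function always reads from the backup copy, running it a second time produces the same result. Compare the two files with diff afterwards to confirm that only the four intended lines changed.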

Restart the Cassandra service for the changes to take effect.

# systemctl restart cassandra.service
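After a restart, Cassandra may need a little while before it accepts client connections, so a cqlsh login attempted immediately can fail. A small retry loop is one way to wait for it. This is a sketch: wait_for is a hypothetical helper, and the attempt count and delay in the example are arbitrary values, not recommendations.

```shell
# Sketch: retry a command until it succeeds or the attempts run out.
# wait_for is a hypothetical helper.
# Usage: wait_for <attempts> <delay_seconds> <command> [args...]
wait_for() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Return success as soon as the command exits with status 0.
        "$@" >/dev/null 2>&1 && return 0
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example: try a cqlsh login up to 30 times, 2 seconds apart.
#   wait_for 30 2 cqlsh -u cassandra -p cassandra -e 'SHOW VERSION'
```

The function returns 0 once the command succeeds and 1 if every attempt fails, so it can be used in an if statement before continuing with the next steps.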

Create a Database Admin user:

Connect to the cqlsh prompt by using the default Cassandra username and password.

# cqlsh -u cassandra -p cassandra
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.7 | CQL spec 3.4.4 | Native protocol v4]

Use HELP for help. 
cassandra@cqlsh> 

Create a database admin user by executing the following command.

cassandra@cqlsh> CREATE ROLE ahmer WITH PASSWORD = 'Ahmer@1234' AND SUPERUSER = true AND LOGIN = true;

Exit from cqlsh prompt.

cassandra@cqlsh> exit
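The role can also be created without an interactive session, because cqlsh accepts a statement through its -e option. The sketch below builds the same CREATE ROLE statement from a role name and a password. The helper name create_admin_cql is hypothetical, and the quote handling assumes the usual CQL rule that a single quote inside a string literal is written as two single quotes.

```shell
# Sketch: print the CREATE ROLE statement for a superuser role.
# create_admin_cql is a hypothetical helper.
# Usage: create_admin_cql <role> <password>
create_admin_cql() {
    role="$1"
    # Double any single quote so that the password stays a valid CQL string literal.
    password=$(printf '%s' "$2" | sed "s/'/''/g")
    printf "CREATE ROLE %s WITH PASSWORD = '%s' AND SUPERUSER = true AND LOGIN = true;\n" \
        "$role" "$password"
}

# Example:
#   cqlsh -u cassandra -p cassandra -e "$(create_admin_cql ahmer 'Ahmer@1234')"
```

Keep in mind that a password passed on the command line can end up in your shell history, so this approach suits lab machines better than shared servers.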

Connect to cqlsh again as the new admin user.

# cqlsh -u ahmer -p Ahmer@1234
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.7 | CQL spec 3.4.4 | Native protocol v4]

Use HELP for help. 
ahmer@cqlsh> 

For better security, it is always advisable to remove or disable the default users. Therefore, revoke the superuser role and login permission from the cassandra user.

ahmer@cqlsh> ALTER ROLE cassandra WITH PASSWORD = 'cassandra' AND SUPERUSER = false AND LOGIN = false;

Revoke all permissions from the cassandra user.

ahmer@cqlsh> REVOKE ALL PERMISSIONS ON ALL KEYSPACES FROM cassandra;

Grant all permissions to the new admin user.

ahmer@cqlsh> GRANT ALL PERMISSIONS ON ALL KEYSPACES TO ahmer;

Exit from cqlsh prompt.

ahmer@cqlsh> exit

Apache Cassandra has been configured. It is now ready to become part of a Cassandra cluster.

Conclusion – Install Cassandra on CentOS 8:

In this guide, you have learned how to install Cassandra on CentOS 8 and applied the recommended security configuration. Cassandra: The Definitive Guide: Distributed Data at Web Scale, 2nd Edition by Jeff Carpenter is a very good book, and we strongly recommend reading it.
