An Overview of Cassandra Architecture

Apache Cassandra, a name synonymous with availability, reliability and scalability, is a NoSQL database with a distinctive architecture. Let's look more closely at what that architecture comprises.

Cassandra clusters 

Cassandra has a distributed architecture. Each running Cassandra instance is called a node. You can scale horizontally by adding nodes, and together the nodes form a cluster. The cluster is the outermost container of a Cassandra deployment: it holds all the nodes and, through them, all the data.

(Figure: nodes grouped into a cluster through horizontal scaling)

Column: the innermost layer of the Cassandra data model. Each column has a name (its heading) and holds the value of that attribute for a given entry.

Row: the next layer up. A row groups the columns that belong to one entry and is identified by its primary key.

Cassandra stores data in tables, and each table is organized into rows and columns.

Structure of the table in Cassandra 

-- Assumes the keyspace User_data has already been created.
USE User_data;

CREATE TABLE User_table (
    User_id int,
    User_name text,
    User_add text,
    User_phone text,
    PRIMARY KEY (User_id)
);

INSERT INTO User_table (User_id, User_name, User_add, User_phone)
VALUES (100, 'P', 'San Jose', '9899112453');

-- No phone number is given for this user, so User_phone is left unset.
INSERT INTO User_table (User_id, User_name, User_add)
VALUES (101, 'P Maz', 'Alameda');

(Figure: output of the statements above)

Keyspace: the outermost layer of storage. A keyspace contains the tables and defines how their data is replicated across the cluster.
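The nesting described so far (keyspace, table, row, column) can be pictured with plain Python dictionaries. This is only an illustrative sketch: the dictionary layout and the `read_column` helper are assumptions made for explanation, not how Cassandra lays data out on disk. The names reuse User_data and User_table from the example.

```python
# Illustrative only: keyspace -> table -> row (by primary key) -> columns.
keyspace = {
    "User_data": {                      # keyspace
        "User_table": {                 # table
            100: {"User_name": "P", "User_add": "San Jose", "User_phone": "9899112453"},
            101: {"User_name": "P Maz", "User_add": "Alameda"},  # User_phone not set
        }
    }
}


def read_column(ks, keyspace_name, table_name, primary_key, column_name):
    """Hypothetical helper: return a column value, or None if the row does not set it."""
    row = ks[keyspace_name][table_name][primary_key]
    return row.get(column_name)


print(read_column(keyspace, "User_data", "User_table", 100, "User_add"))
print(read_column(keyspace, "User_data", "User_table", 101, "User_phone"))
```

A row only carries the columns that were actually written for it, which is why the second lookup finds no phone number.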

The cluster is the entry point for data into Cassandra: a client can send a request to any node in it.

Each node in Cassandra can perform all database operations and handle client requests on its own, without a master node. Because of this distributed design, the cluster has no single point of failure.

One or more nodes in the cluster serve as replicas for a given set of data. By replicating data across nodes, Cassandra ensures that losing one node does not make that data unavailable.

The nodes in a cluster communicate with each other through the Gossip protocol. Gossip keeps each node informed of the status of the other nodes. Every second, each node gossips with up to three other nodes in the cluster.
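To see why exchanging state with only a few peers is enough, here is a minimal simulation of gossip rounds. It is a sketch under simplifying assumptions, not Cassandra's implementation: each node's state is reduced to a single heartbeat counter, and the function name `gossip_round`, the `fanout` parameter, and the six node names are all hypothetical.

```python
import random


def gossip_round(states, fanout=3, rng=random):
    """One gossip round: every node syncs with up to `fanout` random peers.

    `states` maps node name -> {node name: heartbeat version}, i.e. each
    node's current view of the cluster.
    """
    nodes = list(states)
    for node in nodes:
        states[node][node] += 1                      # bump own heartbeat
        peers = [n for n in nodes if n != node]
        for peer in rng.sample(peers, min(fanout, len(peers))):
            # Both sides keep the highest version seen for every known node.
            merged = {
                n: max(states[node].get(n, 0), states[peer].get(n, 0))
                for n in set(states[node]) | set(states[peer])
            }
            states[node] = dict(merged)
            states[peer] = dict(merged)
    return states


# Each node starts out knowing only about itself.
names = ["n1", "n2", "n3", "n4", "n5", "n6"]
states = {name: {name: 0} for name in names}

rounds = 0
# The cap of 100 rounds is only a safety net for the simulation.
while rounds < 100 and any(len(view) < len(names) for view in states.values()):
    gossip_round(states, rng=random.Random(rounds))
    rounds += 1
print(f"all nodes know about each other after {rounds} round(s)")
```

Because every exchange merges two views, knowledge about a node spreads to the whole cluster in a handful of rounds even though no node talks to everyone.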

A cluster in Cassandra is divided into racks (groups of physical servers that share resources such as a network switch and a power source) and data centers. The snitch configuration is used to group nodes into racks and data centers. Data replication depends on how the racks and data centers are configured.

Storage in Cassandra 

Commit log: Cassandra's crash-recovery mechanism. Every write operation is first appended to the commit log, so writes that had not yet reached a disk file can be replayed if a node fails.

Memtable: after a write is recorded in the commit log, the data is written to the memtable, an in-memory structure.

SSTable: when the contents of the memtable reach a threshold, the data is flushed to a disk file called an SSTable (Sorted String Table).
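The three storage components above can be sketched as a toy write path. This is a simplified model for illustration, not Cassandra's storage engine: the class name `TinyWritePath`, the row-count `flush_threshold`, and the `recover` and `read` methods are assumptions, and real commit-log and flush handling is considerably more involved.

```python
class TinyWritePath:
    """Toy model of the write path: commit log -> memtable -> SSTable."""

    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold  # hypothetical: max rows held in memory
        self.commit_log = []   # append-only record of writes not yet flushed
        self.memtable = {}     # in-memory, latest value per key
        self.sstables = []     # immutable, key-sorted snapshots standing in for disk files

    def write(self, key, value):
        self.commit_log.append((key, value))    # 1. record the write for recovery
        self.memtable[key] = value              # 2. then update the in-memory table
        if len(self.memtable) >= self.flush_threshold:
            self.flush()                        # 3. flush once the threshold is reached

    def flush(self):
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()  # flushed writes no longer need replaying

    def recover(self):
        """After a crash the memtable is lost; rebuild it from the commit log."""
        self.memtable = dict(self.commit_log)

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):   # newest SSTable first
            for k, v in table:
                if k == key:
                    return v
        return None


db = TinyWritePath(flush_threshold=3)
for user_id, name in [(100, "P"), (101, "P Maz"), (102, "Q"), (103, "R")]:
    db.write(user_id, name)
print(len(db.sstables), db.memtable)
```

In this run the first three writes fill the memtable and are flushed to one SSTable, while the fourth write exists only in the commit log and the memtable. If the memtable were lost at that point, `recover()` would rebuild it from the commit log.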

Data Replication in Cassandra 

Every row in a table is replicated on M nodes, where M is the replication factor configured for the keyspace.
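One simple way to choose those M nodes is to place the nodes on a hash ring and walk clockwise from the key's position. The sketch below assumes that scheme and ignores racks and data centers. The function names, the node names, and the use of MD5 as the hash are illustrative assumptions, not Cassandra's actual partitioner.

```python
import hashlib
from bisect import bisect_left


def token(value):
    """Hash a value to an integer position on the ring (MD5 is a stand-in hash)."""
    return int(hashlib.md5(str(value).encode()).hexdigest(), 16)


def replicas_for(key, nodes, replication_factor):
    """Return the nodes holding `key`: the owner plus the next nodes clockwise."""
    ring = sorted((token(node), node) for node in nodes)
    tokens = [t for t, _ in ring]
    # The owner is the first node whose token is >= the key's token, wrapping around.
    start = bisect_left(tokens, token(key)) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


nodes = ["node-a", "node-b", "node-c", "node-d", "node-e"]
owners = replicas_for(100, nodes, replication_factor=3)
print(owners)
```

With a replication factor of 3, the row for User_id 100 lands on three distinct nodes, so it stays readable if one of them goes down.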

I hope that I have been able to outline the architecture of Apache Cassandra.  

Comment or write to us. 
