NoSQL databases are non-relational databases. NoSQL is an approach to database design that stores and retrieves data in formats other than the tables used by relational databases. NoSQL databases come in several types, distinguished by their data model. They are generally used for big data and for systems that ingest very large volumes of data. A NoSQL database typically keeps related data together in a single data structure. Most are designed to be distributed: data is stored across multiple servers, which keeps the data available in real time and lets the database continue operating even when some of the data is offline.
There are several types of NoSQL databases, each organizing data in a different structure:
- Key-value store: Every data item has a unique identifier, the key, which maps to a value. The value may itself be an array of data. Ex: Redis
- Document store: Data is stored as documents, and each document is associated with a key. Documents are usually stored in JSON, XML, or BSON format, and each one can contain key-value pairs, arrays, and nested documents. Ex: MongoDB
- Wide-column store: Data is stored in tables, rows, and columns, but unlike in a relational database, the names and format of the columns can vary from row to row. Ex: Apache Cassandra and HBase
- Graph database: Nodes (vertices) store data items or entities, and edges store the relationships between them. Ex: Neo4j
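To make the four data models above more concrete, here is a minimal sketch that mimics each one with plain Python structures. It is only an illustration: the variable names, the sample values, and the `neighbours` helper are all hypothetical, and none of it reflects how Redis, MongoDB, Cassandra, or Neo4j store data internally.

```python
# Toy in-memory stand-ins for the four NoSQL data models.
# All names and sample values are hypothetical.

# Key-value: one unique key maps to one value (here, an array of data).
key_value = {"session:42": ["item-1", "item-2"]}

# Document: a key maps to a self-describing document that may be nested.
documents = {
    "user:1": {"name": "Asha", "address": {"city": "Pune"}, "tags": ["admin"]},
}

# Wide column: each row key maps to its own set of named columns,
# so two rows in the same table may carry different columns.
wide_column = {
    "row-1": {"name": "Asha", "email": "asha@example.com"},
    "row-2": {"name": "Ben", "phone": "555-0100", "city": "Leeds"},
}

# Graph: nodes hold entities, edges hold the relationships between them.
nodes = {"asha": {"type": "person"}, "ben": {"type": "person"}}
edges = [("asha", "FOLLOWS", "ben")]


def neighbours(node, relation):
    """Return the nodes reachable from `node` over edges labelled `relation`."""
    return [dst for src, rel, dst in edges if src == node and rel == relation]
```

For example, `neighbours("asha", "FOLLOWS")` walks the edge list and returns the nodes that `asha` follows, which is the kind of relationship lookup a graph database is built around.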
Major NoSQL Databases
Apache Cassandra is a NoSQL database designed for continuous availability, with no single point of failure. It offers linear scalability and high performance, and it can manage petabytes of information and thousands of concurrent operations per second. Cassandra is queried through the Cassandra Query Language (CQL), which closely resembles SQL, so developers can become familiar with it quickly.
Performance is high because, unlike in traditional databases, every node in a Cassandra cluster can perform both read and write operations. For the same reason, replicating data across hybrid cloud platforms is straightforward. If a node fails, users are routed to a nearby healthy node. Cassandra scales simply by adding more nodes to the cluster. It is a fault-tolerant, wide-column database whose distributed design is based on Amazon's Dynamo and whose data model is based on Google's Bigtable.
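The idea that every node can serve reads and writes, and that a request falls through to a healthy node when one fails, can be sketched with a toy cluster. This is a simplified illustration, not Cassandra's actual implementation: the `ToyCluster` class, its method names, the replication factor of 2, and the modulo-based placement are all assumptions made for the example (a real cluster uses token-based consistent hashing and tunable consistency levels).

```python
import hashlib


class ToyCluster:
    """A toy masterless cluster: any replica can serve a read or a write."""

    def __init__(self, node_names, replication_factor=2):
        self.ring = sorted(node_names)                    # fixed node order
        self.storage = {name: {} for name in self.ring}   # per-node data
        self.alive = {name: True for name in self.ring}   # node health flags
        self.rf = replication_factor

    def _replicas(self, key):
        # Hash the key to a ring position, then take the next `rf` nodes.
        digest = hashlib.sha256(key.encode()).hexdigest()
        start = int(digest, 16) % len(self.ring)
        return [self.ring[(start + i) % len(self.ring)] for i in range(self.rf)]

    def write(self, key, value):
        # Store the value on every live replica responsible for this key.
        written = [n for n in self._replicas(key) if self.alive[n]]
        for name in written:
            self.storage[name][key] = value
        return written

    def read(self, key):
        # Serve the read from the first live replica that holds the key.
        for name in self._replicas(key):
            if self.alive[name] and key in self.storage[name]:
                return self.storage[name][key]
        raise KeyError(key)


cluster = ToyCluster(["node-a", "node-b", "node-c"])
replicas = cluster.write("user:1", "Asha")
cluster.alive[replicas[0]] = False   # simulate one replica failing
print(cluster.read("user:1"))        # served by the surviving replica
```

Because the value was written to two replicas, the read still succeeds after one of them is marked as failed, which is the behaviour the paragraph above describes.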
MongoDB is a document-based NoSQL database built on the concepts of collections and documents. It does not enforce a fixed schema: one collection can hold many documents, and the number of fields and the size of each document can vary from one document to the next. This flexible schema makes it adaptable. It can scale both vertically and horizontally. MongoDB can store nested data within documents, so complex relationships between data can be modeled and stored in a single document, which makes the data easy to work with and to fetch. MongoDB reduces the need for a separate, complex data pipeline by offering built-in support for ETL-style (Extract, Transform, Load) processing through its aggregation framework. It also supports map-reduce operations for building data pipelines, although recent versions deprecate map-reduce in favor of the aggregation pipeline.
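The flexible-schema and nested-document ideas can be illustrated without a running database. The sketch below models a collection as a plain Python list of dictionaries; the `orders` data and the `find` helper are hypothetical and only imitate, in a very reduced form, the kind of dotted-path lookup a document database offers. It does not use MongoDB's driver or query language.

```python
# A "collection" modelled as a list of dict documents (hypothetical data).
# The two documents deliberately have different fields.
orders = [
    {
        "_id": 1,
        "customer": {"name": "Asha", "city": "Pune"},
        "items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
    },
    {"_id": 2, "customer": {"name": "Ben"}, "coupon": "WELCOME10"},
]

_MISSING = object()  # sentinel for "this path does not exist in the document"


def find(collection, path, value):
    """Return documents whose dotted `path` (e.g. 'customer.city') equals `value`."""
    matches = []
    for doc in collection:
        current = doc
        for part in path.split("."):
            if isinstance(current, dict) and part in current:
                current = current[part]
            else:
                current = _MISSING
                break
        if current is not _MISSING and current == value:
            matches.append(doc)
    return matches
```

Calling `find(orders, "customer.city", "Pune")` returns only the first order, while the second order, which has no `city` field at all, is simply skipped rather than causing an error. That tolerance for missing fields is what lets documents in one collection differ in shape.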