Cassandra


Cassandra production scenarios/issues

Production issue: when we tried to run a SELECT query with 8 lakh (800,000) IDs in an IN condition, we faced the issue below. To solve the above exception, we used distributed calls in the Java client, as shown below,
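The idea behind distributed calls can be sketched roughly as follows: split the ID list into small chunks and query each chunk concurrently, instead of sending one huge IN list. This is a minimal sketch, not the code from the original post. `BatchedInQuery`, `partition`, `fetchAll`, and the `fetchBatch` callback are hypothetical names, and the batch size of 1,000 and the pool of 8 threads are arbitrary assumptions. A stand-in lambda replaces the actual driver call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.stream.Collectors;

final class BatchedInQuery {

    /** Splits ids into consecutive chunks of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            batches.add(new ArrayList<>(ids.subList(start, end)));
        }
        return batches;
    }

    /**
     * Runs fetchBatch once per chunk on a fixed-size pool and merges the rows
     * in chunk order. fetchBatch is a hypothetical callback; in a real client
     * it would execute a small "SELECT ... WHERE id IN (...)" for one chunk.
     */
    static <T, R> List<R> fetchAll(List<T> ids, int batchSize, int parallelism,
                                   Function<List<T>, List<R>> fetchBatch) {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            List<CompletableFuture<List<R>>> futures = new ArrayList<>();
            for (List<T> batch : partition(ids, batchSize)) {
                futures.add(CompletableFuture.supplyAsync(() -> fetchBatch.apply(batch), pool));
            }
            List<R> rows = new ArrayList<>();
            for (CompletableFuture<List<R>> future : futures) {
                // join() rethrows a failed batch as a CompletionException
                rows.addAll(future.join());
            }
            return rows;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < 800_000; i++) {
            ids.add(i);
        }
        // Stand-in for the database call: echoes each id back as a "row".
        List<String> rows = fetchAll(ids, 1_000, 8,
                batch -> batch.stream().map(id -> "row-" + id).collect(Collectors.toList()));
        System.out.println(rows.size());
    }
}
```

Keeping each IN list small limits the work a single coordinator has to do per request, and the pool size caps how many requests are in flight at once.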

A few production configurations in Cassandra. RetryPolicy: there are three scenarios for which you can control the retry policy. Read timeout: when a coordinator received the request and sent the read to the replica(s), but the replica(s) […]


Cassandra query language (CQL) and Cassandra Java Client Example

Cassandra Table structure/Terminology

Before learning the CQL commands, we need to know the terminology used in Cassandra.

RDBMS           Cassandra terminology
Database        Keyspace
Table           Column Family
Primary key     Row Key
Column name     Column name
Column value    Column value

CQL Commands

Creating a keyspace

Use the keyspace (subsequent commands will run against that keyspace)

Note: keyspaces are equivalent to a database/schema in an RDBMS.

Get the list of keyspaces

Create table

Get list […]
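The commands listed above might look roughly like the CQL strings below, written as Java constants so they could be passed to a client session. This is only an illustrative sketch: the keyspace name `demo_ks`, the table `users`, its columns, and the replication settings are hypothetical, and the keyspace listing query assumes Cassandra 3.0 or later, where the `system_schema.keyspaces` table exists. Executing the statements requires a driver session, which is not shown here.

```java
final class CqlExamples {
    // Hypothetical keyspace and table names, used only for illustration.
    static final String KEYSPACE = "demo_ks";
    static final String TABLE = "users";

    /** Creating a keyspace (single-node replication settings, an assumption). */
    static String createKeyspace() {
        return "CREATE KEYSPACE IF NOT EXISTS " + KEYSPACE
                + " WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}";
    }

    /** Use the keyspace for subsequent statements. */
    static String useKeyspace() {
        return "USE " + KEYSPACE;
    }

    /** Get the list of keyspaces (assumes Cassandra 3.0+ system tables). */
    static String listKeyspaces() {
        return "SELECT keyspace_name FROM system_schema.keyspaces";
    }

    /** Create a table; user_id plays the role of the row key. */
    static String createTable() {
        return "CREATE TABLE IF NOT EXISTS " + KEYSPACE + "." + TABLE
                + " (user_id int PRIMARY KEY, name text, email text)";
    }

    public static void main(String[] args) {
        System.out.println(createKeyspace() + ";");
        System.out.println(useKeyspace() + ";");
        System.out.println(listKeyspaces() + ";");
        System.out.println(createTable() + ";");
    }
}
```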


Cassandra write and read process

Storage engine

Cassandra uses a storage structure similar to a Log-Structured Merge Tree, unlike a typical relational database, which uses a B-Tree. Cassandra avoids reading before writing. Read-before-write, especially in a large distributed system, can produce stalls in read performance and other problems. Cassandra never re-writes or re-reads existing data, and never overwrites rows in place.

How is data written?

Different stages of the write process in Cassandra: Logging data […]
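To make the log-structured idea concrete, here is a toy in-memory model of such a write path: append to a log, buffer in a sorted in-memory table, and flush full buffers as immutable runs that are never modified afterwards. It is a simplified sketch under stated assumptions, not Cassandra's implementation. `ToyLsmStore` and its methods are hypothetical names, the flush threshold is arbitrary, and "newest run wins" stands in for the timestamp comparison a real system would use.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ToyLsmStore {
    // Append-only log, standing in for a durable on-disk log.
    private final List<String> commitLog = new ArrayList<>();
    // Sorted in-memory buffer that absorbs writes.
    private TreeMap<String, String> memtable = new TreeMap<>();
    // Immutable flushed runs, oldest first; never updated in place.
    private final List<Map<String, String>> sstables = new ArrayList<>();
    private final int flushThreshold;

    ToyLsmStore(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    /** A write only appends and buffers; it never reads existing data first. */
    void write(String key, String value) {
        commitLog.add(key + "=" + value);
        memtable.put(key, value);
        if (memtable.size() >= flushThreshold) {
            flush();
        }
    }

    /** Freezes the current buffer as a new immutable run and starts a fresh one. */
    void flush() {
        if (memtable.isEmpty()) {
            return;
        }
        sstables.add(Collections.unmodifiableMap(memtable));
        memtable = new TreeMap<>();
        commitLog.clear(); // the logged entries are now covered by the flushed run
    }

    /** Reads check the buffer first, then the runs from newest to oldest. */
    String read(String key) {
        String value = memtable.get(key);
        if (value != null) {
            return value;
        }
        for (int i = sstables.size() - 1; i >= 0; i--) {
            value = sstables.get(i).get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    int sstableCount() {
        return sstables.size();
    }
}
```

Note that an update to an existing key simply lands in a newer buffer or run; the older copy stays where it is until something like compaction would merge the runs, which this sketch leaves out.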


Cassandra Architecture

Cassandra is designed so that there is no single point of failure (SPOF). There is no master-slave architecture in Cassandra. It addresses the problem of SPOF by employing a peer-to-peer distributed system across homogeneous nodes, where data is distributed among all nodes in the cluster. All nodes in Cassandra are the same; none acts as a master or a slave. Each node frequently exchanges state information about itself […]


CAP Theorem

What is the CAP theorem? CAP says that when choosing a database (including a distributed database), we can have only two of the three properties, so we have to pick them based on our requirements. Consistency: whenever you read a record (or data), consistency guarantees that you get the same data no matter how many times you read it. Simply put, each server returns the right response to each request, thus the system will be always […]


Cassandra Interview Cheat Sheet

Cassandra

Cassandra is a distributed database from Apache that is highly scalable and designed to manage very large amounts of structured data. It provides high availability with no single point of failure.

NoSQL

The primary objective of a NoSQL database is to have simplicity of design, horizontal scaling, and finer control over availability. These databases are schema-free, support easy replication, have a simple API, are eventually consistent, and can handle huge amounts […]


Cassandra Overview

Cassandra is another NoSQL database. Similar to HBase, it is a distributed column-oriented database that handles big data workloads across multiple nodes, but it can support both the local file system and HDFS, whereas in HBase the underlying file system is HDFS. It overcomes the single point of failure by using a peer-to-peer distributed system across homogeneous nodes, where data is distributed among all nodes in the cluster. Each […]


Review Comments

I am a PL/SQL developer, interested in moving into big data.

Neetika Singh ITA
