Cassandra Performing Queries Efficiently
Look into how Cassandra performs queries efficiently.
In Cassandra, performing a query that does not use the primary key is inherently inefficient, because it requires a full table scan that queries all the nodes in the cluster.
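To make the problem concrete, here is a minimal sketch using a hypothetical users table partitioned by user_id. A query that filters on a non-key column such as country has to be allowed explicitly, precisely because it forces every node to scan its local data:

```
-- Hypothetical table, partitioned by user_id.
CREATE TABLE users (
    user_id uuid PRIMARY KEY,
    name    text,
    country text
);

-- Efficient: the partition key routes the query to the replicas that own it.
SELECT * FROM users WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

-- Inefficient: no partition key, so every node must scan its local data.
-- Cassandra refuses to run this unless ALLOW FILTERING is added explicitly.
SELECT * FROM users WHERE country = 'DE' ALLOW FILTERING;
```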
Methods to perform queries efficiently
Two alternatives can be used to solve this problem:
- Secondary indexes
- Materialized views
Secondary indexes
A secondary index can be defined on one or more columns of a table. Each node then indexes its part of the table locally using the specified columns. A query that filters on these columns still has to contact all the nodes in the cluster, but each node can use its local index to retrieve the matching rows without scanning all of its data.
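As a sketch of how this looks in CQL, reusing the hypothetical users table from above, a secondary index on country lets the earlier query run without ALLOW FILTERING:

```
-- Each node builds a local index over the country column of its data.
CREATE INDEX IF NOT EXISTS users_by_country_idx ON users (country);

-- The query still contacts all nodes, but each node answers from its
-- local index instead of scanning its entire copy of the table.
SELECT * FROM users WHERE country = 'DE';
```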
Materialized views
A materialized view is defined as a query on an existing table with a newly defined partition key. The view is maintained as a separate table, and any changes to the original table are eventually propagated to it.
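For illustration, the hypothetical users table could be repartitioned by country through a materialized view. CQL requires every primary key column of the view to be restricted with IS NOT NULL, and recent Cassandra versions require materialized views to be explicitly enabled in the cluster configuration:

```
-- The same data as in users, repartitioned by country.
CREATE MATERIALIZED VIEW IF NOT EXISTS users_by_country AS
    SELECT * FROM users
    WHERE country IS NOT NULL AND user_id IS NOT NULL
    PRIMARY KEY (country, user_id);

-- Reads only the replicas that own the 'DE' partition of the view.
SELECT * FROM users_by_country WHERE country = 'DE';
```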
Trade-offs with secondary indexes and materialized views
These two approaches are subject to the following trade-offs:
- Secondary indexes are more suitable for low cardinality columns, while materialized views are more suitable for high cardinality columns: an index on a high cardinality column forces a query to contact every node for only a handful of matching rows, while a materialized view is stored as a regular table, so a low cardinality partition key would concentrate its data in a few very large partitions.
- Materialized views are more efficient during read operations than secondary indexes, because only the nodes that contain the corresponding partition are queried.
- Secondary indexes are guaranteed to be strongly consistent, while materialized views are eventually consistent.
Denormalizing data for efficiency
Cassandra does not provide join operations, since they would be inefficient given how data is distributed across the cluster. Instead, users are encouraged to denormalize their data by potentially including the same data in multiple tables, so that each query can be answered efficiently by reading from a minimum number of nodes. This means that any update to this data needs to update multiple tables, but writes in Cassandra are cheap, so this is expected to be quite efficient.
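As a sketch of this approach with hypothetical tables, the same order data can be written to two tables, each partitioned by the key used in a different query:

```
-- The same order data is duplicated under two different partition keys.
CREATE TABLE orders_by_user (
    user_id  uuid,
    order_id timeuuid,
    amount   decimal,
    PRIMARY KEY (user_id, order_id)
);

CREATE TABLE orders_by_product (
    product_id uuid,
    order_id   timeuuid,
    user_id    uuid,
    amount     decimal,
    PRIMARY KEY (product_id, order_id)
);

-- Each query hits a single partition, i.e., only the few replicas that own it.
-- (? marks a prepared-statement placeholder.)
SELECT * FROM orders_by_user    WHERE user_id = ?;
SELECT * FROM orders_by_product WHERE product_id = ?;
```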
Updating multiple tables
Cassandra provides two flavors of batch operations that can update multiple partitions and tables: logged and unlogged batches.
Logged and unlogged batches
Logged batches provide the additional guarantee of atomicity: either all of the statements in the batch take effect, or none of them do. This can help keep all the tables that share denormalized data consistent with each other. However, this is achieved by first writing the batch as a single unit to a replicated system table (the batch log) and only then performing the operations, which makes logged batches less efficient than unlogged ones.
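For instance, a logged batch can keep the two hypothetical order tables from the earlier sketch in sync; dropping the batch log via UNLOGGED trades the atomicity guarantee for lower overhead:

```
-- Logged batch (the default): either both tables are updated, or neither is.
BEGIN BATCH
    INSERT INTO orders_by_user    (user_id, order_id, amount)
        VALUES (?, ?, ?);
    INSERT INTO orders_by_product (product_id, order_id, user_id, amount)
        VALUES (?, ?, ?, ?);
APPLY BATCH;

-- Unlogged batch: same statements, lower overhead, no atomicity guarantee.
BEGIN UNLOGGED BATCH
    INSERT INTO orders_by_user    (user_id, order_id, amount)
        VALUES (?, ?, ?);
    INSERT INTO orders_by_product (product_id, order_id, user_id, amount)
        VALUES (?, ?, ?, ?);
APPLY BATCH;
```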
Note: Neither logged nor unlogged batches provide isolation, so concurrent requests might temporarily observe the effects of only some of the operations in a batch.