Advance Data Modelling
In ScyllaDB (and Apache Cassandra), data is divided into partitions, rows, and values, which can be identified by a partition key. Sometimes the application needs to find a value by the value of another column. Doing this efficiently without scanning all of the partitions requires indexing
There are three indexing options available in ScyllaDB:
Materialized Views
Global Secondary Indexes
Local Secondary Indexes
Materialized Views (MV) are a global index. When a new MV is declared, a new table is created and is distributed to the different nodes using the standard table distribution mechanisms. The new MV table can have a different primary key from the base table, allowing for faster searches on a different set of columns.
Global Secondary Indexes (also called “Secondary Indexes”) are another mechanism in ScyllaDB which allows efficient searches on non-partition keys by creating an index. Rather than creating an index on the entire partition key, this index is created on specific columns.
Local Secondary Indexes are an enhancement to Global Secondary Indexes, which allow ScyllaDB to optimize workloads where the partition key of the base table and the index are the same key. Like their global counterparts, Cassandra's local indexes are based on Materialized Views.
Materialized Views
Materialized Views (MV) are a global index. When a new MV is declared, a new table is created and distributed to the different nodes using the standard table distribution mechanisms. It’s scalable, just like normal tables. It is populated by a query running against the base table. It’s not possible to directly update a MV; it’s updated when the base table is updated.
Each Materialized View is a set of rows that correspond to rows present in the underlying, or base, table specified in the materialized view’s SELECT statement.
Reads from a Materialized View are just as fast as regular reads from a table and just as scalable. But as expected, updates to a table with Materialized Views are slower than regular updates since these updates need to update both the original table and the Materialized View and ensure the consistency of both updates. However, doing those in the application without server help would have been even slower.
Some common use cases for MV are Indexing with denormalization, different sort orders, and filtering (pre-computed queries).
Example :
Materialized Views and Indexes : https://university.scylladb.com/courses/data-modeling/lessons/materialized-views-secondary-indexes-and-filtering/topic/materialized-views-and-secondary-indexes-hands-on-updated/
Global Secondary Indexes
Secondary indexes are created for one main purpose: to allow querying by a column that is not a key. CQL tables have strict schemas that define which columns form a primary key, and fundamentally we should use these keys to extract data from the database. But, in practice, we may want to occasionally query by a different, regular column, or several of them. How to achieve that? One of the ways is to create an index on that column.
These indexes are implemented on top of materialized views. It implies that there’s a storage overhead for creating an index – it will use another table to store the data it needs. Indexes can be global or local, and let’s find out what that means.
local secondary indexes : https://university.scylladb.com/courses/data-modeling/lessons/materialized-views-secondary-indexes-and-filtering/topic/local-secondary-indexes-and-combining-both-types-of-indexes/
Last updated