How does Cassandra read specific rows with partition key and clustering keys?

Question

I was reading "How is data read" at http://docs.datastax.com/en/cassandra/3.0/cassandra/dml/dmlAboutReads.html. It mentions "Within a partition, all rows are not equally expensive to query. The very beginning of the partition (the first rows, clustered by your key definition) is slightly less expensive to query because there is no need to consult the partition-level index."

So what does Cassandra do after the partition is located to read a specific row or a few specific rows? Is it a simple iteration over all rows, or is there a more efficient way to find the offset of a specific row?

Shlomi Livne · Accepted Answer

Cassandra has a notion of a "promoted index", which is used for large partitions with many rows (index file format).

When a specific row is looked up in a partition with many rows, the promoted index in the index file is used to find the portion of the data file that holds the range of rows this row belongs to.
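To make the lookup concrete, here is a minimal sketch of the idea in Python. It is a simplified model, not Cassandra's actual on-disk format or code: `IndexBlock`, `find_block`, and the sample offsets are hypothetical names and values chosen for illustration. The assumption is that the promoted index is a sorted list of entries, each describing one block of rows by its first and last clustering key plus its position in the data file, so a binary search can pick the block to read instead of scanning the whole partition.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class IndexBlock:
    """Hypothetical promoted-index entry describing one block of rows."""
    first_key: Tuple  # smallest clustering key stored in the block
    last_key: Tuple   # largest clustering key stored in the block
    offset: int       # byte offset of the block within the partition
    width: int        # length of the block in bytes


def find_block(index: List[IndexBlock], clustering_key: Tuple) -> Optional[IndexBlock]:
    """Return the block that may contain clustering_key, or None."""
    # Entries are sorted by clustering key, so binary-search on last_key
    # for the first block whose last key is >= the key we want.
    last_keys = [block.last_key for block in index]
    i = bisect_left(last_keys, clustering_key)
    if i == len(index):
        return None  # key sorts after every block in the partition
    block = index[i]
    if clustering_key < block.first_key:
        return None  # key falls between two blocks, so the row is absent
    return block


# Illustrative index for one partition (made-up keys and offsets).
example_index = [
    IndexBlock((1,), (10,), 0, 65536),
    IndexBlock((11,), (20,), 65536, 65536),
    IndexBlock((25,), (30,), 131072, 4096),
]

block = find_block(example_index, (15,))
if block is not None:
    print(f"read {block.width} bytes at offset {block.offset}")
```

Only the selected block then has to be read and scanned for the row. This also fits the quoted remark about the first rows of a partition: they sit at the start of the partition data, so in this model no index search is needed to reach them.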

Cassandra 3.6 improved the promoted index format to allow better search (new promoted index format).
