Yossi Shasha

Reputation: 203

Cassandra simple primary key queries

We would like to create a Cassandra table with a simple primary key consisting of a UUID column. The table will look like this:
CREATE TABLE simple_table (
    id   UUID PRIMARY KEY,
    col1 text,
    col2 text,
    col3 UUID
);

This table will potentially store a few billion rows, and the rows should expire after some time (a few months) using the TTL feature (a sketch of the intended setup follows the questions below). I have a few questions regarding the efficiency of this table:

  1. What is the efficiency of a query against this table using the primary key? That is, how does Cassandra find a specific row after resolving which partition it resides in?
  2. Considering that the rows will expire and create many tombstones, how will this affect reads and writes to this table? Let's say we expire the data after 180 days; if I am not mistaken, the ratio of tombstones would be 10/180 ≈ 0.056 (where 10 is gc_grace_seconds expressed in days).
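
For illustration, this is roughly how I plan to set the expiration and how the table will be queried (the 180-day TTL of 15552000 seconds and the UUID literal are just placeholder values):

ALTER TABLE simple_table WITH default_time_to_live = 15552000;  -- ~180 days, placeholder value

-- every read is a point lookup by the partition key:
SELECT col1, col2, col3
FROM simple_table
WHERE id = 123e4567-e89b-12d3-a456-426614174000;  -- placeholder UUID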

Upvotes: 1

Views: 295

Answers (2)

Yossi Shasha

Reputation: 203

After reading the blog post (and its comments) that @Alex referred me to, I concluded that tombstones are created for rows that expire due to the table's default_time_to_live. Those tombstones will be cleaned up only after gc_grace_seconds has passed. See this Stack Overflow question.

Regarding my first question, this DataStax page describes it pretty well.
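
As a sketch of the two settings involved (the values below are only examples, not recommendations):

ALTER TABLE simple_table
WITH default_time_to_live = 15552000   -- rows expire after ~180 days and leave tombstones
AND gc_grace_seconds = 864000;         -- tombstones become eligible for removal only after 10 days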

Upvotes: 0

Alex Ott

Reputation: 87069

In your case the primary key is equal to the partition key, so you have so-called "skinny" partitions consisting of a single row. If you remove data, then instead of the data the partition will contain only a tombstone, and that's not a problem. If the data has expired, it is simply removed during compaction - gc_grace_seconds isn't applied here; it's required only when you explicitly delete data, because the tombstone needs to be kept so other nodes can "catch up" with the change if they weren't able to receive the delete operation. You can find more details about data deletion in the following document.
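
As a rough illustration of that difference (the UUID literal is a placeholder):

-- explicit delete: writes a tombstone that is kept for gc_grace_seconds,
-- so that replicas that missed the delete can still catch up
DELETE FROM simple_table WHERE id = 123e4567-e89b-12d3-a456-426614174000;

-- TTL'ed write: when it expires, it is simply dropped during compaction
INSERT INTO simple_table (id, col1) VALUES (uuid(), 'some value') USING TTL 15552000;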

Problems with tombstones arise when you have many (thousands of) rows inside the same partition, for example when you use one or more clustering keys. When such rows are deleted, tombstones are generated and have to be skipped when reading data inside the partition.
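
For example, a sketch of a table where this could become a problem (the table and column names are hypothetical, as is the UUID literal):

CREATE TABLE events_by_user (
    user_id    UUID,
    event_time timestamp,
    payload    text,
    PRIMARY KEY (user_id, event_time)   -- user_id = partition key, event_time = clustering key
);

-- deleting individual rows leaves tombstones inside the partition...
DELETE FROM events_by_user
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000
AND event_time = '2020-01-01 00:00:00';

-- ...and a later read of that partition has to skip all of them:
SELECT * FROM events_by_user
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;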

P.S. Have you seen this blog post that explains how deletions happen?

Upvotes: 1
