Reputation: 1276
I'm trying to investigate the performance impact of using Cassandra lists. According to my experiments, Cassandra generates a tombstone when inserting into a non-frozen list or when overwriting it (i.e. a non-incremental update). However, according to the cqlsh trace output the tombstones are not read, so they shouldn't have any performance impact ... ?
CREATE TABLE tomb_test (id text PRIMARY KEY, events list<text>);
INSERT INTO tomb_test (id, events) VALUES ('1', ['A', 'B']);
bin$ nodetool flush
-- you can see there is a "marked_deleted" tombstone for the events list
sstabledump node1/data0/spark/test-ef990510057b11e98254712032ed3bea/mc-1-big-Data.db
[
{
"partition" : {
"key" : [ "1" ],
"position" : 0
},
"rows" : [
{
"type" : "row",
"position" : 62,
"liveness_info" : { "tstamp" : "2018-12-24T14:04:07.188625Z" },
"cells" : [
{ "name" : "events", "deletion_info" : { "marked_deleted" : "2018-12-24T14:04:07.188624Z", "local_delete_time" : "2018-12-24T14:04:07Z" } },
{ "name" : "events", "path" : [ "c7481be0-0784-11e9-8254-712032ed3bea" ], "value" : "A" },
{ "name" : "events", "path" : [ "c7481be1-0784-11e9-8254-712032ed3bea" ], "value" : "B" }
]
}
]
}
]
cqlsh:spark> tracing on
cqlsh:spark> select * from tomb_test ;
-- however when reading from tomb_test, no tombstones are scanned
Read 1 live rows and 0 tombstone cells [ReadStage-3] | 2018-12-24 15:07:02.445000 | 127.0.0.1 | 8357 | 127.0.0.1
PS: When the table is created with a frozen list type, no tombstone is created:
CREATE TABLE tomb_test (id text PRIMARY KEY, events frozen<list<text>>);
Cassandra version: 3.11.3
Upvotes: 1
Views: 205
Reputation: 16410
Because you set the value of the list (rather than appending to it), the insert has to delete any previous cells for that list: each entry is its own cell, and writes do not perform any reads. This delete is a range tombstone covering the whole run of cells for the list, not a single-cell tombstone, so it shadows any previous data in the events list.
Note: With frozen collections the entire collection is serialized within a single cell, so it is simply overwritten and there is no need for a delete.
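To make the difference concrete, here is a rough sketch of the three cases against the table from the question. The value 'C' and the table name tomb_test_frozen are made up for illustration; the tombstone behaviour in the comments is what I'd expect on 3.11.x, so check it with nodetool flush and sstabledump on your own cluster.

```cql
-- Append: adds one new cell to the existing list, nothing is deleted,
-- so no tombstone should be written.
UPDATE tomb_test SET events = events + ['C'] WHERE id = '1';

-- Overwrite: replaces the whole list. Cassandra cannot read the old cells
-- first, so it writes a tombstone covering them, then the new cells.
UPDATE tomb_test SET events = ['C'] WHERE id = '1';

-- Frozen list (hypothetical second table): the whole list lives in a
-- single cell, so an overwrite just replaces that cell, no tombstone.
CREATE TABLE tomb_test_frozen (id text PRIMARY KEY, events frozen<list<text>>);
INSERT INTO tomb_test_frozen (id, events) VALUES ('1', ['A', 'B']);
```

If your workload only ever appends, the first form avoids the tombstone entirely. If you always replace the whole list, a frozen list avoids it too, at the cost of not being able to update individual elements.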
The "Read 1 live rows and 0 tombstone cells" trace line is a little misleading: it actually does read the range tombstone, but there are no cell tombstones. I think range tombstones were added to that count in CASSANDRA-8527, but on many current versions of Cassandra they won't be counted.
Upvotes: 2