Reputation: 2466
In Cassandra, rows created by update differ from rows created by insert, and the difference affects TTL behavior and how rows whose non-key columns are all null are handled.
Apart from this behavior, does it have any effect on performance when creating, deleting, or selecting such a row?
Link to the JIRA ticket describing this behavior:
https://issues.apache.org/jira/browse/CASSANDRA-8430
Upvotes: 1
Views: 3131
Reputation: 2466
1. "Execution plan": running the same query (select by primary key) and reading the source_elapsed column:
Create as Insert:
2266,1768,1672,3302,3324,1422,1623,3833,3933,3519,4166. Avg: 2803
Create as Update:
1621,3498,4769,3680,3905,1781,4215,3764,3747,3460,1987. Avg: 3312
Update may look a bit slower, but the difference is not consistent between runs, and I believe that with a higher number of executions the two averages would converge.
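To put the two averages in context, here is a minimal Python sketch that recomputes them from the numbers above and compares the gap between the means with the spread inside each run. The variable names are my own, and I am assuming the values are microseconds as reported in source_elapsed. With only 11 samples per run this is an illustration, not a proper benchmark.

```python
from statistics import mean, stdev

# source_elapsed values copied from the two runs above (assumed microseconds)
insert_us = [2266, 1768, 1672, 3302, 3324, 1422, 1623, 3833, 3933, 3519, 4166]
update_us = [1621, 3498, 4769, 3680, 3905, 1781, 4215, 3764, 3747, 3460, 1987]


def summarize(samples):
    """Return (mean, sample standard deviation) for a list of timings."""
    return mean(samples), stdev(samples)


insert_mean, insert_sd = summarize(insert_us)
update_mean, update_sd = summarize(update_us)
gap = update_mean - insert_mean

print(f"insert: mean={insert_mean:.0f} sd={insert_sd:.0f}")
print(f"update: mean={update_mean:.0f} sd={update_sd:.0f}")
print(f"gap between means: {gap:.0f}")
```

The gap between the means comes out smaller than the sample standard deviation of either run, which is consistent with treating the difference as noise at this sample size.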
2. Storage:
Row created as Insert:
[user1]@184 Row[info=[ts=1486368137507000 ttl=3600, let=1486371737] ]: 2017-01-01 14:00Z, bla, 5,2 | [blu=77777 ts=1486368137507000 ttl=3600 ldt=1486371737], [ble=0 ts=1486368137507000 ttl=3600 ldt=1486371737]
Row created as Update:
[user30]@122 Row[info=[ts=-9223372036854775808] ]: 2017-01-01 14:00Z, bla, 5,2 | [blu=777 ts=1486368139142000 ttl=3600 ldt=1486371739], [ble=1 ts=1486368139142000 ttl=3600 ldt=1486371739]
I assume that sstabledump represents the data as it is saved in the file. The only difference is that the row created by insert carries ttl and let fields at the row level (and its row-level ts is set to the creation time). This is why a row whose non-key columns are all null is still selectable when it was created by insert, but not when it was created by update. So rows created by insert use a few more bytes of storage; that is the only difference here.
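The row-level difference can be checked mechanically. The sketch below is a small Python parser of my own (not a Cassandra tool) that pulls the fields out of the Row[info=[...]] prefix of the two dump lines above. It assumes that ts=-9223372036854775808 (the minimum 64-bit signed value) is how the dump prints "no row-level timestamp", which matches what the update-created row shows.

```python
import re

# Assumption: the minimum signed 64-bit value marks "no row-level liveness info"
NO_TIMESTAMP = -(2 ** 63)

INFO_RE = re.compile(r"Row\[info=\[(?P<body>[^\]]*)\]")
FIELD_RE = re.compile(r"(\w+)=(-?\d+)")


def row_liveness(dump_line):
    """Extract the row-level info fields (ts, ttl, let) from one dump line."""
    match = INFO_RE.search(dump_line)
    if match is None:
        raise ValueError("no Row[info=[...]] section found")
    return {name: int(value) for name, value in FIELD_RE.findall(match.group("body"))}


def has_row_marker(info):
    """True when the row carries its own timestamp, as the insert-created row does."""
    return info.get("ts", NO_TIMESTAMP) != NO_TIMESTAMP


# Prefixes of the two dump lines shown above
insert_line = "[user1]@184 Row[info=[ts=1486368137507000 ttl=3600, let=1486371737] ]:"
update_line = "[user30]@122 Row[info=[ts=-9223372036854775808] ]:"

insert_info = row_liveness(insert_line)
update_info = row_liveness(update_line)

print(has_row_marker(insert_info), insert_info)
print(has_row_marker(update_info), update_info)

# In this dump, let equals the write time in seconds plus the ttl
print(insert_info["let"] == insert_info["ts"] // 1_000_000 + insert_info["ttl"])
```

In this dump the let value equals the write timestamp (converted from microseconds to seconds) plus the ttl of 3600, so the extra row-level fields carry no information beyond what the cells already hold, apart from marking the row itself as live.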
3. Tombstones:
Created as Insert:
[user1]@48 Row[info=[ts=-9223372036854775808] ]: 2017-01-01 14:00Z, bla, 5,2 | [blu= ts=1486368407044000 ldt=1486368406], [ble= ts=1486368407044000 ldt=1486368406]
Created as Update:
[user30]@0 Row[info=[ts=-9223372036854775808] ]: 2017-01-01 14:00Z, bla, 5,2 | [blu= ts=1486368403444000 ldt=1486368403], [ble= ts=1486368403444000 ldt=1486368403]
As expected, the tombstones look exactly the same for both kinds of creation.
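For anyone who wants to confirm this without eyeballing the lines, here is a minimal Python sketch. The helper name is hypothetical. It drops the partition key and offset prefix, masks the numeric ts and ldt values (which differ only because the deletes ran at different times), and compares what is left.

```python
import re


def cell_shape(dump_line):
    """Drop the "[key]@offset" prefix and mask numeric ts/ldt values."""
    _, _, row = dump_line.partition(" ")
    return re.sub(r"\b(ts|ldt)=-?\d+", r"\1=N", row)


# The two tombstone dump lines shown above
insert_tombstone = (
    "[user1]@48 Row[info=[ts=-9223372036854775808] ]: 2017-01-01 14:00Z, bla, 5,2 | "
    "[blu= ts=1486368407044000 ldt=1486368406], [ble= ts=1486368407044000 ldt=1486368406]"
)
update_tombstone = (
    "[user30]@0 Row[info=[ts=-9223372036854775808] ]: 2017-01-01 14:00Z, bla, 5,2 | "
    "[blu= ts=1486368403444000 ldt=1486368403], [ble= ts=1486368403444000 ldt=1486368403]"
)

same_structure = cell_shape(insert_tombstone) == cell_shape(update_tombstone)
print(same_structure)
```

Note that after the delete, the row created by insert also shows the "no timestamp" value in its row-level info, so the two rows are structurally indistinguishable at this point.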
Summary:
From my observations, there is no real difference in performance between the two ways of creating a row. I would be happy to see other tests, observations, or source code reviews here.
Upvotes: 7