Reputation: 7722
For our high-traffic site we've set up our own tracking solution. We log more than 1k impressions per second at peak. For that we're using the latest Cassandra 3 release.
Now we want to set up real-time monitoring on this data. The problem is that the data as currently logged is not structured well for generating statistics. So I thought about a new table in Cassandra that has a suitable partition/primary key and is filled by an additional INSERT. But I'm not sure whether this would be a killer for Cassandra. As I said, only live statistics matter, so I want to add a TTL of, let's say, 60 seconds to all data in this monitoring table. This should ensure that old data is deleted automatically.
But can anyone say whether this leads to problems at this traffic level, given that there would be so many deletions per minute? Since we only select the last 5-10 seconds from this monitoring table, the tombstones may not be a problem in SELECT, but I assume there could be massive compactions and GC that ruin the performance.
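To make the idea concrete, here is a minimal sketch of what such a monitoring table could look like, written as a small Python helper that renders the CQL. The keyspace, table, and column names are hypothetical, and the gc_grace_seconds value and the choice of TimeWindowCompactionStrategy are assumptions for illustration, not settings taken from our cluster. The table options themselves (default_time_to_live, gc_grace_seconds, compaction) are standard CQL table options.

```python
def monitoring_table_cql(ttl_seconds=60, gc_grace_seconds=3600, window_minutes=1):
    """Render a CREATE TABLE statement for a short-lived monitoring table.

    All identifiers (tracking, impressions_live, bucket, ...) are hypothetical.
    The default values are illustrative assumptions, not recommendations.
    """
    # Compaction map: group SSTables into time windows so that fully
    # expired windows can be dropped as a whole.
    compaction = (
        "{'class': 'TimeWindowCompactionStrategy', "
        "'compaction_window_unit': 'MINUTES', "
        f"'compaction_window_size': {window_minutes}}}"
    )
    return (
        "CREATE TABLE IF NOT EXISTS tracking.impressions_live (\n"
        "  site_id text,\n"
        "  bucket timestamp,\n"        # e.g. the timestamp truncated to one second
        "  event_time timeuuid,\n"
        "  PRIMARY KEY ((site_id, bucket), event_time)\n"
        ") WITH CLUSTERING ORDER BY (event_time DESC)\n"
        f"  AND default_time_to_live = {ttl_seconds}\n"
        f"  AND gc_grace_seconds = {gc_grace_seconds}\n"
        f"  AND compaction = {compaction};"
    )


if __name__ == "__main__":
    print(monitoring_table_cql())
```

With default_time_to_live set on the table, the additional INSERT would not need its own USING TTL clause. Lowering gc_grace_seconds is a trade-off against repair and hint replay, so the value above is only a placeholder.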
Upvotes: 1
Views: 426
Reputation: 21
Yes, it most likely would.
A short TTL introduces a large number of tombstones into your system, which may result in:
1. Aborted queries
2. Heap pressure and out-of-memory errors
3. Higher latency
In that case you would have to run compaction very frequently to evict the tombstones from the system, but this has its own costs: it consumes resources and disk space and can cause high I/O.
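To get a feel for the numbers, here is a rough back-of-the-envelope sketch. It assumes one row per impression, a steady 1,000 writes per second (the peak rate from the question), and that expired data only becomes eligible for removal by compaction once gc_grace_seconds has passed after expiry. The function names are made up for this example; 864000 seconds (10 days) is Cassandra's default gc_grace_seconds.

```python
DEFAULT_GC_GRACE_SECONDS = 864000  # Cassandra default: 10 days


def live_rows(writes_per_second, ttl_seconds):
    """Rows that are still readable at any moment under a steady write rate."""
    return writes_per_second * ttl_seconds


def expired_rows_awaiting_purge(writes_per_second, gc_grace_seconds):
    """Upper bound on expired rows that compaction is not yet allowed to drop."""
    return writes_per_second * gc_grace_seconds


if __name__ == "__main__":
    rate = 1000  # impressions per second, peak rate from the question
    print(live_rows(rate, 60))
    print(expired_rows_awaiting_purge(rate, DEFAULT_GC_GRACE_SECONDS))
    print(expired_rows_awaiting_purge(rate, 3600))
```

Under these assumptions only about 60,000 rows are live at any time, but with the default gc_grace_seconds up to 864,000,000 expired rows can be waiting on disk before they may be purged; with a one-hour gc_grace_seconds that bound drops to 3,600,000. This is an upper-bound estimate at sustained peak load, not a measurement.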
Upvotes: 2