sutanu dalui

Reputation: 661

Using the Map datatype in Cassandra (or not)

I will be saving a flattened JSON entity type into Cassandra. I have 2 options for the data model:

((entityType, entityId), jsonPath), value

OR

(entityType, entityId), map<text, text> keyValue

My use case: on every write for an entityId, I delete all of its existing mappings and insert the new ones. Reads query by entityType, entityId and jsonPath.

Which of the two would perform and scale better for a system that combines streaming ingestion with UI queries?

A flattened JSON document will have about 100 fields. The number of entities will be under a million, in the mid hundreds of thousands.
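
For concreteness, the two options and the access pattern described above might look like the following CQL. This is only a sketch: the table names, the snake_case column names and the sample values are hypothetical, and option 2 is read here as using a composite partition key of (entityType, entityId).

```cql
-- Option 1: one row per flattened JSON path (names are illustrative).
CREATE TABLE IF NOT EXISTS entity_fields (
    entity_type text,
    entity_id   text,
    json_path   text,
    value       text,
    PRIMARY KEY ((entity_type, entity_id), json_path)
);

-- Option 2: one row per entity, all paths held in a single map column.
CREATE TABLE IF NOT EXISTS entity_maps (
    entity_type text,
    entity_id   text,
    key_value   map<text, text>,
    PRIMARY KEY ((entity_type, entity_id))
);

-- Write path, option 1: drop the whole partition, then insert each path.
DELETE FROM entity_fields WHERE entity_type = 'order' AND entity_id = '42';
INSERT INTO entity_fields (entity_type, entity_id, json_path, value)
VALUES ('order', '42', 'customer.name', 'Ada');

-- Write path, option 2: assigning a map literal replaces the whole map.
INSERT INTO entity_maps (entity_type, entity_id, key_value)
VALUES ('order', '42', {'customer.name': 'Ada', 'total': '19.99'});

-- Read path, option 1: a single-row lookup by full primary key.
SELECT value FROM entity_fields
WHERE entity_type = 'order' AND entity_id = '42' AND json_path = 'customer.name';

-- Read path, option 2: element selection needs Cassandra 4.0 or later;
-- older versions have to read the whole map and pick the key client-side.
SELECT key_value['customer.name'] FROM entity_maps
WHERE entity_type = 'order' AND entity_id = '42';
```

One thing to watch with the delete-then-insert pattern in option 1: the delete and the inserts need distinct, increasing write timestamps. Statements grouped into one batch share a timestamp by default, and Cassandra resolves such ties in favor of the delete.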

Upvotes: 1

Views: 723

Answers (2)

Aaron

Reputation: 57748

Just to add to what Erick said, large collections in Cassandra can lead to other issues. DataStax has some documentation on how to "freeze" collections to help with different access patterns. The tradeoff is that non-frozen collections can generate a lot of tombstones under high write throughput, while frozen collections must rewrite the entire collection on any in-place write.
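
As a rough illustration of that tradeoff, here is a sketch with two hypothetical tables that differ only in whether the map is frozen. The table names, column names and values are assumptions made up for the example.

```cql
-- Hypothetical tables that differ only in whether the map is frozen.
CREATE TABLE IF NOT EXISTS entity_map_plain (
    entity_type text,
    entity_id   text,
    key_value   map<text, text>,
    PRIMARY KEY ((entity_type, entity_id))
);

CREATE TABLE IF NOT EXISTS entity_map_frozen (
    entity_type text,
    entity_id   text,
    key_value   frozen<map<text, text>>,
    PRIMARY KEY ((entity_type, entity_id))
);

-- Non-frozen: a single key can be changed in place.
UPDATE entity_map_plain SET key_value['total'] = '21.50'
WHERE entity_type = 'order' AND entity_id = '42';

-- Non-frozen: overwriting the whole map writes a tombstone for the old
-- contents before the new entries, which adds up under heavy write load.
UPDATE entity_map_plain SET key_value = {'customer.name': 'Ada', 'total': '21.50'}
WHERE entity_type = 'order' AND entity_id = '42';

-- Frozen: stored as one value, so every change must supply the full map.
UPDATE entity_map_frozen SET key_value = {'customer.name': 'Ada', 'total': '21.50'}
WHERE entity_type = 'order' AND entity_id = '42';
```

With a frozen map, the per-key form `SET key_value['total'] = ...` is rejected, which is why even a one-field change costs a full rewrite.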

tl;dr;

Mapping to individual columns is a much better option.

Upvotes: 1

Erick Ramirez

Reputation: 16313

Wherever possible, you will be better off mapping the fields to CQL columns instead of a map collection.

Working with CQL columns equates to simpler CRUD operations. Cheers!
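
One way to read this, sketched for a single hypothetical entity type with made-up field names (`orders`, `customer_name`, `status` and `total` are all assumptions), is a table with one typed column per flattened field:

```cql
-- Hypothetical table for one entity type, with one typed column per field.
CREATE TABLE IF NOT EXISTS orders (
    entity_id     text PRIMARY KEY,
    customer_name text,
    status        text,
    total         decimal
);

-- Upsert only the fields that changed; no collection is involved.
UPDATE orders SET status = 'shipped', total = 21.50 WHERE entity_id = '42';

-- Read just the columns the UI needs.
SELECT customer_name, status FROM orders WHERE entity_id = '42';
```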

Upvotes: 3
