Reputation: 133
I have built an application receiving metering data (for example, the current temperature of a room) from multiple devices (in this example, multiple rooms).
I receive metering data every 15 minutes. My application calculates the difference between the current temperature and the previously received one and sends it to another application. I store the received metering data in a Cassandra cluster, with fields such as timestamp, temperature, device_id, room, ...
Which field should I use for partitioning?
If I use the timestamp as the partition key, will it put all the load on the same node (ignoring replication)?
If I use the device_id/room, won't I get an unbounded partition? Maybe I could add a retention period?
Upvotes: 1
Views: 51
Reputation: 2310
The rule for Cassandra data modeling is to design your tables based on your queries, so work out your queries first. For example, if you have queries like
You can have two tables
This is the only way to design tables in Cassandra. Don't try to create tables the RDBMS way.
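For the metering case in the question, a minimal sketch of one query-first layout might look like the following. It assumes the main query is "all readings for one device on a given day"; the table name `readings_by_device_day`, the `day` bucket column, the helper `partition_key`, and the 30-day TTL are hypothetical choices for illustration, not taken from the question. Partitioning by (device_id, day) tends to spread writes across nodes and caps each partition at 96 rows for 15-minute readings, and the table-level TTL plays the role of a retention period.

```python
from datetime import datetime, timezone

# Hypothetical CQL for a table keyed by (device_id, day), clustered by timestamp.
# default_time_to_live is in seconds (2592000 s = 30 days, an assumed retention).
CREATE_TABLE_CQL = """
CREATE TABLE IF NOT EXISTS readings_by_device_day (
    device_id   text,
    day         date,
    ts          timestamp,
    room        text,
    temperature double,
    PRIMARY KEY ((device_id, day), ts)
) WITH CLUSTERING ORDER BY (ts DESC)
  AND default_time_to_live = 2592000;
"""

# One reading every 15 minutes gives a fixed upper bound on rows per partition.
READINGS_PER_DAY = 24 * 60 // 15


def partition_key(device_id: str, ts: datetime) -> tuple[str, str]:
    """Return the (device_id, day) bucket a reading would be written to.

    Naive timestamps are assumed to already be in UTC.
    """
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)
    day = ts.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return (device_id, day)


if __name__ == "__main__":
    reading_time = datetime(2024, 1, 1, 23, 45, tzinfo=timezone.utc)
    print(partition_key("room-1", reading_time))
    print(READINGS_PER_DAY)
```

With the descending clustering order, the newest reading in a partition comes first, so fetching the previous temperature to compute the difference is a `LIMIT 1` read on the current day's partition (falling back to the previous day's partition just after midnight).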
Upvotes: 0