Cassandra: making a data model / schema

Question

(Not sure what it's called... model... schema... super model?)

I have 'n' (uniquely id'd) sensors in 'm' (uniquely id'd) homes. Each of these fires 0 to 'k' times / day (in blocks of 1-5). This data is currently stored in MySQL with a table for each 'home' and a structure of:

time stamp
sensor id
firing count

I'm having trouble wrapping my mind around a 'NoSQL' model of this data that would allow me to find counts of firings by home, time, or sensor.

... Or maybe this isn't the right kind of data to push to NoSQL? Our current server is bogging down under the load (hundreds of millions of rows x hundreds of homes). I'm very interested in finding a data store that offers the scalability of Cassandra.

rodrigoap · Accepted Answer

To store firing count by sensor and house:

House_Sensors     <-Column family 
  house_id        <-Key
    sensor_id     <-Column name
    firing_count  <-Column value

Data represented in JSON-ish notation:

House_Sensors = {
 house_1 : {
  sensor_1: 3436,
  sensor_2: 46,
  sensor_3: 99,
  ...
 },
 house_2 : {
  sensor_7: 0,
  sensor_8: 444,
  ...
 },
 ...
}
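To make the layout above concrete, here is a minimal sketch in plain Python that mimics the House_Sensors column family with nested dictionaries. It is not Cassandra client code: the names `house_sensors`, `record_firing`, `firings_for_sensor`, `firings_for_house` and `sensors_in_house` are hypothetical and exist only to show which lookups this layout answers directly.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for the House_Sensors column family:
# row key = house_id, column name = sensor_id, column value = firing count.
house_sensors = defaultdict(lambda: defaultdict(int))

def record_firing(house_id, sensor_id, count=1):
    """Add `count` firings to the running total for one sensor in one house."""
    house_sensors[house_id][sensor_id] += count

def firings_for_sensor(house_id, sensor_id):
    """Read one column from one row; 0 if the house or sensor is unknown."""
    return house_sensors.get(house_id, {}).get(sensor_id, 0)

def firings_for_house(house_id):
    """Sum every column in the row for this house."""
    return sum(house_sensors.get(house_id, {}).values())

def sensors_in_house(house_id):
    """The column names of a row tell you which sensors belong to the house."""
    return sorted(house_sensors.get(house_id, {}))
```

Counts by house and by sensor are single-row reads here. Counts by time are not answerable from this structure, because it keeps only running totals.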

You may want to define another column family with sensor_id as key to store the firing timestamp.

Think about which queries you need when designing the schema, and denormalize as needed. Repeat data where it helps; Cassandra inserts are very fast.

The timestamp of the firing is not stored in the House_Sensors column family. Create a new column family for that, with sensor_id as key.

This way you can use the House_Sensors family to query the firing count and which sensors belong to each house. Use the other column family to query the firing timestamps.
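As a rough illustration of that second column family, the sketch below again uses plain Python rather than a Cassandra client. The family name `Sensor_Firings`, the choice of the timestamp as column name with the firing count as column value, and the helper names are all assumptions made for this example. It relies on the columns of a row being kept sorted by name, which is what makes a time-range read cheap.

```python
import bisect
from collections import defaultdict

# Hypothetical stand-in for a second column family (called Sensor_Firings here):
# row key = sensor_id, column name = firing timestamp, column value = firing count.
# Each row is kept as a list of (timestamp, count) pairs sorted by timestamp.
sensor_firings = defaultdict(list)

def record_timestamped_firing(sensor_id, timestamp, count):
    """Insert one firing block, keeping the row ordered by timestamp."""
    bisect.insort(sensor_firings[sensor_id], (timestamp, count))

def firings_between(sensor_id, start, end):
    """Total firings for one sensor with start <= timestamp < end."""
    columns = sensor_firings.get(sensor_id, [])
    # A 1-tuple sorts before any 2-tuple that shares its first element,
    # so (start,) locates the first column whose timestamp is >= start.
    lo = bisect.bisect_left(columns, (start,))
    hi = bisect.bisect_left(columns, (end,))
    return sum(count for _, count in columns[lo:hi])
```

Timestamps are assumed to be integers such as epoch seconds. In a denormalized design, each firing would be written to both column families, so that totals per house and counts per time window are each a single-row read.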
