Cassandra data modeling with multi-column filter

Question

I have a Cassandra table stokes which has 4 columns ( item(text) ,market ( text ) , location( text ) , time ( timestamp ) , value( int) ) item is a partition key and market , location and time is a clustering key in same order .

item | market | location | time | value
 x   |  m1    | l1       | t1   | v1
 x   |  m1    | l2       | t2   | v2
 x   |  m1    | l3       | t3   | v3
 y   |  m1    | l1       | t4   | v4
 y   |  m1    | l2       | t5   | v5
 y   |  m1    | l3       | t6   | v6

Application has require to query to Cassandra table in two scenario

For given item , market and location fetch the record by querying to stock table . for example item x, market l1 and location l1 below record will be fetched

 x   |  m1    | l1       | t1   | v1

For given item , market and all location after input time fetch the record by querying to stock table . For example item x , market m1 , all location after t1 time below records will be fetched
```
 x   |  m1    | l2       | t2   | v2
 x   |  m1    | l3       | t3   | v3
```

I understand in Cassandra application requirement and conceptual modeling comes first in deformalizing the data modeling . How I cam model data to meet both of my application requirement .

For my second requirement I am not able to query like

select  from  where item =x and market= m1 and time > t2; // wrong as location missing 
Skipping location is not allowed and locations could be multiple . How to model or query to meet both requirement .

Erick Ramirez · Accepted Answer

For the first app query, the partition key is item and both market and location are clustering columns:

CREATE TABLE items (
    item text,
    market text,
    location text,
    time timestamp,
    value int,
    PRIMARY KEY (item, market, location)
)

For the second app query, it looks similar to the items table except the data is organised by market and time:

CREATE TABLE locations_by_item (
    item text,
    market text,
    location text,
    time timestamp,
    value int,
    PRIMARY KEY (item, market, time)
)

When you retrieve data from this table with:

SELECT * FROM locations_by_item
  WHERE item = ?
  AND market = ?
  AND time > ?

it will return rows of location + value. Cheers!

Cassandra data modeling with multi-column filter

Answers (1)

Related Questions