Josh

Reputation: 376

Cassandra data model approach

Forgive me for asking something that is probably explained elsewhere, but I am having trouble designing a data model in Cassandra.

I am storing transactions. These transactions each have a source (user), a timestamp, and some associated keywords. I need to be able to find transactions given the source and a date range and (optional) keywords. Cassandra is attractive because I need to store billions of transactions.

I have been unable to find a resource that explains how to do this type of thing. My initial thoughts involve having a few CFs: a transaction CF, a keyword_transaction CF, a source_transaction CF, and possibly a day_transaction CF (or something similar). This would make it very straightforward to find transactions based on any one of the above items, but it doesn't seem like it will let me search on all of them at once.

Any thoughts?

Upvotes: 2

Views: 517

Answers (1)

Jasonw

Reputation: 5064

Start by thinking about your queries, and then design your data model around them. Read here and here, as these will help when you plan your data model.

cf : transactions
rowkey : source/uuid (suggestion)
  cn : source  
  cv : UTF8
  cn : keyword
  cv : UTF8
  cn : date
  cv : DateType
  cn : time
  cv : DateType


cf : keywords
rowkey : keyword
   cn : source
   cv : UTF8

Here, transactions is a standard column family with a few column names (cn) and their corresponding column values (cv). Each transaction is identified by its rowkey. The other standard column family is keywords, where the rowkey is the keyword itself.
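To make the layout concrete, the two column families can be mimicked with plain Python dictionaries. This is only an in-memory sketch, not Cassandra client code: the `insert_transaction` helper, the sample values, and the choice to store several keywords as one comma-separated string are all assumptions made for illustration.

```python
import uuid
from collections import defaultdict
from datetime import datetime

# In-memory stand-ins for the two column families (not a Cassandra API).
transactions = {}              # rowkey -> {column name: column value}
keywords = defaultdict(set)    # keyword rowkey -> set of source names


def insert_transaction(source, when, words):
    """Hypothetical helper: write one transaction to both column families."""
    rowkey = f"{source}/{uuid.uuid4()}"   # the suggested source/uuid row key
    transactions[rowkey] = {
        "source": source,
        "keyword": ",".join(words),       # assumption: keywords joined by commas
        "date": when.date(),
        "time": when.time(),
    }
    for word in words:
        keywords[word].add(source)        # keyword row gets a source column
    return rowkey


key = insert_transaction("alice", datetime(2012, 5, 1, 9, 30), ["books", "gift"])
```

Every write touches both structures, which is the price of being able to look up sources by keyword later.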

You can search by source, timestamp, or keyword, but you need to index those columns for the queries to work. For example, with the data structure suggested above, you can run these:

  • get all transactions where source is equal to ''
       get transactions where source = '';
       
  • get all transactions where source is equal to '' and date > ''
       get transactions where source = '' and date > '';
       
  • get all transactions for date x
       get transactions where date = '';
       
  • get all source names for a keyword
       get keywords['keyword'];
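
The combined lookup from the question (source, date range, optional keyword) amounts to the filter below. It is a rough Python sketch over in-memory rows, not a Cassandra query: the `find_transactions` function, the row keys, and the sample values are hypothetical, and each row is assumed to hold a single keyword.

```python
from datetime import date

# Sample rows shaped like the transactions column family (values are made up).
rows = {
    "alice/1": {"source": "alice", "keyword": "books", "date": date(2012, 5, 1)},
    "alice/2": {"source": "alice", "keyword": "music", "date": date(2012, 6, 1)},
    "bob/1": {"source": "bob", "keyword": "books", "date": date(2012, 5, 2)},
}


def find_transactions(rows, source, start, end, keyword=None):
    """Return row keys for one source within [start, end], optionally by keyword."""
    matches = []
    for rowkey, cols in rows.items():
        if cols["source"] != source:
            continue                      # equality predicate on source
        if not (start <= cols["date"] <= end):
            continue                      # inclusive date range predicate
        if keyword is not None and cols["keyword"] != keyword:
            continue                      # optional keyword predicate
        matches.append(rowkey)
    return sorted(matches)


result = find_transactions(rows, "alice", date(2012, 1, 1), date(2012, 12, 31))
with_kw = find_transactions(rows, "alice", date(2012, 1, 1), date(2012, 12, 31), "books")
```

In Cassandra the equality test on source would be served by the index, and the remaining predicates would narrow the rows it returns.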
       

Upvotes: 3
