Reputation: 175
I have recently started exploring Cassandra for our project, and I have a question about Cassandra data modelling. Let's take the example of Google's web analytics product. Google collects and aggregates URL statistics along different dimensions and over different time ranges. As a simple example, consider collecting the access count of www.yahoo.com from desktop browsers vs. mobile browsers over a period of 30 days (daily sums). We can model this in two ways:
One row key per browser type for the same URL, with each day as a column name and an aggregate counter as the column type
One generic row key per URL, with a composite column key of day, URL and browser type and an aggregate counter as the column type
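To make the two layouts concrete, here is a rough in-memory sketch in Python. It is not Cassandra code: dictionaries stand in for rows and columns, and every name in it (`layout1`, `layout2`, `record_hit`, `monthly_total_*`) is made up for illustration. For the second layout I left the URL out of the composite column key, since it is already the row key; that is my simplification.

```python
from collections import Counter, defaultdict

# Layout 1: row key = (url, browser type); column name = day; value = counter.
layout1 = defaultdict(Counter)

# Layout 2: row key = url; composite column name = (day, browser type).
layout2 = defaultdict(Counter)


def record_hit(url, browser, day):
    """Increment the daily counter in both layouts (illustration only)."""
    layout1[(url, browser)][day] += 1
    layout2[url][(day, browser)] += 1


def monthly_total_layout1(url, browser):
    # One row holds exactly the counters we need: read the whole row.
    return sum(layout1[(url, browser)].values())


def monthly_total_layout2(url, browser):
    # The row mixes all browser types, so columns must be filtered.
    return sum(n for (day, b), n in layout2[url].items() if b == browser)


# Hypothetical sample hits.
for browser, day in [("desktop", "2013-01-01"), ("desktop", "2013-01-01"),
                     ("mobile", "2013-01-01"), ("desktop", "2013-01-02")]:
    record_hit("www.yahoo.com", browser, day)

print(monthly_total_layout1("www.yahoo.com", "desktop"))
print(monthly_total_layout2("www.yahoo.com", "desktop"))
```

In this sketch the first layout spreads one URL over several smaller rows, while the second keeps everything for a URL in one wider row that has to be sliced or filtered on read.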
What are the pros and cons of each approach?
Upvotes: 2
Views: 1005
Reputation: 1022
Long column names are not a good idea, as they will be stored repeatedly in each row. You should use date, url, platform, day as the primary key, and one column for the count. This way, if you need all the days of a month, you specify date, url and platform.
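A minimal sketch of that idea, again using plain Python rather than Cassandra itself. I am assuming here that "date" means the month (e.g. "2013-01") and "day" is the day of that month; the names `counts`, `increment` and `days_for` are hypothetical.

```python
from collections import Counter

# One counter per primary key (date, url, platform, day).
# Assumption: "date" is the month, "day" is the day within that month.
counts = Counter()


def increment(date, url, platform, day, n=1):
    """Add n to the counter stored under the full primary key."""
    counts[(date, url, platform, day)] += n


def days_for(date, url, platform):
    """Return {day: count} for every key sharing the given key prefix."""
    prefix = (date, url, platform)
    return {key[3]: n for key, n in sorted(counts.items()) if key[:3] == prefix}


# Hypothetical sample data.
increment("2013-01", "www.yahoo.com", "desktop", 1, 5)
increment("2013-01", "www.yahoo.com", "desktop", 2, 7)
increment("2013-01", "www.yahoo.com", "mobile", 1, 3)

print(days_for("2013-01", "www.yahoo.com", "desktop"))
```

Specifying only the first three parts of the key returns all the days stored under them, which is the "all days of the month" lookup described above.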
Upvotes: 2