DataModel use case for logging in Cassandra

Question

I am trying to design an application log table in Cassandra:

CREATE TABLE log(
  yyyymmdd varchar, 
  created timeuuid,  
  logMessage text,
  module text, 
  PRIMARY KEY(yyyymmdd, created)
);

The following query works as expected:

select * from log where yyyymmdd = '20182302' LIMIT 50;

The above query does no grouping by module; it is effectively global.

Currently I have created a secondary index on 'module', so I am able to run the following:

select * from log where yyyymmdd = '20182302' AND module LIKE 'test' LIMIT 50;

My concern is this: without the secondary index, is there an efficient way to query by module and fetch the data, or is there a better design?

Also, please let me know about any performance issues in the current design.

Alex Ott · Accepted Answer

For fetching based on module and date, you can only use another table, like this:

CREATE TABLE module_log(
  yyyymmdd varchar, 
  created timeuuid,  
  logMessage text,
  module text, 
  PRIMARY KEY((module,yyyymmdd), created)
);

This gives you a separate partition for every combination of module and yyyymmdd values, so you won't have very wide partitions.
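
As a rough sketch of how this table might be used: the values below are placeholders, the timeuuid literal is just an example value your client would normally generate, and writing to both tables in one logged batch is my assumption about how you would keep them in sync, not a requirement.

```cql
-- Write each log entry to both tables with the same client-generated timeuuid.
-- A logged batch is one way to make sure both writes are eventually applied together.
BEGIN BATCH
  INSERT INTO log (yyyymmdd, created, logMessage, module)
  VALUES ('20182302', 50554d6e-29bb-11e5-b345-feff819cdc9f, 'example message', 'test');
  INSERT INTO module_log (module, yyyymmdd, created, logMessage)
  VALUES ('test', '20182302', 50554d6e-29bb-11e5-b345-feff819cdc9f, 'example message');
APPLY BATCH;

-- Both partition key columns are restricted by equality,
-- so this read touches a single partition and needs no index.
SELECT * FROM module_log
WHERE module = 'test' AND yyyymmdd = '20182302'
LIMIT 50;
```

The trade-off is that every log entry is stored twice, which is the usual price for query-specific tables in Cassandra.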

Also, take into account that if you create a secondary index on the module field alone, you may run into problems with partitions that are too big (I assume you have a very limited number of module values?).
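
For illustration, here is roughly what the index-based approach looks like. The index name is hypothetical, and I'm assuming a regular secondary index, which supports equality (LIKE would need a SASI index instead).

```cql
-- Hypothetical index name; presumably similar to the index from the question.
CREATE INDEX log_module_idx ON log (module);

-- Partition key restricted: the index lookup is confined to that day's partition.
SELECT * FROM log
WHERE yyyymmdd = '20182302' AND module = 'test'
LIMIT 50;

-- No partition key: secondary indexes are local to each node,
-- so the coordinator may have to ask many nodes for matches.
SELECT * FROM log
WHERE module = 'test'
LIMIT 50;
```

With only a handful of distinct module values, each indexed value maps to a very large number of rows, which is where the index tends to become a problem.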

P.S. Are you using pure Cassandra, or DSE?
