How to design Cassandra Scheme for User Actions Log?

Question

I have a table like this in MYSQL to log user actions :

CREATE TABLE `actions` (
    `id` INT(11) NOT NULL AUTO_INCREMENT,
    `module` VARCHAR(32) NOT NULL,
    `controller` VARCHAR(64) NOT NULL,
    `action` VARCHAR(64) NOT NULL,
    `date` Timestamp NOT NULL,
    `userid` BIGINT(20) NOT NULL,
    `ip` VARCHAR(32) NOT NULL,
    `duration` DOUBLE NOT NULL,
    PRIMARY KEY (`id`),
)
COLLATE='utf8mb4_general_ci'
ENGINE=MyISAM
AUTO_INCREMENT=1

I have a MYSQL Query Like this to find out count of specific actions per day :

SELECT COUNT(*) FROM actions WHERE actions.action = "join" AND 
YEAR(date)=2017 AND MONTH(date)=06 GROUP BY YEAR(date), MONTH(date), 
DAY(date)

this takes 50 - 60 second to me to have a list of days with count of "join" action with only 5 million rows and index in date and action.

So, I want to log actions using Cassandra, so How can I design Cassandra scheme and How to query to get such request less than 1 second.

undefined_variable · Accepted Answer

CREATE TABLE actions (
    id timeuuid,
    module varchar,
    controller varchar,
    action varchar,
    date_time timestamp,
    userid bigint,
    ip varchar,
    duration double,
    year int,
    month int,
    dt date,
    PRIMARY KEY ((action,year,month),dt,id)
);

Explanation: With abobe table Defination

SELECT COUNT(*) FROM actions WHERE actions.action = "join" AND yaer=2017 AND month=06 GROUP BY action,year,month,dt

will hit single partition. In dt column only date will be there... may be you can change it to only day number with int as datatype and since id is timeuuid.. it will be unique.

Note: GROUP BY is supported by cassandra 3.10 and above

How to design Cassandra Scheme for User Actions Log?

Answers (1)

Related Questions