Reputation: 4231
I have the following table users((username), last_seen)
where last_seen is basically the writetime . Number of records in the table is aprox 10 million.
Insert is pretty much straightforward insert into users (username, last_seen) VALUES ([username], now)
BUT I need to query by the last_seen
column (this query runs every minute) e.g :
select username from users where last_seen < (now - 1 day)
I have two options as I see it:
use materialized view :
CREATE MATERIALIZED VIEW users_last_seen AS
SELECT last_seen, username
FROM users
WHERE last_seen IS NOT NULL
PRIMARY KEY (last_seen, username);
and simply query
select username from users_last_seen where last_seen < (now - 1 day)
query the users table
select username from users where last_seen < (now - 1 day) ALLOW FILTERING
which one is more efficient? AFAIK materialized view is unstable and have impact on performance.
Upvotes: 0
Views: 1026
Reputation: 4231
I found out that using SASI Index is the best option in this situation
Upvotes: 0
Reputation: 783
AFAIK materialized view is unstable
At this point of time, I think so too. But that's not a reason to use ALLOW FILTERING. If not more, it is equally bad.
I would suggest to create another table or change the current structure.
CREATE TABLE IF NOT EXISTS user_status (
day date,
id timeuuid,
user text,
PRIMARY KEY ((day), id)
);
This table is partitioned by each day. You will need to only query over that day's data and prepare the your data on client side program.
select * from user_status where day = '2015-05-03'
This is both not heavy on server and client side. Depending on the expected size of data, the partition key can be further tuned.
Hope this helps!
Upvotes: 1