Reputation: 832
I am new to Cassandra and am trying to figure out why I cannot order my logs by the created_at date. Below are the table description, the SELECT statement I am running, and its result.
cassandra@cqlsh:mytable> DESCRIBE TABLE mytable.log;
CREATE TABLE mytable.log (
    id uuid,
    created_at timestamp,
    deleted boolean,
    level text,
    message text,
    obj text,
    obj_name text,
    origin text,
    user int,
    PRIMARY KEY (id, created_at)
) WITH CLUSTERING ORDER BY (created_at DESC)
AND bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = ''
AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
CREATE INDEX deleted_idx ON mytable.log (deleted);
CREATE INDEX level_idx ON mytable.log (level);
CREATE INDEX message_idx ON mytable.log (message);
CREATE INDEX origin_idx ON mytable.log (origin);
CREATE INDEX user_idx ON mytable.log (user);
cassandra@cqlsh:mytable> SELECT * FROM mytable.log WHERE "created_at" <= '2015-04-29 00:00:00' AND "user" = 20 LIMIT 10;
id | created_at | deleted | level | message | obj | obj_name | origin | user
--------------------------------------+--------------------------+---------+-------+---------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+-----------------+------
a98a98d5-5710-431b-a23d-d78ece882763 | 2015-04-28 19:18:34-0400 | False | net | updated | {'prio': None, 'type_id': u'A', 'auth': None, 'is_free': False, 'ttl': 300L, 'active': True, 'domain_id': 32L, 'ordername': None, 'name': u'myrecord.mytable.net', 'created': '2015-04-14 17:44:23+00:00', 'modified': '2015-04-29 03:18:34.159619+00:00', 'id': 143L, 'content': u'192.213.216.16', 'change_date': 1430277514, 'owner_id': 20L} | Record | update_a_record | 20
893e9600-3d57-4b82-bdfd-41586023a90f | 2015-04-28 19:21:01-0400 | False | net | updated | {'prio': None, 'type_id': u'A', 'auth': None, 'is_free': False, 'ttl': 300L, 'active': True, 'domain_id': 32L, 'ordername': None, 'name': u'myrecord.mytable.net', 'created': '2015-04-14 17:44:23+00:00', 'modified': '2015-04-29 03:21:01.414393+00:00', 'id': 143L, 'content': u'192.213.15.16', 'change_date': 1430277661, 'owner_id': 20L} | Record | update_a_record | 20
f951b3ec-092a-4e9e-95c5-a6dce3363c29 | 2015-04-28 19:18:35-0400 | False | net | updated | {'prio': None, 'type_id': u'A', 'auth': None, 'is_free': False, 'ttl': 300L, 'active': True, 'domain_id': 32L, 'ordername': None, 'name': u'myrecord.mytable.net', 'created': '2015-04-14 17:44:23+00:00', 'modified': '2015-04-29 03:18:35.199869+00:00', 'id': 143L, 'content': u'192.213.15.16', 'change_date': 1430277515, 'owner_id': 20L} | Record | update_a_record | 20
db60ac52-39e9-4b46-accb-28a34b10579c | 2015-04-28 19:18:37-0400 | False | net | updated | {'prio': None, 'type_id': u'A', 'auth': None, 'is_free': False, 'ttl': 300L, 'active': True, 'domain_id': 32L, 'ordername': None, 'name': u'myrecord.mytable.net', 'created': '2015-04-14 17:44:23+00:00', 'modified': '2015-04-29 03:18:37.650135+00:00', 'id': 143L, 'content': u'192.213.15.16', 'change_date': 1430277517, 'owner_id': 20L} | Record | update_a_record | 20
336acc47-6a93-4ff9-a6c5-d29d3b2c4e35 | 2015-04-28 19:23:24-0400 | False | net | updated | {'prio': None, 'type_id': u'A', 'auth': None, 'is_free': False, 'ttl': 300L, 'active': True, 'domain_id': 32L, 'ordername': None, 'name': u'myrecord.mytable.net', 'created': '2015-04-14 17:44:23+00:00', 'modified': '2015-04-29 03:23:24.146505+00:00', 'id': 143L, 'content': u'192.213.15.16', 'change_date': 1430277804, 'owner_id': 20L} | Record | update_a_record | 20
4ca66f70-36cb-47cc-9324-6a5747d6a592 | 2015-04-28 19:18:48-0400 | False | net | updated | {'prio': None, 'type_id': u'A', 'auth': None, 'is_free': False, 'ttl': 300L, 'active': True, 'domain_id': 32L, 'ordername': None, 'name': u'myrecord.mytable.net', 'created': '2015-04-14 17:44:23+00:00', 'modified': '2015-04-29 03:18:48.242689+00:00', 'id': 143L, 'content': u'192.213.15.16', 'change_date': 1430277528, 'owner_id': 20L} | Record | update_a_record | 20
dbfda8bc-f6f2-4b97-b3c1-ccaff21338bb | 2015-04-28 19:18:32-0400 | False | net | updated | {'prio': None, 'type_id': u'A', 'auth': None, 'is_free': False, 'ttl': 300L, 'active': True, 'domain_id': 32L, 'ordername': None, 'name': u'myrecord.mytable.net', 'created': '2015-04-14 17:44:23+00:00', 'modified': '2015-04-29 03:18:32.857508+00:00', 'id': 143L, 'content': u'192.213.15.16', 'change_date': 1430277512, 'owner_id': 20L} | Record | update_a_record | 20
6c05779a-d3b8-40ac-84ee-af91a3bf6b15 | 2015-04-28 19:18:47-0400 | False | net | updated | {'prio': None, 'type_id': u'A', 'auth': None, 'is_free': False, 'ttl': 300L, 'active': True, 'domain_id': 32L, 'ordername': None, 'name': u'myrecord.mytable.net', 'created': '2015-04-14 17:44:23+00:00', 'modified': '2015-04-29 03:18:47.181657+00:00', 'id': 143L, 'content': u'192.213.216.16', 'change_date': 1430277527, 'owner_id': 20L} | Record | update_a_record | 20
a037fb9d-cb58-4994-baad-88c441429199 | 2015-04-28 19:18:31-0400 | False | net | updated | {'prio': None, 'type_id': u'A', 'auth': None, 'is_free': False, 'ttl': 300L, 'active': True, 'domain_id': 32L, 'ordername': None, 'name': u'myrecord.mytable.net', 'created': '2015-04-14 17:44:23+00:00', 'modified': '2015-04-29 03:18:31.680786+00:00', 'id': 143L, 'content': u'192.213.216.16', 'change_date': 1430277511, 'owner_id': 20L} | Record | update_a_record | 20
66ee42af-6770-4ef8-a300-764246ccc8ff | 2015-04-28 19:20:33-0400 | False | net | updated | {'prio': None, 'type_id': u'A', 'auth': None, 'is_free': False, 'ttl': 300L, 'active': True, 'domain_id': 32L, 'ordername': None, 'name': u'myrecord.mytable.net', 'created': '2015-04-14 17:44:23+00:00', 'modified': '2015-04-29 03:20:33.336544+00:00', 'id': 143L, 'content': u'192.213.15.16', 'change_date': 1430277633, 'owner_id': 20L} | Record | update_a_record | 20
What I don't understand is that it doesn't order by the created_at column in a descending order. My end goal is to store my app's logs in this table and show only a few of them in a dashboard, which is why I use LIMIT 10.
What am I doing wrong here? Regards
Upvotes: 1
Views: 397
Reputation: 57748
What I don't understand is that it doesn't order by the created_at column in a descending order.
Because Cassandra only enforces clustering order within a partition. Your partition key is id, and it looks like it has an almost unique level of cardinality. It is so close to unique that, if you partition on it, each partition holds too few rows (likely just one) to make sorting worthwhile.
SELECT * FROM mytable.log
WHERE "created_at" <= '2015-04-29 00:00:00' AND "user" = 20 LIMIT 10;
To satisfy this query, you should create a separate query table partitioned by user, such as logByUser. You'll want that table to have the same columns and the same descending clustering order on created_at, but with a PRIMARY KEY definition like this:
PRIMARY KEY (user, created_at, id)
This PRIMARY KEY definition will allow the following query to function as you expect:
SELECT * FROM mytable.logByUser
WHERE "created_at" <= '2015-04-29 00:00:00' AND "user" = 20 LIMIT 10;
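Spelled out in full, the query table described above might look roughly like the sketch below. The column list is copied from the DESCRIBE output in the question. The table name and the explicit clustering order are assumptions for illustration, not something the original schema dictates.

```cql
-- Hypothetical query table: one partition per user, rows sorted newest first.
-- An unquoted name such as logByUser is folded to lowercase (logbyuser).
CREATE TABLE mytable.logbyuser (
    user int,
    created_at timestamp,
    id uuid,
    deleted boolean,
    level text,
    message text,
    obj text,
    obj_name text,
    origin text,
    -- user is the partition key; created_at and id are clustering columns.
    -- id is included so two entries with the same timestamp do not overwrite each other.
    PRIMARY KEY (user, created_at, id)
) WITH CLUSTERING ORDER BY (created_at DESC, id ASC);
```

Without the CLUSTERING ORDER BY clause, rows within each user partition would be stored in ascending created_at order, so LIMIT 10 would return the oldest matching entries rather than the newest.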
Also, I'd like to point out two things:
Cassandra functions best when you design your data model to fit your query patterns. That may mean creating a table for each query. As crazy as this might sound, creating five or six tables to suit each of your potential queries will perform much better than adding 5 secondary indexes to one table.
Secondary indexes are meant for convenience, not performance. Their use is a known Cassandra anti-pattern. Using them on low-cardinality columns (especially booleans) is asking for trouble. They are not intended as a "magic bullet" to bridge the shortcomings of your data model.
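One consequence of the table-per-query approach is that every log entry has to be written to each query table. A minimal sketch, assuming the hypothetical logbyuser table has the same columns as mytable.log, and reusing values from the first row of the question's output (the obj column is left out for brevity):

```cql
-- A logged batch is one way to keep both tables in step; it is an assumption
-- here, and the application could instead issue two separate INSERTs.
BEGIN BATCH
    INSERT INTO mytable.log (id, created_at, deleted, level, message, obj_name, origin, user)
    VALUES (a98a98d5-5710-431b-a23d-d78ece882763, '2015-04-28 19:18:34-0400',
            false, 'net', 'updated', 'Record', 'update_a_record', 20);

    INSERT INTO mytable.logbyuser (user, created_at, id, deleted, level, message, obj_name, origin)
    VALUES (20, '2015-04-28 19:18:34-0400', a98a98d5-5710-431b-a23d-d78ece882763,
            false, 'net', 'updated', 'Record', 'update_a_record');
APPLY BATCH;
```

A logged batch that spans multiple tables adds some write overhead, so whether it is worth using depends on how strictly the two tables need to agree.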
Upvotes: 3