Reputation: 359
from the Cassandra table how to get records based on date from timestamp column.Table details are
CREATE TABLE hlragent_logs.hlragent_logs_2021 (
msisdn text,
date_time timestamp,
cmd_no text,
agent_id text,
cmd_executed text,
dummy text,
id bigint,
imsi text,
mml_cmd text,
module text,
node text,
node_id text,
node_ip text,
p text,
pno text,
serial text,
vhlr_name text,
PRIMARY KEY (msisdn, date_time, cmd_no)
) WITH CLUSTERING ORDER BY (date_time ASC, cmd_no ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
CREATE INDEX indx_agentlogs_2021 ON hlragent_logs.hlragent_logs_2021 (imsi)
select * from hlragent_logs_2021 where todate(date_time)="2021-08-10" allow filtering; SyntaxException: line 1:45 no viable alternative at input '(' (select * from hlragent_logs_2021 where todate
Upvotes: 1
Views: 850
Reputation: 16313
It isn't necessary to use the native CQL TODATE()
function to work on timestamp columns. It's possible to directly work on the column with:
SELECT * FROM ... WHERE ... AND date_time < '2021-08-10';
But you cannot use the equality operator (=
) because the CQL timestamp data type is encoded as the number of milliseconds since Unix epoch (Jan 1, 1970 00:00 GMT) so you need to be precise when you're working with timestamps.
Depending on where you're running the query, the filter could be translated in the local timezone. Let me illustrate with this example table:
CREATE TABLE community.tstamptbl (
id int,
tstamp timestamp,
PRIMARY KEY (id, tstamp)
)
These 2 statements may appear similar but translate to 2 different entries:
INSERT INTO tstamptbl (id, tstamp) VALUES (5, '2021-08-09');
INSERT INTO tstamptbl (id, tstamp) VALUES (5, '2021-08-09 +0000');
The first statement creates an entry with a timestamp in my local timezone (Melbourne, Australia) while the second statement creates an entry with a timestamp in UTC (+0000
):
cqlsh:community> SELECT * FROM tstamptbl WHERE id = 5;
id | tstamp
----+---------------------------------
5 | 2021-08-08 14:00:00.000000+0000
5 | 2021-08-09 00:00:00.000000+0000
Similarly, you need to be precise when reading the data. You need to specify the timezone to remove ambiguity. Here are some examples:
SELECT * FROM tstamptbl WHERE id = 5 AND tstamp < '2021-08-09 +0000';
SELECT * FROM tstamptbl WHERE id = 5 AND tstamp < '2021-08-10 12:00+0000';
SELECT * FROM tstamptbl WHERE id = 5 AND tstamp < '2021-08-08 12:34:56+0000';
Again since timestamps are encoded in milliseconds (instead of days), there is a whole range of possible values for a given date. If I wanted to retrieve all rows for the date 2021-08-09
, I need to filter based on a range as in this example:
SELECT * FROM tstamptbl
WHERE id = 5
AND tstamp >= '2021-08-09 +0000'
AND tstamp < '2021-08-10 +0000';
Cheers!
Upvotes: 1
Reputation: 87154
You can't use the user-defined functions in the WHERE
clause (there is a Jira ticket for it, but I don't remember anyone working on it).
Upvotes: 1