Reputation: 305
I am cassandra newbie. I am collecting system status every 5 mins so I created this table,
create table sysportal (hostname text, logged_date text, logged_time timestamp, service text, plugin_output text, status text, PRIMARY KEY((hostname, logged_date), logged_time, service, plugin_output, status));
Sample table is,
hostname | logged_date | logged_time | service | plugin_output | status
--------------------------------------------------------------------------------
host1 | 2014-02-21 | 2014-02-21 07:25:30+0000 | disk | DISK OK | ok
host2 | 2014-02-21 | 2014-02-21 07:25:31+0000 | disk | DISK Warning | ok
host1 | 2014-02-22 | 2014-02-22 15:23:50+0000 | disk | DISK OK | ok
host2 | 2014-02-22 | 2014-02-22 15:23:50+0000 | disk | DISK Warning | ok
host1 | 2014-02-23 | 2014-02-23 15:23:50+0000 | load | LOAD OK | ok
host2 | 2014-02-23 | 2014-02-23 15:23:50+0000 | ping | PING OK | ok
How do I get all host's latest data in single query ?
Using python currently I am doing this,
select logged_date, logged_time from sysportal limit 1; => In python save in variables
select hostname from sysportal; => In python get distinct hosts
and then,
for i in hosts:
select service from sysportal where hostname=i and logged_date=va1 and logged_time=var2
Can somebody advice if I can do this with single query in cassandra ? Should I create other tables/column_families ?
Upvotes: 0
Views: 581
Reputation: 10865
Unfortunately, since your hostname is part of your partition key, you can't. Depending on your data size/load you can possibly create a new table that has "logged date" or some other column, as the partition key and has hostname as part of the clustering columns.
So your table could look something like this:
create table sysportal_by_date (
hostname text,
logged_date text,
logged_time timestamp,
service text,
plugin_output text,
status text,
PRIMARY KEY(
logged_date,
logged_time,
hostname,
service,
plugin_output,
status
)
) with clustering order by (logged_time DESC);
This would allow you to run a query that is:
select * sysportal_by_date where logged_date = <today> order by logged_time desc;
Upvotes: 2