Reputation: 63
I was experimenting with the time series example for Cassandra described at http://planetcassandra.org/getting-started-with-time-series-data-modeling/. How can I verify that the example in figure 2 (partitioning the row by weather station and date) has created only two rows, each containing two columns?
Regards, Seenu.
Upvotes: 0
Views: 134
Reputation: 9475
You can query each partition key (a "row" in the storage sense) and see how many "columns" it contains. Note that in CQL each clustering column value shows up as its own row, and those rows share a common partition key prefix, so the example really creates what looks like four rows in CQL.
SELECT event_time, temperature FROM temperature_by_day WHERE weatherstation_id='1234ABCD' AND date='2013-04-03';
 event_time               | temperature
--------------------------+-------------
 2013-04-03 07:01:00-0400 | 72F
 2013-04-03 07:02:00-0400 | 73F
SELECT event_time, temperature FROM temperature_by_day WHERE weatherstation_id='1234ABCD' AND date='2013-04-04';
 event_time               | temperature
--------------------------+-------------
 2013-04-04 07:01:00-0400 | 73F
 2013-04-04 07:02:00-0400 | 74F
Or get the clustering columns of both partitions at once:
SELECT event_time, temperature FROM temperature_by_day WHERE weatherstation_id='1234ABCD' AND date IN ('2013-04-03', '2013-04-04');
 event_time               | temperature
--------------------------+-------------
 2013-04-03 07:01:00-0400 | 72F
 2013-04-03 07:02:00-0400 | 73F
 2013-04-04 07:01:00-0400 | 73F
 2013-04-04 07:02:00-0400 | 74F
Or just look at the contents of the entire table:
SELECT * FROM temperature_by_day;
 weatherstation_id | date       | event_time               | temperature
-------------------+------------+--------------------------+-------------
 1234ABCD          | 2013-04-04 | 2013-04-04 07:01:00-0400 | 73F
 1234ABCD          | 2013-04-04 | 2013-04-04 07:02:00-0400 | 74F
 1234ABCD          | 2013-04-03 | 2013-04-03 07:01:00-0400 | 72F
 1234ABCD          | 2013-04-03 | 2013-04-03 07:02:00-0400 | 73F
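To make the "four CQL rows, two partitions" point concrete, here is a small Python sketch that groups the rows from the SELECT * output above by partition key. It assumes the primary key is ((weatherstation_id, date), event_time), which is what the output suggests but is not stated in this thread; the function name group_by_partition is made up for illustration.

```python
from collections import defaultdict

# The four CQL rows from the SELECT * output, as
# (weatherstation_id, date, event_time, temperature) tuples.
cql_rows = [
    ("1234ABCD", "2013-04-04", "2013-04-04 07:01:00-0400", "73F"),
    ("1234ABCD", "2013-04-04", "2013-04-04 07:02:00-0400", "74F"),
    ("1234ABCD", "2013-04-03", "2013-04-03 07:01:00-0400", "72F"),
    ("1234ABCD", "2013-04-03", "2013-04-03 07:02:00-0400", "73F"),
]


def group_by_partition(rows):
    """Group CQL rows by the assumed partition key (weatherstation_id, date)."""
    partitions = defaultdict(dict)
    for station, date, event_time, temperature in rows:
        # event_time is assumed to be the clustering column, so each
        # CQL row becomes one entry inside its partition.
        partitions[(station, date)][event_time] = temperature
    return dict(partitions)


partitions = group_by_partition(cql_rows)
for key, cells in partitions.items():
    print(key, "holds", len(cells), "clustered entries")
```

With the data above this yields two partitions, each holding two entries, which matches the "two rows with two columns each" picture from the article.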
To see how the data is stored on disk, you can flush the keyspace to disk (for example with nodetool flush tkeyspace) and then run the sstable2json utility on the resulting data file. The output shows that each partition key is stored only once, and that the clustering columns are stored in sorted order within each partition.
root@c1:/var/lib/cassandra/data/tkeyspace/temperature_by_day-e1a74970912211e4aa1ea3121441a41b# sstable2json tkeyspace-temperature_by_day-ka-1-Data.db
[
{"key": "1234ABCD:2013-04-04",
"cells": [["2013-04-04 07\\:01-0400:","",1420054084914905],
["2013-04-04 07\\:01-0400:temperature","73F",1420054084914905],
["2013-04-04 07\\:02-0400:","",1420054155058044],
["2013-04-04 07\\:02-0400:temperature","74F",1420054155058044]]},
{"key": "1234ABCD:2013-04-03",
"cells": [["2013-04-03 07\\:01-0400:","",1420054017282283],
["2013-04-03 07\\:01-0400:temperature","72F",1420054017282283],
["2013-04-03 07\\:02-0400:","",1420054049403031],
["2013-04-03 07\\:02-0400:temperature","73F",1420054049403031]]}
]
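If you would rather not count by eye, the sstable2json output can be summarized with a few lines of Python. This is only a sketch: it assumes the JSON layout shown above (a list of objects with "key" and "cells", where each cell name is the clustering value, a ":" separator, and the column name, and an empty column name marks the CQL row itself). The function name summarize_sstable_json is made up for illustration, and SAMPLE is a copy of the output above.

```python
import json

# Copy of the sstable2json output above. A raw string keeps the
# JSON escape sequence \\ intact for json.loads.
SAMPLE = r'''
[
{"key": "1234ABCD:2013-04-04",
 "cells": [["2013-04-04 07\\:01-0400:","",1420054084914905],
           ["2013-04-04 07\\:01-0400:temperature","73F",1420054084914905],
           ["2013-04-04 07\\:02-0400:","",1420054155058044],
           ["2013-04-04 07\\:02-0400:temperature","74F",1420054155058044]]},
{"key": "1234ABCD:2013-04-03",
 "cells": [["2013-04-03 07\\:01-0400:","",1420054017282283],
           ["2013-04-03 07\\:01-0400:temperature","72F",1420054017282283],
           ["2013-04-03 07\\:02-0400:","",1420054049403031],
           ["2013-04-03 07\\:02-0400:temperature","73F",1420054049403031]]}
]
'''


def summarize_sstable_json(text):
    """Return {partition_key: {clustering_value: {column: value}}}."""
    summary = {}
    for partition in json.loads(text):
        rows = summary.setdefault(partition["key"], {})
        for cell in partition["cells"]:
            name, value = cell[0], cell[1]
            # Split on the last ":"; earlier colons belong to the timestamp.
            clustering, _, column = name.rpartition(":")
            row = rows.setdefault(clustering, {})
            if column:  # an empty column name is the row marker cell
                row[column] = value
    return summary


summary = summarize_sstable_json(SAMPLE)
print("partitions:", len(summary))
for key, rows in summary.items():
    print(key, "->", len(rows), "CQL rows")
```

For the dump above this reports two partitions with two CQL rows each, that is, four cells per partition once the row marker cells are counted.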
Upvotes: 1