Mike Wang

Reputation: 41

Use Astyanax to list all rows in Cassandra column family

I have some Cassandra-related questions:

I have to store some data (about 10M rows): let's say a natural key (sortable), an update timestamp, a createDate (YYYYMMDD only), and a value field. I plan to create the following CFs:

CREATE TABLE data (
  id text,
  createdate text,
  updatedate timeuuid,
  value text,
  PRIMARY KEY (id, updatedate)
);

CREATE TABLE data_createdate (
  id text,
  createdate text,
  value text,
  PRIMARY KEY (id, createdate)
);

My usage query will be like:

I am using Astyanax. How do I do paging? Do I have to set the partitioner to an order-preserving one so that I can use token(id) in a range query to page through the rows?

Again, how do I do paging?

Upvotes: 2

Views: 1480

Answers (2)

Theo

Reputation: 132862

In general, you want to avoid anything that requires iterating over all keys in a column family. Just as in an RDBMS, you should only run queries that have proper indexes set up.

Since updatedate is part of the compound row key for the data table, you can use range queries on that column to do paging (exactly how to do paging in Cassandra is a pretty complex topic, unfortunately). This means that your first two use cases are actually the same.
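As a rough sketch of such a range query against the data table from the question: the id value and the timeuuid literal below are made-up placeholders, and the page size of 100 is an arbitrary assumption. The idea is to remember the updatedate of the last row on the previous page and ask for rows after it.

```cql
-- First page for one partition key ('some-id' is a placeholder value).
SELECT id, updatedate, value
FROM data
WHERE id = 'some-id'
LIMIT 100;

-- Next page: continue after the last updatedate seen on the previous page.
-- The timeuuid literal is a placeholder for that remembered value.
SELECT id, updatedate, value
FROM data
WHERE id = 'some-id'
  AND updatedate > 5a0e9b70-7a3c-11e2-8f0b-000000000000
LIMIT 100;
```

This only pages within a single id, because updatedate is a clustering column and the range restriction applies inside one partition.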

I'm not really sure what you mean by the third case, do you mean that you want to query rows in data with a range query on createdate -- e.g. SELECT * FROM data WHERE createdate > '20130206' AND createdate < '20130228'? I'm confused by your second table (data_createdate) and where it fits in.

If you mean what I think you mean, one solution could be to add a secondary index to the createdate column of data (CREATE INDEX data_createdate_index ON data (createdate)). You can read more about secondary indexing in the documentation.

Upvotes: 1

abhi

Reputation: 4792

If you want to achieve paging, store the last key from the most recently retrieved set, so that when you want to fetch the next page, that saved key is the entry point of your query. I suggest you go through this link: http://www.datastax.com/docs/1.2/cql_cli/using/paging.
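A minimal sketch of that idea in CQL, using the data table from the question; the key value 'last-id-from-previous-page' is a placeholder for whatever key you saved, and the page size is an assumption. The token() function works with the default random partitioner too, so an order-preserving partitioner should not be needed just for paging.

```cql
-- First page: no entry point yet.
SELECT id, updatedate, value
FROM data
LIMIT 100;

-- Next page: start after the token of the last id seen on the previous page.
SELECT id, updatedate, value
FROM data
WHERE token(id) > token('last-id-from-previous-page')
LIMIT 100;
```

Two caveats: rows come back in token order rather than in the natural sort order of id, and because data has updatedate as a clustering column, a page can end in the middle of one id. In that case the remaining rows for that id need to be read with a query on id and updatedate before moving on by token.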

Upvotes: 1
