Anupam Bagchi

Reputation: 59

How to model boolean flags in Cassandra

I am running into a strange problem using Cassandra 1.2 (DSE 3.1.1). I have a table called JSESSION and here is the structure:

cqlsh> use recommender;
cqlsh:recommender> describe table jsession;

CREATE TABLE jsession (
  sessionid text,
  accessdate timestamp,
  atompaths set<text>,
  filename text,
  processed boolean,
  processedtime timestamp,
  userid text,
  usertag bigint,
  PRIMARY KEY (sessionid, accessdate)
) WITH
  bloom_filter_fp_chance=0.010000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=0.100000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};

CREATE INDEX processed_index ON jsession (processed);

You can see that the table is indexed on the field 'processed', which is a boolean. When I started coding against this table, the following query worked fine:

cqlsh:recommender> select * from jsession where processed = false limit 100;

But now that the table holds more than 100,000 rows (not a large number at all), the query has suddenly stopped working, and I haven't found a workaround yet.

cqlsh:recommender> select count(*) from jsession limit 1000000;

 count
--------
 142320

cqlsh:recommender> select * from jsession where processed = false limit 100;
Request did not complete within rpc_timeout.

I tried several options: increasing rpc_timeout to 60 seconds, and starting Cassandra with more memory (it is 8GB now), but I still have the same problem. Do you have any solution for this?

The deeper question is: what is the right way to model a boolean field in CQL3 so that I can both search on that field and update it? I need to set the field 'processed' to true after I have processed a session.

Upvotes: 2

Views: 5280

Answers (1)

Andrew Weaver

Reputation: 558

You don't have a boolean modeling problem. You just need to paginate the results.

select * from jsession where processed = false and token(sessionid) > token('ABC') limit 1000;

Here 'ABC' is the last sessionid you read (or '' for the first query). Keep feeding the last sessionid you read back into this query until it returns no more rows.
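
As a rough sketch of that loop from client code (not tied to any particular driver): run_query below is a hypothetical callable that stands in for whatever your driver uses to execute a statement with bound parameters and return the rows as a list of dicts. The %s placeholder style and the dict-shaped rows are assumptions as well.

```python
# Hypothetical paging loop. run_query(query, params) is an assumed helper,
# not a real driver API; it should return a list of dict-like rows.
PAGE_QUERY = (
    "SELECT * FROM jsession "
    "WHERE processed = false AND token(sessionid) > token(%s) "
    "LIMIT %s"
)


def iter_unprocessed(run_query, page_size=1000):
    """Yield unprocessed rows page by page, resuming after the last sessionid."""
    last_id = ""  # '' for the first query, as described above
    while True:
        rows = run_query(PAGE_QUERY, (last_id, page_size))
        if not rows:
            return  # an empty page means everything has been read
        for row in rows:
            yield row
        # Feed the last sessionid seen back into the next query.
        last_id = rows[-1]["sessionid"]
```

One caveat with this sketch: since the primary key is (sessionid, accessdate), a page that ends partway through a session's rows would skip the rest of that session on the next query, so you may want to handle that case if sessions can have many rows.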

See also http://www.datastax.com/documentation/cql/3.1/webhelp/index.html#cql/cql_reference/../cql_using/paging_c.html

Upvotes: 2
