Reputation: 59
I am running into a strange problem using Cassandra 1.2 (DSE 3.1.1). I have a table called JSESSION and here is the structure:
cqlsh> use recommender;
cqlsh:recommender> describe table jsession;
CREATE TABLE jsession (
sessionid text,
accessdate timestamp,
atompaths set<text>,
filename text,
processed boolean,
processedtime timestamp,
userid text,
usertag bigint,
PRIMARY KEY (sessionid, accessdate)
) WITH
bloom_filter_fp_chance=0.010000 AND
caching='KEYS_ONLY' AND
comment='' AND
dclocal_read_repair_chance=0.000000 AND
gc_grace_seconds=864000 AND
read_repair_chance=0.100000 AND
replicate_on_write='true' AND
populate_io_cache_on_flush='false' AND
compaction={'class': 'SizeTieredCompactionStrategy'} AND
compression={'sstable_compression': 'SnappyCompressor'};
CREATE INDEX processed_index ON jsession (processed);
You can see that the table is indexed on the field 'processed' which is boolean. When I started coding on this table, the following query used to work fine:
cqlsh:recommender> select * from jsession where processed = false limit 100;
But now that the size is more than 100,000 (not a large number at all), the query has stopped working suddenly, and I couldn't figure out a workaround yet.
cqlsh:recommender> select count(*) from jsession limit 1000000;
count
--------
142320
cqlsh:recommender> select * from jsession where processed = false limit 100;
Request did not complete within rpc_timeout.
I tried several options, to increase the rpc_timout to 60 seconds, also to start Cassandra with more memory (it is 8GB now), but I still have the same problem. Do you have any solution for this?
The deeper question is what is the right way to model a boolean field in CQL3 so that I can search for that field and update it as well. I need to set the field 'processed' to true after I have processed that session.
Upvotes: 2
Views: 5280
Reputation: 558
You don't have a boolean modeling problem. You just need to paginate the results.
select * from jsession where processed = false and token(sessionid) > token('ABC') limit 1000;
Where 'ABC' is the last session id you read (or '' for the first query). Just keep feeding the token id back into this query until you've read everything.
Upvotes: 2