Reputation: 317
Here is my Table:
CREATE TABLE mytable
(
id uuid,
day text,
mytime timestamp,
value text,
status int,
PRIMARY KEY ((id, day), mytime )
)
WITH CLUSTERING ORDER BY (mytime desc)
;
Here is the Index:
CREATE INDEX IF NOT EXISTS idx_status ON mytable (status);
When I run this select statement, I get the expected results:
select * from mytable
where id = 38403e1e-44b0-11e4-bd3d-005056a93afd
AND day = '2014-10-29'
;
62 rows are returned as a result of this query.
If I add to this query to include the index column:
select * from mytable
where id = 38403e1e-44b0-11e4-bd3d-005056a93afd
AND day = '2014-10-29'
AND status = 5
;
zero rows are returned. (there are several records with status = 5)
If I query the table...looking ONLY for a specific index value:
select * from mytable
where status = 5
;
zero rows are also returned.
I'm at a loss. I don't understand what exactly is taking place.
I am on a 3 node cluster, replication level 3. Cassandra 2.1.3
Could this be a configuration issue at all..in cassandra.yaml ?
Or...is there an issue with my select statement?
Appreciate the assistance, thanks.
UPDATE:
I am seeing this in the system.log file, ideas? ...
ERROR [CompactionExecutor:1266] 2015-03-24 15:20:26,596 CassandraDaemon.java:167 - Exception in thread Thread[CompactionExecutor:1266,1,main]
java.lang.AssertionError: /cdata/cassandra/data/my_table-c5f756b5318532afb494483fa1828675/my_table.idx_status-ka-32-Data.db
at org.apache.cassandra.io.sstable.SSTableReader.getApproximateKeyCount(SSTableReader.java:235) ~[apache-cassandra-2.1.3.jar:2.1.3]
at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:153) ~[apache-cassandra-2.1.3.jar:2.1.3]
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-2.1.3.jar:2.1.3]
at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:76) ~[apache-cassandra-2.1.3.jar:2.1.3]
at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59) ~[apache-cassandra-2.1.3.jar:2.1.3]
at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:240) ~[apache-cassandra-2.1.3.jar:2.1.3]
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) ~[na:1.7.0_51]
at java.util.concurrent.FutureTask.run(Unknown Source) ~[na:1.7.0_51]
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [na:1.7.0_51]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [na:1.7.0_51]
at java.lang.Thread.run(Unknown Source) [na:1.7.0_51]
Upvotes: 1
Views: 2163
Reputation: 57748
I ran your steps above and was able to query rows by status=5
just fine. One thing I can suggest, is to try rebuilding your index. Try this from a command prompt:
nodetool rebuild_index mykeyspace mytable idx_status
Otherwise, IMO the best way to solve this, is not with a secondary index. If you know that you're going to have to support a query (especially with a large dataset) by status
, then I would seriously consider building a specific, additional "query table" for it.
CREATE TABLE mytablebystatus (id uuid, day text, mytime timestamp, value text, status int,
PRIMARY KEY ((status),day,mytime,id));
This would support queries only by status
, or status
and day
sorted by mytime
. In summary, I would experiment with a few different PRIMARY KEY definitions, and see which better-suits your query patterns. That way, you can avoid having to use ill-performing secondary indexes all together.
Upvotes: 1