Cassandra query 2nd index with pagination become slower when data grow

Question

When I query secondary index with pagination, query becomes slower when data grows.
I thought with pagination, no matter how large your data grow, it takes same time to query one page. Is that true? Why my query get slower?

My simplified table is

CREATE TABLE closed_executions (
  domain_id            uuid,
  workflow_id          text,
  start_time           timestamp,
  workflow_type_name   text,
  PRIMARY KEY  ((domain_id), start_time)
) WITH CLUSTERING ORDER BY (start_time DESC)
  AND COMPACTION = {
    'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'
  }
  AND GC_GRACE_SECONDS = 172800;

And I create a secondary index as

CREATE INDEX closed_by_type ON closed_executions (workflow_type_name);

I query with following CQL

SELECT workflow_id, start_time, workflow_type_name 
FROM closed_executions 
WHERE domain_id = ? 
AND start_time >= ? 
AND start_time <= ? 
AND workflow_type_name = ?

and code

query := v.session.Query(templateGetClosedWorkflowExecutionsByType,
        request.DomainUUID,
        common.UnixNanoToCQLTimestamp(request.EarliestStartTime),
        common.UnixNanoToCQLTimestamp(request.LatestStartTime),
        request.WorkflowTypeName).Consistency(gocql.One)
iter := query.PageSize(request.PageSize).PageState(request.NextPageToken).Iter()
// PageSize is 10, but could be thousand

Environement:

MacBook Pro
Cassandra: 3.11.0
GoCql: github.com/gocql/gocql master

Observation:
10K rows, within second
100K rows, ~3 second
1M rows, ~17 second

Debug log:

INFO  [ScheduledTasks:1] 2018-09-11 16:29:48,349 NoSpamLogger.java:91 - Some operations were slow, details available at debug level (debug.log)
DEBUG [ScheduledTasks:1] 2018-09-11 16:29:48,357 MonitoringTask.java:173 - 1 operations were slow in the last 5005 msecs:

Cassandra query 2nd index with pagination become slower when data grow

Answers (1)

Related Questions