Steve

Reputation: 9583

Azure CosmosDB Cassandra doesn't allow ordering

I am using Azure CosmosDB with the Cassandra API to store traces from Jaeger. I have been able to create the schema, and data is saving correctly. However, when I search the traces in Grafana, I get the following error:

ORDER BY requires creating a custom index: CosmosClusteringIndex. Please create a custom index and re-issue this query

The query being run is:

SELECT * FROM jaegertracing.service_name_index WHERE bucket in (0,1,2,3,4,6,7,8,9) AND service_name = 'my-service' AND start_time > 1718728568730810 AND start_time < 1718728668730810 ORDER BY start_time DESC;

The table creation CQL is:

CREATE TABLE IF NOT EXISTS jaegertracing.service_name_index (
    service_name      text,
    bucket            int,
    start_time        bigint, -- microseconds since epoch
    trace_id          blob,
    PRIMARY KEY ((service_name, bucket), start_time)
) WITH CLUSTERING ORDER BY (start_time DESC)
    AND compaction = {
        'compaction_window_size': '1',
        'compaction_window_unit': 'HOURS',
        'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy'
    }
    AND default_time_to_live = ${trace_ttl}
    AND speculative_retry = 'NONE'
    AND gc_grace_seconds = 0; 

Is there any way to resolve this error? I have tried adding a secondary index but that doesn't work. Perhaps changing the primary key or clustering order? I am not particularly familiar with Cassandra so if there is a workaround it would be appreciated.

Upvotes: 0

Views: 30

Answers (1)

Erick Ramirez

Reputation: 16373

As I understand it, this is a limitation of Azure Cosmos DB: its Cassandra API is not completely compatible with Apache Cassandra and is not fully compliant with CQL.

I also noted that the Jaeger Tracing community did some work to get Jaeger running with Cosmos DB as a storage backend (issue #1667), but Cosmos DB is not officially supported (issue #638).

For the record, I'm not an expert on Jaeger Tracing or Cosmos DB, so my comments above are not intended as criticism; I'm just stating the facts as I understand them.

The symptom you described was previously reported in issue #5185. The response there was that there is no workaround for this compatibility issue with Cosmos DB, and the recommendation was to use Azure's managed Apache Cassandra instance.
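To make clear what the failing ORDER BY asks the server to do: the query reads several (service_name, bucket) partitions and wants a single result ordered by start_time across all of them. If you were issuing the queries yourself (Jaeger builds this query internally, so this does not fix the Grafana error as-is), you could run one query per bucket without ORDER BY and merge the results client-side. Below is a minimal sketch of that merge in plain Python. The `Row` type, the `merge_buckets_desc` helper and the sample rows are hypothetical, and it assumes each per-bucket result already arrives in start_time DESC order (which the table's clustering order gives you on Apache Cassandra; I have not verified that Cosmos DB behaves the same way, so sort each list first if in doubt).

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Row(NamedTuple):
    """Hypothetical stand-in for one row of service_name_index."""
    bucket: int
    start_time: int  # microseconds since epoch
    trace_id: bytes


def merge_buckets_desc(per_bucket_rows: Iterable[Iterable[Row]]) -> Iterator[Row]:
    """Merge per-bucket row streams into one stream ordered by start_time DESC.

    Assumes every input stream is already sorted by start_time DESC.
    """
    return heapq.merge(*per_bucket_rows, key=lambda r: r.start_time, reverse=True)


# Made-up rows standing in for the results of one query per bucket.
bucket0 = [Row(0, 1718728668000000, b"\x01"), Row(0, 1718728600000000, b"\x02")]
bucket1 = [Row(1, 1718728650000000, b"\x03"), Row(1, 1718728570000000, b"\x04")]

merged = list(merge_buckets_desc([bucket0, bucket1]))
print([r.start_time for r in merged])
```

Since the query here comes from Jaeger's Cassandra storage code rather than from your own application, using this approach would mean changing that code, which is why the recommendation in the issue was to switch to a backend that supports the query.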

For full disclosure, I'm an Apache Cassandra committer and I work at DataStax. We also have a fully-managed Cassandra-as-a-service Astra DB that allows you to deploy a Cassandra cluster in 3 clicks and 90 seconds on a public cloud of your choice -- AWS, GCP or Azure. Cheers!

Upvotes: 1
