Christian Gintenreiter
Christian Gintenreiter

Reputation: 168

Cloud Spanner is using a secondary index when it should not

An existing query that performed quickly using the primary key massively slowed down (10ms -> 8sec) without notice because a secondary index that has been created for another use-case is now used automatically.

The 'Explanation' of the Cloud-Spanner-Web-Query tells me that the secondary index is used. If I change ordering (just for testing purpose) or supply FORCE_INDEX the query is quick again.

I can 'fix' this problem by using FORCE_INDEX=_BASE_TABLE which is documented in the Cloud Spanner Query Syntax Documentation.

My question is: Do I really have to do this for every query to avoid such effects ?

This mixes query-definition with index-definition which is not a good thing IMHO.

Table with primary index:

CREATE TABLE change_history (
    userId INT64 NOT NULL,
    createdAtUnique INT64 NOT NULL,
    itemId STRING(512) NOT NULL,
    newValue FLOAT64 NOT NULL,
    oldValue FLOAT64 NOT NULL,
) PRIMARY KEY (userId, itemId, createdAtUnique DESC)

Secondary index:

CREATE INDEX ch_userid_createdatunique_all ON change_history (
    userId,
    createdAtUnique
) STORING (
    newValue,
    oldValue
)

Original query:

SELECT * FROM change_history WHERE                         
    userId = 2563
    AND itemId = "215414"
    AND createdAtUnique >= 15385766670000000
    AND createdAtUnique <= 15465254670000000 ORDER BY createdAtUnique

I expected the query to keep using the primary key which it was designed for.

But by adding the secondary index the query started to use this one instead of the primary key.

Upvotes: 2

Views: 2132

Answers (1)

adi
adi

Reputation: 590

The query optimizer in this case decided to pick the index because 1) it is covering and 2) avoids the sort in the original plan because the index contains createdAtUnique in ascending sort order, which is the sort order requested in the query. However, for your data distribution this turned out to be a poor choice.

In general, for queries that have been hand-tuned to get a particular plan that you know is optimal/good, it's a good idea to use force_index and join_type hints in the query to protect against the rare instance when the optimizer might pick a different plan.

Upvotes: 5

Related Questions