Reputation: 3501
Collection: appointments
Schema:
{
_id: ObjectId();
userId: string;
calType: string;
status: string;
appointment_start_date_time: string; //UTC ISO string
appointment_end_date_time: string; //UTC ISO string
}
Example:
{
_id: ObjectId('6332b21960f8083d24f3140b')
userId: "6272ccb3-4050-429c-b427-eb104f340962"
calType: "MY Personal Cal"
status: "CONFIRMED"
appointment_start_date_time: "2022-07-08T03:30:00.000Z"
appointment_end_date_time: "2022-07-08T04:00:00.000Z"
}
I want to create a compound index on userId
, calType
, status
, appointment_start_date_time
Based on Mongo Db's ESR rule I would like to determine the arrangement of my keys.
The documentation conveniently gives an example of 3 keys in compound index where the first key is for equality, second for sort and third for range. But in my case I have more than 3 keys.
I would like to know how would the index keys be arranged for a more efficient compound index. In my case userId
, calType
, status
will be used for equality based match whereas appointment_start_date_time
will be used for sorting.
Potential queries which I will be making on this collection will be:
All appointments where userId = x, calType = y, status = z sort by appointment_start_date_time ASC
All appointments where userId = x, status = z
All appointments where calType = y, status = z
All appointments where userId = x sort by appointment_start_date_time ASC or DSC
What is the standard when we have multiple keys for equality and one for sorting/range?
Upvotes: 0
Views: 440
Reputation: 28316
None of your sample queries use a ranged filter. Assuming none of these fields contain arrays, applying the ESR rule:
Queries 1 and 2 could be optimally served by an index on
{userId:1, status:1, calType:1, appointment_start_date_time:1}
Query 3 would be best server by this index:
{calType:1, status:1}
Query 4 would be best served by:
{userId:1, appointment_start_date_time:1}
In these optimal cases, the MongoDB server could seek to the first matching index key, scan to the last key in a single pass, and encounter the documents in already sorted order.
It may also be possible to get acceptable performance for queries 1,2, and 4 using the index:
{userId:1, appointment_start_date_time:1, status:1, calType:1}
Using this index, query 4 would still be optimal, but query 1 and 2 would require and additional index seek for each status/calType pair. This would be somewhat less performant than the optimal case, but would still be better than an in-memory sort.
Upvotes: 2