Redth

Reputation: 5544

MongoDB Many Indexes vs. Single Index on array of Sub-Documents?

I'm wondering which technique would be more efficient for indexing the various timestamps I need to keep track of on each document. My application is fairly write-heavy, but it reads often enough that the queries are too slow without the indexes.

Is it better to have a field for each timestamp, and index each field, or store the timestamps and their associated type in an array field, and index each field of that array?

First option, separate fields, and an index for each:

{
    "_id" : "...",
    "Field1.Timestamp" : '2011-01-01 01:00.000',
    "Field2.Timestamp" : '2011-01-01 01:00.000',
    "Field3.Timestamp" : '2011-01-01 01:00.000',
    "Field4.Timestamp" : '2011-01-01 01:00.000',
    "Field5.Timestamp" : '2011-01-01 01:00.000',
    "Field6.Timestamp" : '2011-01-01 01:00.000',
    "Field7.Timestamp" : '2011-01-01 01:00.000',
    "Field8.Timestamp" : '2011-01-01 01:00.000',
    "Field9.Timestamp" : '2011-01-01 01:00.000',
}

db.mycollection.ensureIndex({ "Field1.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field2.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field3.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field4.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field5.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field6.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field7.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field8.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field9.Timestamp" : 1 });

Second option, an array of the timestamps and their types, with only a single index:

{
    "_id" : "...",
    "Timestamps" : [
        { "Type" : "Field1", "Timestamp" : '2011-01-01 01:00.000' },
        { "Type" : "Field2", "Timestamp" : '2011-01-01 01:00.000' },
        { "Type" : "Field3", "Timestamp" : '2011-01-01 01:00.000' },
        { "Type" : "Field4", "Timestamp" : '2011-01-01 01:00.000' },
        { "Type" : "Field5", "Timestamp" : '2011-01-01 01:00.000' },
        { "Type" : "Field6", "Timestamp" : '2011-01-01 01:00.000' },
        { "Type" : "Field7", "Timestamp" : '2011-01-01 01:00.000' },
        { "Type" : "Field8", "Timestamp" : '2011-01-01 01:00.000' },
        { "Type" : "Field9", "Timestamp" : '2011-01-01 01:00.000' },
    ]
}

db.mycollection.ensureIndex({ "Timestamps.Type" : 1, "Timestamps.Timestamp" : 1 });

Am I way off the mark here? If not, which would be the better way?

Upvotes: 3

Views: 1203

Answers (1)

Remon van Vliet

Reputation: 18595

This basically boils down to whether 10 indexes of size N are more efficient than one index of size N * 10. If you look purely at reads, the separate indexes should always be faster, since the associated b-tree walks will examine a smaller keyset.
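
To put rough numbers on the "smaller keyset" point, here is a minimal sketch of how tree height grows with the number of keys. It assumes an idealized b-tree in which every node holds `fanout` keys; the `estimatedHeight` helper and the fanout of 100 are illustrative assumptions, not MongoDB internals.

```javascript
// Levels needed for an idealized b-tree to hold `keys` entries when every
// node holds `fanout` keys. Both the model and the numbers are assumptions.
function estimatedHeight(keys, fanout) {
  let height = 0;
  let capacity = 1;
  while (capacity < keys) {
    capacity *= fanout; // each extra level multiplies capacity by the fanout
    height += 1;
  }
  return height;
}

const assumedFanout = 100;               // hypothetical, for illustration only
const perFieldIndexKeys = 1000000;       // N: one key per document
const arrayIndexKeyCount = 10 * perFieldIndexKeys; // N * 10: one key per array element

console.log(estimatedHeight(perFieldIndexKeys, assumedFanout));
console.log(estimatedHeight(arrayIndexKeyCount, assumedFanout));
```

Under these assumptions the index that is ten times larger is only one level deeper (4 levels instead of 3), because height grows with the logarithm of the key count.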

There are a couple of points to consider, though:

  • Indexes on array fields basically index each array element separately. As such, the lookup overhead will be at most 1-2 additional steps during the b-tree walk, which is a negligible performance hit. In other words, they'll be almost as fast.
  • Having 10 indexes may mean each update/insert will require more than one index to be updated (depending on whether your indexes share a field or whether you update more than one timestamp at a time). This is a significant performance consideration.
  • Using an array index makes it a bit easier to add additional timestamps (e.g. Timestamp10).
  • There is a limit to the number of namespaces you can use per database (24k), and each index takes up one. If you make a separate index per field, this might become an issue.
  • Most importantly, the array index is way more straightforward and will simplify your code and thus improve maintainability. Given the limited performance differences, I'd say this is the strongest motivation to go for an array index here.
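
The points above can be sketched in plain JavaScript, without a running server. This is only a model under stated assumptions: `arrayIndexKeys`, `indexesTouched` and `timestampFilter` are hypothetical helper names, and the sample document reuses the placeholder strings from the question.

```javascript
// Sample document in the array layout from the question (placeholder values).
const arrayDoc = {
  _id: "doc1",
  Timestamps: [
    { Type: "Field1", Timestamp: "2011-01-01 01:00.000" },
    { Type: "Field2", Timestamp: "2011-01-02 01:00.000" },
    { Type: "Field3", Timestamp: "2011-01-03 01:00.000" },
  ],
};

// Hypothetical model of the keys an index on
// { "Timestamps.Type": 1, "Timestamps.Timestamp": 1 } holds for one document:
// one (Type, Timestamp) pair per array element.
function arrayIndexKeys(doc) {
  return doc.Timestamps.map((t) => [t.Type, t.Timestamp]);
}

// Hypothetical model of how many distinct indexes change when the listed
// timestamp types are updated in one write, under each layout.
function indexesTouched(changedTypes) {
  const distinct = new Set(changedTypes).size;
  return {
    perFieldLayout: distinct,           // one index per changed field
    arrayLayout: distinct > 0 ? 1 : 0,  // the single array index (several keys may change in it)
  };
}

// Builds a filter for "documents whose <type> timestamp lies in [from, to)".
// $elemMatch keeps both conditions on the same array element.
function timestampFilter(type, from, to) {
  return {
    Timestamps: {
      $elemMatch: { Type: type, Timestamp: { $gte: from, $lt: to } },
    },
  };
}

console.log(arrayIndexKeys(arrayDoc));
console.log(indexesTouched(["Field1", "Field4"]));
console.log(JSON.stringify(timestampFilter("Field2", "2011-01-01", "2011-02-01")));
```

In the mongo shell, the object returned by `timestampFilter` could be passed to `db.mycollection.find(...)`. Supporting a new timestamp type then only means passing a different `type` value, with no new index and no new query code.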

Upvotes: 2
