otmezger

Reputation: 10774

MongoDB index: how to index a single object nested in an array on a document

I have the following document:

{
    'date': date,
    '_id': ObjectId,
    'Log': [
        {
            'lat': float,
            'lng': float,
            'date': float,
            'speed': float,
            'heading': float,
            'fix': float
        }
    ]
}

For a single document, the Log array can hold several hundred entries.

I need to query the first and last date element of Log on each document. I know how to query it, but I need to do it fast, so I would like to build an index for that. I don't want to index Log.date, since that index would be too big. How can I index just those two elements?

Upvotes: 0

Views: 198

Answers (2)

Neil Lunn

Reputation: 151082

There is already an answer about indexing that is fundamentally correct. As of writing, though, it seems a little unclear whether your problem is about indexing at all. It almost seems like what you want to do is get the first and last date from the elements in your array.

With that in mind there are a few approaches:

1. The elements in your array have been naturally inserted in increasing date values

So if all writes to this field are made only with the $push operator over a period of time, and you never update these items, at least not in a way that changes a date, then your items are already in order.

What this means is that you can just get the first and last element from the array:

db.collection.find({ _id: id },{ Log: {$slice: 1 }});    // gets the first element
db.collection.find({ _id: id },{ Log: {$slice: -1 }});   // gets the last element

Now of course that is two queries, but each one is a relatively simple operation and not costly.
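
If the two round trips matter, the same idea can be sketched as a single aggregation. This is only a sketch: it assumes MongoDB 3.2 or later, where the $arrayElemAt operator is available, and the names firstLastPipeline, firstLog and lastLog are illustrative rather than anything from the question.

```javascript
// Sketch only: assumes MongoDB 3.2+ ($arrayElemAt) and that Log is already
// in insertion (date) order. firstLog / lastLog are illustrative field names.
const firstLastPipeline = [
    // Optionally restrict to one document first:
    // { "$match": { "_id": id } },

    // Pick the elements at index 0 and index -1 (the last one) of Log
    { "$project": {
        "firstLog": { "$arrayElemAt": [ "$Log", 0 ] },
        "lastLog":  { "$arrayElemAt": [ "$Log", -1 ] }
    }}
];

// In the mongo shell this would be run as:
// db.collection.aggregate(firstLastPipeline)
```

As with the $slice form, this relies on the array order and does not need an index on Log.date.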

2. For some reason your elements are not naturally ordered by date

If this is the case, or indeed if you just can't live with the two-query form, then you can get the first and last values with aggregation, using the $min and $max operators:

db.collection.aggregate([

    // You might want to match first. Just doing one _id here. (commented)
    //{"$match": { "_id": id }},

    //Unwind the array
    {"$unwind": "$Log" },

    // Group per document, taking the smallest and largest date
    {"$group": { 
        "_id": "$_id",
        "firstDate": {"$min": "$Log.date" },
        "lastDate": {"$max": "$Log.date" }
    }}

])

Finally, if your use case is getting the full details of the log entries that have the first and last date, we can do that as well, somewhat mirroring the initial two-query form, using $first and $last:

db.collection.aggregate([

    // You might want to match first. Just doing one _id here. (commented)
    //{"$match": { "_id": id }},

    //Unwind the array
    {"$unwind": "$Log" },

    // Sort the results on the date
    {"$sort": { "_id": 1, "Log.date": 1 }},

    // Group using $first and $last
    {"$group": { 
        "_id": "$_id",
        "firstLog": {"$first": "$Log" },
        "lastLog": {"$last": "$Log" }
    }}

])

Your mileage may vary, but those approaches may obviate the need for an index if this would indeed be the only usage for that index.

Upvotes: 1

Maksym Strukov

Reputation: 2689

It's hard to advise without knowing how you work with the documents. One solution could be to use a sparse index. You just need to add a new field to the first and last array element of every document, let's call it shouldIndex. Then create a sparse index that includes the shouldIndex and date fields. Here's a short example:

Assume we have this document:

{"Log": 
    [{'lat': 1, 'lng': 2, 'date': new Date(), shouldIndex : true}, 
    {'lat': 3, 'lng': 4, 'date': new Date()}, 
    {'lat': 5, 'lng': 6, 'date': new Date()}, 
    {'lat': 7, 'lng': 8, 'date': new Date(), shouldIndex : true}]}

Please note that the first and last elements contain the shouldIndex field.

db.testSparseIndex.ensureIndex( { "Log.shouldIndex": 1, "Log.date": 1 }, { sparse: true } )

This index should contain entries only for your first and last elements.

Alternatively, you may store the date field of the first and last elements in a separate array.
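
One way to sketch that alternative is to keep the two dates in top-level fields that are maintained on every write, and to index those instead of the array. The helper below is an illustration under assumptions: the field names firstLogDate and lastLogDate and the function buildLogPush are hypothetical, and the $min and $max update operators it relies on require MongoDB 2.6 or later.

```javascript
// Hypothetical helper: builds an update document that appends a log entry
// and keeps two denormalised top-level fields in step with the array.
// firstLogDate / lastLogDate are illustrative names, not from the question.
function buildLogPush(entry) {
    return {
        "$push": { "Log": entry },
        // $min only overwrites when the new value is smaller (MongoDB 2.6+)
        "$min": { "firstLogDate": entry.date },
        // $max only overwrites when the new value is larger (MongoDB 2.6+)
        "$max": { "lastLogDate": entry.date }
    };
}

const update = buildLogPush({ lat: 1, lng: 2, date: 1400000000 });

// In the mongo shell this would be used roughly as:
// db.collection.update({ _id: id }, update)
// db.collection.ensureIndex({ firstLogDate: 1, lastLogDate: 1 })
```

An index on those two fields holds one entry per document, regardless of how long Log grows.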

For more info on sparse indexes please refer to this article.

Hope it helps!

Upvotes: 1
