How to group documents which object array has a value less than a given value but the max one

Question

Given this kind of document:

{
    "_id" : "16",
    "name" : "SB15",
    "category_id" : 3,
    "level" : 2,
    "sizes" : [ 
        {
            "v" : 34.78
        }, 
        {
            "v" : 32.15
        }, 
        {
            "v" : 35.86
        }, 
        {
            "v" : 39.4
        }, 
        {
            "v" : 24.01
        }
    ]
}

I need that for each "category_id" value find documents whose "size.v" has the higher value less than a given number and the level is equal to a given number and limit documents to 4 for each category_id

the result should be something like this if I look for a "size.v" less than 35

{
    "_id" : 3, /* the category_id */
    "models" : [
        {
            "name": "SB15",
            "level": 2,
            "size":  {
                "v" : 34.78, /* the higher value in sizes array less than 35 */ 
                "h" : 6.4,
                "w" : 0.20,
                "t" : 0.03
            } 
        }
        ...,
        ...,
        ...,
    ] 
}

Blakes Seven · Accepted Answer

Well the general idea would be to find the difference from the target value and then sort the results:

For MongoDB 2.6 onwards, it's probably best to $map on the array before $unwind, as there is less processing overhead:

db.collection.aggregate([
    // Calculate the difference
    { "$project": {
        "name": 1,
        "category_id": 1,
        "level": 1,
        "sizes": {
            "$map": {
                "input": "$sizes",
                "as": "el",
                "in": {
                    "v": "$$el.v",
                    "diff": {
                        "$cond": [
                            { "$gt": [ 35, "$$el.v" },
                            999999,
                            { "$subtract": [ 35, "$$el.v" ] }
                        ]
                     }
                }
            }
        }
    }},

    // Unwind the array
    { "$unwind": "$sizes" },

    // Sort to smallest difference 
    { "$sort": { "_id": 1, "sizes.diff": 1 } },

    // Get the $first after sort
    { "$group": {
        "_id": "$_id",
        "name": { "$first": "$name" },
        "category_id": { "$first": "$category_id" },
        "level": { "$first": "$level" },
        "size_v": { "$first": "$sizes.v" }
    }},

    // Since you cannot have nested paths in a $group
    { "$project": {
        "name": 1,
        "category_id": 1,
        "level": 1,
        "size": {
            "v": "$size_v"
        }
    }},

    // Group up by category
    { "$group": {
        "_id": "$category_id",
        "models": { 
            "$push": {
                "name": "$name",
                "level": "$level",
                "size": "$size"
            }
        }
    }}
])

In earlier versions $unwind first:

db.collection.aggregate([
    { "$unwind": "$sizes" },
    { "$project": {
        "name": 1,
        "category_id": 1,
        "level": 1,
        "sizes": {
            "v": "$sizes.v",
            "diff": {
                "$cond": [
                    { "$gt": [ 35, "$sizes.v" ] },
                    99999,
                    { "$subtract": [ 35, "$sizes.v" ] }
                ]
            }
        }
    }},
    { "$sort": { "_id": 1, "sizes.diff": 1 } },
    { "$group": {
        "_id": "$_id",
        "name": { "$first": "$name" },
        "category_id": { "$first": "$category_id" },
        "level": { "$first": "$level" },
        "size_v": { "$first": "$sizes.v" }
    }},
    { "$project": {
        "name": 1,
        "category_id": 1,
        "level": 1,
        "size": {
            "v": "$size_v"
        }
    }},
    { "$group": {
        "_id": "$category_id",
        "models": { 
            "$push": {
                "name": "$name",
                "level": "$level",
                "size": "$size"
            }
        }
    }}
])

The main calculatation is done within the $cond, where first you look to see if the value is greater than the target, and if so then assign a "diff" that is a large value. Further away from possible intervals the better.

Where it is actually less than the target, $subtract the value from the target. The smaller the number in result, the closer it is.

But the basic case is to calculate the difference and then $sort so the smallest value is on top. Then you can pick that value using $first as you $group again.

Also note that $group cannot have nested objects as keys, so totals are first made as "size_v" and then renamed to "size.v" in a later $project stage to accomodate the transform.

How to group documents which object array has a value less than a given value but the max one

Answers (1)

Related Questions