Mongo query for an array is a subarray

Question

I'm looking for a query that behaves like $setIsSubset but accounts for duplicate values.

For example, $setIsSubset treats [1,1,2,3] as a subset of [1,2,3,4], because sets don't have duplicate values.

How can I write a query such that [1,1,2,3] is not a subset of [1,2,3,4]?

An example of expected outputs:

INPUT      | TARGET    | RESULT
[1]        | [1,2,3,4] | TRUE
[1,2,3]    | [1,2,3,4] | TRUE
[1,1,2,3]  | [1,2,3,4] | FALSE
[1,2,3,4]  | [1,2,3,4] | TRUE
[1,3]      | [1,2,3,4] | TRUE
[1,11,5]   | [1,2,3,4] | FALSE
[1,2,2,3]  | [1,2,3,4] | FALSE

Himanshu Sharma · Accepted Answer

I would suggest not doing such heavy processing in a Mongo query, since the same task is easy in any programming language. But if you still need it in Mongo, the aggregation pipeline below produces the expected output, provided both the input and target arrays are sorted.
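As a rough illustration of the application-side alternative, here is a minimal JavaScript sketch that counts occurrences instead of using sets. The function name isSubMultiset is hypothetical and not part of any MongoDB API; it assumes the array elements are primitives such as numbers or strings. The aggregation pipeline follows after it.

```javascript
// Hypothetical helper (not a MongoDB API): returns true when every value in
// `input` occurs in `target` at least as many times as it occurs in `input`.
// Assumes elements are primitives (numbers, strings), which Map compares by value.
function isSubMultiset(input, target) {
  // Count how many copies of each value the target has available.
  const remaining = new Map();
  for (const v of target) {
    remaining.set(v, (remaining.get(v) || 0) + 1);
  }
  // Consume one copy per input element; fail as soon as none is left.
  for (const v of input) {
    const left = remaining.get(v) || 0;
    if (left === 0) return false;
    remaining.set(v, left - 1);
  }
  return true;
}

console.log(isSubMultiset([1, 2, 3], [1, 2, 3, 4]));
console.log(isSubMultiset([1, 1, 2, 3], [1, 2, 3, 4]));
```

Unlike the pipeline, this counting approach does not need either array to be sorted.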

db.collection.aggregate([
    {
        $project:{
            // Tag each element of "input" as "<value>-<n>", where n counts the equal elements just before it.
            "modifiedInput":{
                $reduce:{
                    "input":"$input",
                    "initialValue":{
                        "data":[],
                        "postfix":0,
                        "index":0,
                        "nextElem":{
                            $arrayElemAt:["$input",1]
                        }
                    },
                    "in":{
                        "data":{
                            $concatArrays:[
                                "$$value.data",
                                [
                                    {
                                        $concat:[
                                            {
                                                $toString:"$$this"
                                            },
                                            "-",
                                            {
                                                $toString:"$$value.postfix"
                                            }
                                        ]
                                    }
                                ]
                            ]
                        },
                        "postfix":{
                            $cond:[
                                {
                                    $eq:["$$this","$$value.nextElem"]
                                },
                                {
                                    $sum:["$$value.postfix",1]
                                },
                                0
                            ]
                        },
                        "nextElem": {
                            $arrayElemAt:["$input", { $sum : [ "$$value.index", 2] }]
                        },
                        "index":{
                            $sum:["$$value.index",1]
                        }
                    }
                }
            },
            // Apply the same tagging to "target".
            "modifiedTarget":{
                $reduce:{
                    "input":"$target",
                    "initialValue":{
                        "data":[],
                        "postfix":0,
                        "index":0,
                        "nextElem":{
                            $arrayElemAt:["$target",1]
                        }
                    },
                    "in":{
                        "data":{
                            $concatArrays:[
                                "$$value.data",
                                [
                                    {
                                        $concat:[
                                            {
                                                $toString:"$$this"
                                            },
                                            "-",
                                            {
                                                $toString:"$$value.postfix"
                                            }
                                        ]
                                    }
                                ]
                            ]
                        },
                        "postfix":{
                            $cond:[
                                {
                                    $eq:["$$this","$$value.nextElem"]
                                },
                                {
                                    $sum:["$$value.postfix",1]
                                },
                                0
                            ]
                        },
                        "nextElem": {
                            $arrayElemAt:["$target", { $sum : [ "$$value.index", 2] }]
                        },
                        "index":{
                            $sum:["$$value.index",1]
                        }
                    }
                }
            }
        }
    },
    {
        $project:{
            "_id":0,
            // True when every tagged input element also appears among the tagged target elements.
            "matched":{
                $eq:[
                    {
                        $size:{
                            $setDifference:["$modifiedInput.data","$modifiedTarget.data"]
                        }
                    },
                    0
                ]
            }
        }
    }
]).pretty()

Data set:

{
    "_id" : ObjectId("5d6e005db674d5c90f46d355"),
    "input" : [
        1
    ],
    "target" : [
        1,
        2,
        3,
        4
    ]
}
{
    "_id" : ObjectId("5d6e005db674d5c90f46d356"),
    "input" : [
        1,
        2,
        3
    ],
    "target" : [
        1,
        2,
        3,
        4
    ]
}
{
    "_id" : ObjectId("5d6e005db674d5c90f46d357"),
    "input" : [
        1,
        1,
        2,
        3
    ],
    "target" : [
        1,
        2,
        3,
        4
    ]
}
{
    "_id" : ObjectId("5d6e005db674d5c90f46d358"),
    "input" : [
        1,
        2,
        3,
        4
    ],
    "target" : [
        1,
        2,
        3,
        4
    ]
}
{
    "_id" : ObjectId("5d6e005db674d5c90f46d359"),
    "input" : [
        1,
        3
    ],
    "target" : [
        1,
        2,
        3,
        4
    ]
}
{
    "_id" : ObjectId("5d6e005db674d5c90f46d35a"),
    "input" : [
        1,
        5,
        11
    ],
    "target" : [
        1,
        2,
        3,
        4
    ]
}
{
    "_id" : ObjectId("5d6e005db674d5c90f46d35b"),
    "input" : [
        1,
        2,
        2,
        3
    ],
    "target" : [
        1,
        2,
        3,
        4
    ]
}

Output:

{ "matched" : true }
{ "matched" : true }
{ "matched" : false }
{ "matched" : true }
{ "matched" : true }
{ "matched" : false }
{ "matched" : false }

Explanation: To keep duplicate values from being collapsed by the set operation, we add a postfix counter to each element. For example, [1,1,1,2,3,3,4,4] would become ["1-0","1-1","1-2","2-0","3-0","3-1","4-0","4-1"]. This is why both arrays must be sorted: the counter only increments across adjacent equal values. After both the input and target arrays are converted, the set difference is calculated. It's a match if the size of the set difference is zero.
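The tagging idea can also be sketched in plain JavaScript, which may make the pipeline easier to follow. The names tagDuplicates and isSubarrayWithDuplicates are hypothetical, and the sketch assumes both arrays are already sorted, as the pipeline does. It looks back at the previous element rather than ahead at the next one, but it produces the same tags.

```javascript
// Hypothetical helper: tag each element of a sorted array with the number of
// equal elements immediately before it, e.g. [1, 1, 2] -> ["1-0", "1-1", "2-0"].
function tagDuplicates(sorted) {
  let run = 0;
  return sorted.map((v, i) => {
    // Extend the run when the previous element is equal, otherwise restart at 0.
    run = i > 0 && sorted[i - 1] === v ? run + 1 : 0;
    return `${v}-${run}`;
  });
}

// Hypothetical helper: the input matches when none of its tags is missing
// from the target's tags (an empty set difference).
function isSubarrayWithDuplicates(input, target) {
  const targetTags = new Set(tagDuplicates(target));
  return tagDuplicates(input).every((tag) => targetTags.has(tag));
}

console.log(tagDuplicates([1, 1, 1, 2, 3, 3, 4, 4]));
console.log(isSubarrayWithDuplicates([1, 1, 2, 3], [1, 2, 3, 4]));
```

Because the second copy of a value gets the tag "<value>-1", an input with two 1s only matches a target that also has at least two 1s.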
