Reputation: 3839
I'm looking for a query that acts as $setIsSubset, except accounting for duplicate values.
For example, [1,1,2,3] is a subset of [1,2,3,4], because sets don't have duplicate values.
How can I write a query such that [1,1,2,3] is not a subset of [1,2,3,4]?
An example of expected outputs:
INPUT     | TARGET    | RESULT
[1]       | [1,2,3,4] | TRUE
[1,2,3]   | [1,2,3,4] | TRUE
[1,1,2,3] | [1,2,3,4] | FALSE
[1,2,3,4] | [1,2,3,4] | TRUE
[1,3]     | [1,2,3,4] | TRUE
[1,11,5]  | [1,2,3,4] | FALSE
[1,2,2,3] | [1,2,3,4] | FALSE
Upvotes: 2
Views: 678
Reputation: 3010
I would suggest not doing such heavy processing in a MongoDB query, since the same task is easy in any programming language. But if you still need it in MongoDB, the following query produces the expected output, provided both the input and target arrays are sorted.
db.collection.aggregate([
  {
    $project: {
      // Tag each element of the sorted input array as "<value>-<occurrence counter>".
      "modifiedInput": {
        $reduce: {
          "input": "$input",
          "initialValue": {
            "data": [],
            "postfix": 0,
            "index": 0,
            "nextElem": { $arrayElemAt: ["$input", 1] }
          },
          "in": {
            "data": {
              $concatArrays: [
                "$$value.data",
                [
                  {
                    $concat: [
                      { $toString: "$$this" },
                      "-",
                      { $toString: "$$value.postfix" }
                    ]
                  }
                ]
              ]
            },
            // Increment the counter if the next element repeats this one; otherwise reset it.
            "postfix": {
              $cond: [
                { $eq: ["$$this", "$$value.nextElem"] },
                { $sum: ["$$value.postfix", 1] },
                0
              ]
            },
            "nextElem": {
              $arrayElemAt: ["$input", { $sum: ["$$value.index", 2] }]
            },
            "index": { $sum: ["$$value.index", 1] }
          }
        }
      },
      // Same tagging, applied to the sorted target array.
      "modifiedTarget": {
        $reduce: {
          "input": "$target",
          "initialValue": {
            "data": [],
            "postfix": 0,
            "index": 0,
            "nextElem": { $arrayElemAt: ["$target", 1] }
          },
          "in": {
            "data": {
              $concatArrays: [
                "$$value.data",
                [
                  {
                    $concat: [
                      { $toString: "$$this" },
                      "-",
                      { $toString: "$$value.postfix" }
                    ]
                  }
                ]
              ]
            },
            "postfix": {
              $cond: [
                { $eq: ["$$this", "$$value.nextElem"] },
                { $sum: ["$$value.postfix", 1] },
                0
              ]
            },
            "nextElem": {
              $arrayElemAt: ["$target", { $sum: ["$$value.index", 2] }]
            },
            "index": { $sum: ["$$value.index", 1] }
          }
        }
      }
    }
  },
  {
    $project: {
      "_id": 0,
      // matched is true when no tagged input element is missing from the tagged target.
      "matched": {
        $eq: [
          { $size: { $setDifference: ["$modifiedInput.data", "$modifiedTarget.data"] } },
          0
        ]
      }
    }
  }
]).pretty()
Data set:
{ "_id" : ObjectId("5d6e005db674d5c90f46d355"), "input" : [ 1 ], "target" : [ 1, 2, 3, 4 ] }
{ "_id" : ObjectId("5d6e005db674d5c90f46d356"), "input" : [ 1, 2, 3 ], "target" : [ 1, 2, 3, 4 ] }
{ "_id" : ObjectId("5d6e005db674d5c90f46d357"), "input" : [ 1, 1, 2, 3 ], "target" : [ 1, 2, 3, 4 ] }
{ "_id" : ObjectId("5d6e005db674d5c90f46d358"), "input" : [ 1, 2, 3, 4 ], "target" : [ 1, 2, 3, 4 ] }
{ "_id" : ObjectId("5d6e005db674d5c90f46d359"), "input" : [ 1, 3 ], "target" : [ 1, 2, 3, 4 ] }
{ "_id" : ObjectId("5d6e005db674d5c90f46d35a"), "input" : [ 1, 5, 11 ], "target" : [ 1, 2, 3, 4 ] }
{ "_id" : ObjectId("5d6e005db674d5c90f46d35b"), "input" : [ 1, 2, 2, 3 ], "target" : [ 1, 2, 3, 4 ] }
Output:
{ "matched" : true }
{ "matched" : true }
{ "matched" : false }
{ "matched" : true }
{ "matched" : true }
{ "matched" : false }
{ "matched" : false }
Explanation: To keep duplicate values from being collapsed by the set operation, we append an occurrence counter to each value. For example, [1,1,1,2,3,3,4,4] would become ["1-0","1-1","1-2","2-0","3-0","3-1","4-0","4-1"]. After both the input and target arrays are converted, the set difference is calculated. It's a match if the size of the set difference is zero.
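To make the tagging step easier to follow, here is a rough plain-JavaScript sketch of the same idea outside the pipeline. The function names tagOccurrences and isSubsetWithDuplicates are made up for illustration, and like the query it assumes both arrays are already sorted.

```javascript
// Hypothetical helper: tag each element of a SORTED array as "<value>-<occurrence counter>".
function tagOccurrences(sorted) {
  let postfix = 0;
  return sorted.map((value, i) => {
    const tag = `${value}-${postfix}`;
    // Increment the counter if the next element repeats this one; otherwise reset it.
    postfix = sorted[i + 1] === value ? postfix + 1 : 0;
    return tag;
  });
}

// Hypothetical helper: true when every tagged input element also appears in the tagged target.
function isSubsetWithDuplicates(input, target) {
  const targetTags = new Set(tagOccurrences(target));
  return tagOccurrences(input).every((tag) => targetTags.has(tag));
}

console.log(isSubsetWithDuplicates([1, 2, 3], [1, 2, 3, 4]));    // true
console.log(isSubsetWithDuplicates([1, 1, 2, 3], [1, 2, 3, 4])); // false
```

The second call fails because the tag "1-1" has no counterpart in the target, which is the same reason $setDifference is non-empty in the pipeline.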
Upvotes: 2
Reputation: 49945
You can try the aggregation below:
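let input = [1,2,3];
let inputSize = input.length; // 3 for this input
db.collection.aggregate([
  {
    // $setUnion with a single array returns its distinct values.
    $project: {
      uniqueTarget: { $setUnion: [ "$target" ] }
    }
  },
  {
    // For each input element, drop the matching value from what is left of the target.
    $addFields: {
      filtered: {
        $reduce: {
          input: input,
          initialValue: "$uniqueTarget",
          in: {
            $filter: {
              input: "$$value",
              as: "current",
              cond: { $ne: [ "$$this", "$$current" ] }
            }
          }
        }
      }
    }
  },
  {
    // The result is true only if every input element removed exactly one value.
    $project: {
      result: {
        $eq: [
          { $size: "$filtered" },
          { $subtract: [ { $size: "$uniqueTarget" }, inputSize ] }
        ]
      }
    }
  }
])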
It starts with $setUnion to ensure there are no duplicates in the target array. Then you can run $reduce to iterate through input and remove the currently processed element from target. Every iteration should remove a single element, so the expected $size of filtered is equal to the $size of uniqueTarget minus inputSize.
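The size comparison may be easier to see in plain JavaScript. This is only a sketch of the same logic, not the pipeline itself, and the function name isSubsetByRemoval is made up for illustration.

```javascript
// Hypothetical helper mirroring the $setUnion / $reduce / $filter stages.
function isSubsetByRemoval(input, target) {
  // Like $setUnion with a single array: drop duplicates from the target.
  const uniqueTarget = [...new Set(target)];
  // Like $reduce + $filter: remove each input element from the remaining target values.
  const filtered = input.reduce(
    (remaining, current) => remaining.filter((value) => value !== current),
    uniqueTarget
  );
  // Every input element must have removed exactly one value.
  return filtered.length === uniqueTarget.length - input.length;
}

console.log(isSubsetByRemoval([1, 2, 3], [1, 2, 3, 4]));    // true
console.log(isSubsetByRemoval([1, 1, 2, 3], [1, 2, 3, 4])); // false
```

A repeated or unknown input element removes nothing on its turn, so filtered ends up larger than the expected size and the comparison fails.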
Upvotes: 1