Jaykay
Jaykay

Reputation: 23

MongoDB find lowest missing value

I have a collection in MongoDB that looks something like:

{
 "foo": "something",
 "tag": 0,
},
{
 "foo": "bar",
 "tag": 1,
},
{
 "foo": "hello",
 "tag": 0,
},
{
 "foo": "world",
 "tag": 3,
}

If we consider this example, there are entries in the collection with tag of value 0, 1 or 3 and these aren't unique values, tag value can be repeated. My goal is to find that 2 is missing. Is there a way to do this with a query?

Upvotes: 1

Views: 69

Answers (1)

Takis
Takis

Reputation: 8705

Query1

  • in the upcoming mongodb 5.2 we will have sort on arrays that could do this query easier without set operation but this will be ok also
  • group and find the min,max and all the values
  • take the range(max-min)
  • the missing are (setDifference range_above tags)
  • and from them you take only the smallest => 2

Test code here

aggregate(
[{"$group":
  {"_id":null,
   "min":{"$min":"$tag"},
   "max":{"$max":"$tag"},
   "tags":{"$addToSet":"$tag"}}},
 {"$project":
  {"_id":0,
   "missing":
   {"$min":
    {"$setDifference":
     [{"$range":[0, {"$subtract":["$max", "$min"]}]}, "$tags"]}}}}])

Query2

  • in Mongodb 5 (the current version) we can use also $setWindowFields
  • sort by tag, add the dense-rank(same values=same rank), and the min
  • then find the difference of tag-min
  • and then filter those that this difference < rank
  • and find the max of them (max of the tag that are ok)
  • increase 1 to find the one missing

*test it before using it to be sure, i tested it 3-4 times seemed ok, for big collection if you have many different tags, this is better i think. (the above addtoset can cause memory problems)

Test code here

aggregate(
[{"$setWindowFields":
  {"output":{"rank":{"$denseRank":{}}, "min":{"$first":"$tag"}},
   "sortBy":{"tag":1}}},
 {"$set":{"difference":{"$subtract":["$tag", "$min"]}}},
 {"$match":{"$expr":{"$lt":["$difference", "$rank"]}}},
 {"$group":{"_id":null, "last":{"$max":"$tag"}}},
 {"$project":{"_id":0, "missing":{"$add":["$last", 1]}}}])

Upvotes: 4

Related Questions