crunk1
crunk1

Reputation: 2818

MongoDB multikey index on array of embedded document - sort/range comparisons and min/maxkey

Simplified example and question

In effort to be as simple as possible, I'll just jump into an example:

Collection foo with multikey index {searchTags: 1}:

{_id: 1, searchTags: [{bar: "BAR"}]}
{_id: 2, searchTags: [{baz: "BAZ"}]}

I'm trying to get all embedded documents with the baz field key (without using $exists - I can try to explain why later). Why does

{searchTags: {$elemMatch: { $gte: { baz: MinKey() }, $lte: { baz: MaxKey() }}}}

return BOTH documents (not preferred), but

{searchTags: {$elemMatch: { $gte: { baz: "" }, $lte: { baz: MaxKey() }}}}

only returns {_id: 2, searchTags: [{baz: "BAZ"}]} (preferred)?

Sidenote: quick schema details to maybe avoid the "what are you doing?" questions

Upvotes: 0

Views: 424

Answers (1)

Joe
Joe

Reputation: 28316

  • The index on {searchTags: 1} is not being used to service that query

When indexing an array, each element of the array is included as a value in the index. Any element that happens to be a document is indexes as an entire document, it is not broken down by field. I expect if you were to run that command using explain, it would show it is a collection scan.

  • MinKey is less than any other possible key value, including null, missing, and undefined. Likewise for MaxKey

A quick demonstration:

> db.collection.insertMany([
 {_id: 1, searchTags: [{bar: "BAR"}]},
 {_id: 2, searchTags: [{baz: "BAZ"}]} 
])
{ "acknowledged" : true, "insertedIds" : [ 1, 2 ] }

> db.collection.aggregate([
  {$unwind: "$searchTags"},
  {$addFields: {
      baztype: {$type: "$searchTags.baz"},
      bazmin: {$gt: ["$searchTags.baz", MinKey()]},
      bazmax: {$lt: ["$searchTages.baz", MaxKey()]},
      bazstr: {$gt: ["$searchTags.baz", ""]}
   }}
])
{ "_id" : 1, "searchTags" : { "bar" : "BAR" }, "baztype" : "missing", "bazmin" : true, "bazmax" : true, "bazstr" : false }
{ "_id" : 2, "searchTags" : { "baz" : "BAZ" }, "baztype" : "string", "bazmin" : true, "bazmax" : true, "bazstr" : true }

The first query you show matches because every possible value, including undefined, is both greater than MinKey and less than MaxKey.

The second query you show doesn't match because the operators are case-sensitive, so when a string value is provided, only string values will match, and this does not include undefined.

  • To find documents that contain a field, regardless of type, use the $exists operator:
db.collection.find({"searchTags.baz":{$exists:true}})

Upvotes: 0

Related Questions