mongoDB elemMatch ignores other match conditions in the aggregate pipeline

Question

I have a flexible mongodb collection for capturing dynamic workflow field and values optimized for searching with different filters

The model is

{
  "team_id" : String,
  "template_id" : String,
  "meta" : [
    {
        "k" : String,
        "v" : String/Number
    }
  ]
}

There are millions of records distributed across various team_id & template_id for each workflow record. A sample record set would be like

[
  {
    "team_id": "wasp",
    "template_id": "order_booking",
    "document_id": "d1",
    "meta": [
      {
        "k": "assignee",
        "v": "u1"
      },
      {
        "k": "owner",
        "v": "u1"
      },
      {
        "k": "total_amount",
        "v": 1000
      },
      {
        "k": "pincode",
        "v": "560001"
      },
      {
        "k": "state",
        "v": "KA"
      },
    ]
  },
  {
    "team_id": "bee",
    "template_id": "order_booking",
    "document_id": "d900",
    "meta": [
      {
        "k": "assignee",
        "v": "a1"
      },
      {
        "k": "owner",
        "v": "a1"
      },
      {
        "k": "total_amount",
        "v": 19999
      },
      {
        "k": "pincode",
        "v": "560001"
      },
      {
        "k": "state",
        "v": "KA"
      },
    ]
  }
]

I am looking at querying the data from this collection with multiple filters like Get all records where team is wasp, template is order_booking assignee is u1 with pincode as 10011

The above query translates to

db.getCollection("xyx").aggregate(
    [
        {
            "$match": {
                "team_id": "wasp",
                "template_id": "order_booking",
                "meta": {
                    "$all": [
                        {
                            "$elemMatch": {
                                "k": "assignee",
                                "v": "u1"
                            }
                        },
                        {
                            "$elemMatch": {
                                "k": "pincode",
                                "v": "560001"
                            }
                        }
                    ]

                }
            }
        },
    ]
)

The issue with the using the elemMatch in the aggregate expression is it scans all the documents in the collection irrespective the match criteria given initially for the team_id and template_id

How can I ensure that the initial match criteria is honoured and elemMatch criteria looks only within that subset?

Alex Blex · Accepted Answer

To answer the question literally - you can do 2 matches in a row:

db.getCollection("xyx").aggregate(
    [
        {
            "$match": {
                "team_id": "wasp",
                "template_id": "order_booking"
        }, {
            "$match": {
                "meta": {
                    "$all": [
                        {
                            "$elemMatch": {
                                "k": "assignee",
                                "v": "u1"
                            }
                        },
                        {
                            "$elemMatch": {
                                "k": "pincode",
                                "v": "560001"
                            }
                        }
                    ]

                }
            }
        },
    ]
)

The second $match will test only documents that passed through the first one.

But, if you are optimising performance - add a compound index instead:

{
            "team_id": 1,
            "template_id": 1
}

It will use in-memory b-tree to filter documents by team_id and template_id and then fetch only matching documents to evaluate "meta".

mongoDB elemMatch ignores other match conditions in the aggregate pipeline

Answers (1)

Related Questions