Akhilesh
Akhilesh

Reputation: 55

Filter and aggregation on nested data mongodb

{
    "_id" : ObjectId("5eb1116ed99e7c68395d479c"),
    "sheets" : [
        {
            "Sheet1" : [
                {
                    "headers" : [
                        "name,mobile"
                    ],
                    "data_count" : 10
                }
            ]
        }
    ],

    "name" : "2.csv"
},

{
    "_id" : ObjectId("5eb1116ed99e7c68395d480c"),
    "sheets" : [
        {
            "Sheet1" : [
                {
                    "headers" : [
                        "name,mobile,mobile1,mobile2,email1,email2"
                    ],
                    "data_count" : 30
                }
            ],
                        "Sheet2" : [
                {
                    "headers" : [
                        "name,mobile,mobile1"
                    ],
                    "data_count" : 20
                }
            ]
        }
    ],
    "name" : "1.csv"
}

I have multiple document in a collection so I will pass id and sheet name e.g. Sheet2 and I want to get the sum of data count for all the input id's and sheets

so suppose in above case my input is [{file_id:"5eb1116ed99e7c68395d479c", sheet_name:"Sheet1"}, {file_id:"5eb1116ed99e7c68395d480c", sheet_name:"Sheet2"}]

so output should be data_count is 30

Upvotes: 0

Views: 32

Answers (1)

Fayad
Fayad

Reputation: 120

You can use the mongodb aggregation pipeline to achieve this. I am not sure why you have your sheets fields as an array if it contains only one object that contains all the sheets. I would suggest something as follows

Sample Doc

{
    _id    : ObjectId('5eb1116ed99e7c68395d480c'),
    sheets : [
        {
            name : 'Sheet 1',
            data : {
                headers    : [ 'name,mobile,mobile1,mobile2,email1,email2' ],
                data_count : 30
            }
        },
        {
            name : 'Sheet 2',
            data : {
                headers    : [ 'name,mobile,mobile1' ],
                data_count : 20
            }
        }
    ]
}

Regardless of that, you can get the intended result you want with the Schema you have by applying multiple $match and $unwind on the documents until you get the data separated into an _id field and a sheets field that contains only one value. Then you can easily match the documents you want by using a $match with an $or.

Query

db.collection.aggregate([
        {
            $match : {
                _id : {
                    $in : [ ObjectId('5eb1116ed99e7c68395d479c'), ObjectId('5eb1116ed99e7c68395d480c') ]
                }
            }
        },
        {
            $unwind : '$sheets'
        },
        {
            $project : {
                _id    : 1,
                sheets : {
                    $objectToArray : '$sheets'
                }
            }
        },
        {
            $unwind : '$sheets'
        },
        {
            $unwind : '$sheets.v'
        },
        {
            $match : {
                $or : [
                    {
                        _id        : ObjectId('5eb1116ed99e7c68395d479c'),
                        'sheets.k' : 'Sheet1'
                    },
                    {
                        _id        : ObjectId('5eb1116ed99e7c68395d480c'),
                        'sheets.k' : 'Sheet2'
                    }
                ]
            }
        },
        {
            $group : {
                _id        : null,
                data_count : {
                    $sum : '$sheets.v.data_count'
                }
            }
        }
    ]);

Test : Test aggregation pipeline here : mongoplayground

Output

[
  {
    "_id": null,
    "data_count": 30
  }
]

Upvotes: 1

Related Questions