Tracy Harrison
Tracy Harrison

Reputation: 91

How do I calculate the sum of values after using an unwind to create separate documents?

I am working to develop a pipeline in MongoDB that adds up all of the number_of_employees in each state from companies, and it prints out the information for each state.

The basics of the collection structure is demonstrated below:

"name" : "AdventNet",
"number_of_employees" : 600,
"offices" : [
       {
           "description" : "Headquarters",
           "address1" : "4900 Hopyard Rd.",
           "address2" : "Suite 310",
           "zip_code" : "94588",
           "city" : "Pleasanton",
           "state_code" : "CA",
           "country_code" : "USA",
           "latitude" : 37.692934,
           "longitude" : -121.904945
        },
        {
            "description" : "",
            "address1" : "270 Lafayette Street",
            "address2" : "Suite 505",
            "zip_code" : "10012",
            "city" : "New York",
            "state_code" : "NY",
            "country_code" : "USA",
            "latitude" : 40.7237306,
            "longitude" : -73.9964312
        }
],

I have tried a couple of different methods to get my most recent pipeline code, but I am still struggling with getting the summation to work correctly. I tried different grouping structures, but I kept getting an error.

db.research.aggregate( [
    {$match : {"offices.country_code" : "USA"} }, 
    {$project: {State : "$offices.state_code"}}, 
    {$unwind: "$State"}, 
    { $group : {"_id" : "$State", "total_employees": {$sum : "$number_of_
employees"} }}, 
    {$project: {_id : 0, State: "$_id", total_employees: 1}}
])

The expected results:

{ "total_employees" : 1500, "State" : "SD" }
{ "total_employees" : 350, "State" : "WV" }
...

The actual results:

{ "total_employees" : 0, "State" : "SD" }
{ "total_employees" : 0, "State" : "WV" }
...

Upvotes: 3

Views: 189

Answers (2)

Dĵ ΝιΓΞΗΛψΚ
Dĵ ΝιΓΞΗΛψΚ

Reputation: 5679

i believe this is the pipeline you're looking for:

db.research.aggregate([
    {
        "$match": {
            "offices.country_code": "USA"
        }
    },
    {
        "$unwind": "$offices"
    },
    {
        "$project": {
            "number_of_employees": "$number_of_employees",
            "state_code": "$offices.state_code",
            "_id": 0
        }
    },
    {
        "$group": {
            "_id": "$state_code",
            "total_employees": {
                "$sum": "$number_of_employees"
            }
        }
    },
    {
        "$project": {
            "Total_Employees": "$total_employees",
            "State": "$_id",
            "_id": 0
        }
    }
])

you can see it working here: https://mongoplayground.net/p/XUA_e9Y4Fq_

also in case anybody's interested, here's the c# code that generated the above query:

using MongoDB.Entities;
using System.Linq;

namespace StackOverflow
{
    public class Program
    {
        public class company : Entity
        {
            public int number_of_employees { get; set; }
            public office[] offices { get; set; }
        }


        public class office
        {
            public string state_code { get; set; }
            public string country_code { get; set; }
        }

        private static void Main(string[] args)
        {
            new DB("test");

            (new[]
            {
                new company{
                    number_of_employees = 100,
                    offices = new[]
                    {
                        new office
                        {
                            state_code = "NY",
                            country_code = "USA"
                        }
                    }
                },
                new company{
                    number_of_employees = 100,
                    offices = new[]
                    {
                        new office
                        {
                            state_code = "NY",
                            country_code = "USA"
                        },
                        new office
                        {
                            state_code = "LA",
                            country_code = "USA"
                        }
                    }
                }
            }).Save();

            var result = DB.Queryable<company>()
                           .Where(c => c.offices.Any(o => o.country_code == "USA"))
                           .SelectMany(c => c.offices, (c, o) => new { c.number_of_employees, o.state_code })
                           .GroupBy(c => c.state_code)
                           .Select(g => new
                           {
                               Total_Employees = g.Sum(c => c.number_of_employees),
                               State = g.Key
                           })
                           .ToList();
        }


    }
}

Upvotes: 1

Shivam Mishra
Shivam Mishra

Reputation: 1856

If i got right you need total number of employees state wise. then you just need to change total_employees: { $sum: 1 } in your group pipeline.

db.research.aggregate([
    {
        $unwind: "$offices"
    },
    {
        $match: {
            'offices.country_code': "USA"
        }
    },
    {
        $group: {
            _id: "$offices.state_code",
            total_employees: { $sum: 1 },
        }
    },
    { 
        $project: { 
            _id: 0, 
            State: "$_id", 
            total_employees: 1
        } 
    }
])

Upvotes: 0

Related Questions