Reputation: 91
I am working to develop a pipeline in MongoDB that adds up all of the number_of_employees in each state from companies, and it prints out the information for each state.
The basics of the collection structure is demonstrated below:
"name" : "AdventNet",
"number_of_employees" : 600,
"offices" : [
{
"description" : "Headquarters",
"address1" : "4900 Hopyard Rd.",
"address2" : "Suite 310",
"zip_code" : "94588",
"city" : "Pleasanton",
"state_code" : "CA",
"country_code" : "USA",
"latitude" : 37.692934,
"longitude" : -121.904945
},
{
"description" : "",
"address1" : "270 Lafayette Street",
"address2" : "Suite 505",
"zip_code" : "10012",
"city" : "New York",
"state_code" : "NY",
"country_code" : "USA",
"latitude" : 40.7237306,
"longitude" : -73.9964312
}
],
I have tried a couple of different methods to get my most recent pipeline code, but I am still struggling with getting the summation to work correctly. I tried different grouping structures, but I kept getting an error.
db.research.aggregate( [
{$match : {"offices.country_code" : "USA"} },
{$project: {State : "$offices.state_code"}},
{$unwind: "$State"},
{ $group : {"_id" : "$State", "total_employees": {$sum : "$number_of_
employees"} }},
{$project: {_id : 0, State: "$_id", total_employees: 1}}
])
The expected results:
{ "total_employees" : 1500, "State" : "SD" }
{ "total_employees" : 350, "State" : "WV" }
...
The actual results:
{ "total_employees" : 0, "State" : "SD" }
{ "total_employees" : 0, "State" : "WV" }
...
Upvotes: 3
Views: 189
Reputation: 5679
i believe this is the pipeline you're looking for:
db.research.aggregate([
{
"$match": {
"offices.country_code": "USA"
}
},
{
"$unwind": "$offices"
},
{
"$project": {
"number_of_employees": "$number_of_employees",
"state_code": "$offices.state_code",
"_id": 0
}
},
{
"$group": {
"_id": "$state_code",
"total_employees": {
"$sum": "$number_of_employees"
}
}
},
{
"$project": {
"Total_Employees": "$total_employees",
"State": "$_id",
"_id": 0
}
}
])
you can see it working here: https://mongoplayground.net/p/XUA_e9Y4Fq_
also in case anybody's interested, here's the c# code that generated the above query:
using MongoDB.Entities;
using System.Linq;
namespace StackOverflow
{
public class Program
{
public class company : Entity
{
public int number_of_employees { get; set; }
public office[] offices { get; set; }
}
public class office
{
public string state_code { get; set; }
public string country_code { get; set; }
}
private static void Main(string[] args)
{
new DB("test");
(new[]
{
new company{
number_of_employees = 100,
offices = new[]
{
new office
{
state_code = "NY",
country_code = "USA"
}
}
},
new company{
number_of_employees = 100,
offices = new[]
{
new office
{
state_code = "NY",
country_code = "USA"
},
new office
{
state_code = "LA",
country_code = "USA"
}
}
}
}).Save();
var result = DB.Queryable<company>()
.Where(c => c.offices.Any(o => o.country_code == "USA"))
.SelectMany(c => c.offices, (c, o) => new { c.number_of_employees, o.state_code })
.GroupBy(c => c.state_code)
.Select(g => new
{
Total_Employees = g.Sum(c => c.number_of_employees),
State = g.Key
})
.ToList();
}
}
}
Upvotes: 1
Reputation: 1856
If i got right you need total number of employees state wise.
then you just need to change total_employees: { $sum: 1 }
in your group pipeline.
db.research.aggregate([
{
$unwind: "$offices"
},
{
$match: {
'offices.country_code': "USA"
}
},
{
$group: {
_id: "$offices.state_code",
total_employees: { $sum: 1 },
}
},
{
$project: {
_id: 0,
State: "$_id",
total_employees: 1
}
}
])
Upvotes: 0