Reputation: 23
I have this entry structure:
{
"_id" : ObjectId("56de0178cf7970ac2a86fb23"),
"createdAt" : ISODate("2016-03-07T16:32:24.681-06:00"),
"updatedAt" : ISODate("2016-03-07T16:32:24.681-06:00"),
"yearTask" : 2016,
"startWeek" : 10,
"task" : "31231321",
"hours" : 312,
"project" : [
{
"Project" : "1000G",
"_id" : "565f193cea6493ce0acc9730"
}
],
"plannedWeeks" : [
{
"yearTask" : 2016,
"hours" : 3,
"weekNumber" : 10
},
{
"yearTask" : 2016,
"hours" : 3,
"weekNumber" : 11
},
{
"yearTask" : 2016,
"hours" : 3,
"weekNumber" : 12
},
{
"yearTask" : 2016,
"hours" : 3,
"weekNumber" : 13
},
{
"yearTask" : 2016,
"hours" : 3,
"weekNumber" : 14
}
],
}
Suppose I have other entries like this one. I need the total hours for each week (weekNumber), grouped by project (the project name is stored in the "Project" field). The number of weeks varies. The project field is an array, but it only ever contains one project.
The output would look like this:
{
_id : {
"name" : "1000G",
"yearTask" : 2016,
"weeks" : [
{
"yearTask" : 2016,
"hours" : 34, <--Total sum for this project and week
"weekNumber" : 10
}
.... etc.
]
},
_id : {
"name" : "Project2",
"yearTask" : 2016,
"weeks" : [
{
"yearTask" : 2016,
"hours" : 584,<--Total sum for this project and week
"weekNumber" : 10
}
.... etc.
]
}
}
My current query only groups the planned weeks by project:
db.tasks.aggregate(
[
{ "$unwind": "$project" },
{$group : {
_id : {
name : "$project.Project",
yearTask : "$yearTask",
weeks : "$plannedWeeks",
},
/*"matches" : { "$sum" : "$plannedWeeks.hours" },*/
}},
{ $match : { "_id.yearTask": { $eq: 2016 } } },
]
)
I tried using { "$unwind": "$plannedWeeks" }, but I don't know how to sum the hours for every week and then group the results by project.
Edited - My solution was:
[
{ "$match" : { "yearTask": 2016 } },
{ "$unwind": "$project" },
{ "$unwind": "$plannedWeeks" },
/*{ "$match" : { "yearTask": 2016 } },*/
{
"$group": {
"_id": {
"name": "$project.Project",
/*"yearTask": "$plannedWeeks.yearTask",*/
"weekYear": "$plannedWeeks.yearTask",
"weekNumber": "$plannedWeeks.weekNumber"
},
"weeks": {
"$push": {
"yearTask": "$plannedWeeks.yearTask",
"weekNumber": "$plannedWeeks.weekNumber"
}
},
"hours": { "$sum": "$plannedWeeks.hours" },
}
},
{ $sort : { "_id.weekYear" : 1,"_id.weekNumber" : 1, } },
{ "$group": {
"_id": {
"name": "$_id.name",
/*"yearTask": "$_id.yearTask",*/
},
"weeks": {
"$push": {
"yearTask": "$_id.weekYear",
"hours": "$hours",
"weekNumber": "$_id.weekNumber"
}
}
}},
]
Upvotes: 2
Views: 2651
Reputation: 50416
You want two $group stages: the first totals the hours per week, and the second uses $push to collect those totals into an array under the rolled-up grouping key.
Ideally with $arrayElemAt, available from MongoDB 3.2:
db.tasks.aggregate([
{ "$unwind": "$plannedWeeks" },
{ "$group": {
"_id": {
"name": { "$arrayElemAt": [ "$project.Project", 0 ] },
"yearTask": "$yearTask",
"weekNumber": "$plannedWeeks.weekNumber"
},
"hours": { "$sum": "$plannedWeeks.hours" }
}},
{ "$group": {
"_id": {
"name": "$_id.name",
"yearTask": "$_id.yearTask",
},
"weeks": {
"$push": {
"yearTask": "$_id.yearTask",
"hours": "$hours",
"weekNumber": "$_id.weekNumber"
}
}
}}
])
And of course, since "project" is an array with only one item, there is no problem with using $unwind on it as well in earlier versions:
db.tasks.aggregate([
{ "$unwind": "$plannedWeeks" },
{ "$unwind": "$project" },
{ "$group": {
"_id": {
"name": "$project.Project",
"yearTask": "$yearTask",
"weekNumber": "$plannedWeeks.weekNumber"
},
"hours": { "$sum": "$plannedWeeks.hours" }
}},
{ "$group": {
"_id": {
"name": "$_id.name",
"yearTask": "$_id.yearTask",
},
"weeks": {
"$push": {
"yearTask": "$_id.yearTask",
"hours": "$hours",
"weekNumber": "$_id.weekNumber"
}
}
}}
])
At any rate, it's two $group stages: the first does the sum and the second creates the array.
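To make the two stages concrete, here is a rough plain-JavaScript emulation of what they compute. It is only an illustration, not MongoDB code: the function name weeklyTotalsByProject and the two sample documents are made up, and only the field names come from the question.

```javascript
// Plain-JavaScript emulation of the two $group stages, for illustration only.
// The sample documents are invented; field names follow the question.
const tasks = [
  { yearTask: 2016, project: [{ Project: "1000G" }],
    plannedWeeks: [ { yearTask: 2016, hours: 3, weekNumber: 10 },
                    { yearTask: 2016, hours: 3, weekNumber: 11 } ] },
  { yearTask: 2016, project: [{ Project: "1000G" }],
    plannedWeeks: [ { yearTask: 2016, hours: 5, weekNumber: 10 } ] }
];

function weeklyTotalsByProject(docs) {
  // Stage 1: sum hours per (project name, year, week number).
  const totals = new Map();
  for (const doc of docs) {
    const name = doc.project[0].Project; // "project" holds exactly one element
    for (const week of doc.plannedWeeks) {
      const key = JSON.stringify([name, doc.yearTask, week.weekNumber]);
      totals.set(key, (totals.get(key) || 0) + week.hours);
    }
  }
  // Stage 2: push each weekly total into an array per (project name, year).
  const grouped = new Map();
  for (const [key, hours] of totals) {
    const [name, yearTask, weekNumber] = JSON.parse(key);
    const groupKey = JSON.stringify([name, yearTask]);
    if (!grouped.has(groupKey)) {
      grouped.set(groupKey, { _id: { name, yearTask }, weeks: [] });
    }
    grouped.get(groupKey).weeks.push({ yearTask, hours, weekNumber });
  }
  return [...grouped.values()];
}

const result = weeklyTotalsByProject(tasks);
console.log(JSON.stringify(result, null, 2));
```

With these two sample documents, week 10 of "1000G" ends up with 8 hours (3 + 5) and week 11 with 3.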
It's probably a good idea to reconsider using an array for "project" if it's only ever going to contain one element. Multiple arrays in a document can cause problems if you expect some sort of correlation between their contents; that is generally better expressed as a single array, or as a plain property, even a nested one.
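As a sketch of that suggestion, assume the documents were reshaped so that "project" is a plain nested object. This layout is hypothetical (it is not what the question currently stores), and the variable names reshapedTask and reshapedPipeline are only for illustration. The pipeline then needs a single $unwind:

```javascript
// Hypothetical reshaped document: "project" is a nested object,
// not a one-element array. Values are copied from the question's sample.
const reshapedTask = {
  yearTask: 2016,
  project: { Project: "1000G", _id: "565f193cea6493ce0acc9730" },
  plannedWeeks: [
    { yearTask: 2016, hours: 3, weekNumber: 10 },
    { yearTask: 2016, hours: 3, weekNumber: 11 }
  ]
};

// Same two $group stages as before; only "plannedWeeks" needs unwinding,
// because "$project.Project" now resolves directly through dot notation.
const reshapedPipeline = [
  { $unwind: "$plannedWeeks" },
  { $group: {
      _id: {
        name: "$project.Project",
        yearTask: "$yearTask",
        weekNumber: "$plannedWeeks.weekNumber"
      },
      hours: { $sum: "$plannedWeeks.hours" }
  }},
  { $group: {
      _id: { name: "$_id.name", yearTask: "$_id.yearTask" },
      weeks: {
        $push: {
          yearTask: "$_id.yearTask",
          hours: "$hours",
          weekNumber: "$_id.weekNumber"
        }
      }
  }}
];

// In the mongo shell: db.tasks.aggregate(reshapedPipeline)
```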
As always, put $match first in an aggregation pipeline if you actually intend to filter documents by some condition.
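For example, a sketch of the earlier pipeline with the filter moved to the front. The condition on yearTask 2016 is borrowed from the question's own query and is an assumption about what you want to filter on:

```javascript
// Sketch only: the filter value (2016) comes from the question's query.
const pipeline = [
  // Filter first, so later stages only process matching documents
  // and the initial $match can make use of an index on "yearTask".
  { $match: { yearTask: 2016 } },
  { $unwind: "$plannedWeeks" },
  { $unwind: "$project" },
  // First $group: total the hours per project, year and week.
  { $group: {
      _id: {
        name: "$project.Project",
        yearTask: "$yearTask",
        weekNumber: "$plannedWeeks.weekNumber"
      },
      hours: { $sum: "$plannedWeeks.hours" }
  }},
  // Second $group: collect the weekly totals into an array per project.
  { $group: {
      _id: { name: "$_id.name", yearTask: "$_id.yearTask" },
      weeks: {
        $push: {
          yearTask: "$_id.yearTask",
          hours: "$hours",
          weekNumber: "$_id.weekNumber"
        }
      }
  }}
];

// In the mongo shell: db.tasks.aggregate(pipeline)
```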
Upvotes: 1
Reputation: 103425
Consider running the following aggregation pipeline to get the correct result:
pipeline = [
{ "$match" : { "plannedWeeks.yearTask": 2016 } },
{ "$unwind": "$project" },
{ "$unwind": "$plannedWeeks" },
{ "$match" : { "plannedWeeks.yearTask": 2016 } },
{
"$group": {
"_id": {
"name": "$project.Project",
"yearTask": "$plannedWeeks.yearTask",
"weekNumber": "$plannedWeeks.weekNumber"
},
"weeks": {
"$push": {
"yearTask": "$plannedWeeks.yearTask",
"weekNumber": "$plannedWeeks.weekNumber"
}
},
"totalHours": { "$sum": "$plannedWeeks.hours" },
}
}
]
db.tasks.aggregate(pipeline)
Upvotes: 0