Reputation: 1992
I have the unfortunate problem of a data structure changing on me around six months ago. Therefore, I have a document that used to look like...
{
fruits: [
{
id: 123
},
{
id: 456
}
]
}
(Important to note that the id
is not the ObjectId
BSON type, it's just a random series of characters generated by the client side).
...but now has the id
key changed to.
{
fruits: [
{
fruit_id: 'xxx'
},
{
fruit_id: 'yyy'
}
]
}
So, I'm trying to do a $project
to have both id
and fruit_id
changed to something generic like general_id
, so that I can continue with another aggregation like $group
and just reference the one field
I've tried something along the lines of:
[
$unwind: {
path: '$fruits'
},
$project: {
general_id: {
$cond: {
if: {
'fruits.fruit_id': {
$type: ['string']
}
},
then: '$fruits.fruit_id',
else: '$fruits.id'
}
}
}
]
Upvotes: 1
Views: 441
Reputation: 151220
It really depends on what you are after here, but for a general case of knowing two possibilities it's probably better to use $ifNull
to return the value for the field if present, otherwise return the other field's value.
Adding a little more data for demonstration since you probably don't want to loose anything else in the array elements:
{
_id: 1,
fruits: [
{
id: 123,
data: 1
},
{
id: 456,
data: 2
}
]
},
{
_id: 2,
fruits: [
{
fruit_id: 'xxx',
data: 1
},
{
fruit_id: 'yyy',
data: 2
}
]
},
{
_id: 3,
fruits: [
{
fruit_id: 'xxx',
data: 1,
},
{
fruit_id: 'yyy',
data: 2
},
{
id: 123,
data: 3
},
{
id: 456,
data: 4
}
]
}
Then you can either do the process using $unwind
as the first step, which does make path naming easier and especially with $addFields
instead of $project
:
Model.aggregate([
{ "$unwind": "$fruits" },
{ "$addFields": {
"fruits": {
"id": "$$REMOVE",
"fruit_id": "$$REMOVE",
"general_id": { "$ifNull": [ "$fruits.id", "$fruits.fruit_id" ] }
}
}}
])
That uses $$REMOVE
from MongoDB 3.6 and above ( which should be the minimal version you are using ) in order to "remove" the fields you don't want. You don't need to do that and can just declare everything you actually want with $project
if you don't have support.
Then of course has the alternate with an $ifNull
expression.
This gives results on that data like:
{ "_id" : 1, "fruits" : { "data" : 1, "general_id" : 123 } }
{ "_id" : 1, "fruits" : { "data" : 2, "general_id" : 456 } }
{ "_id" : 2, "fruits" : { "data" : 1, "general_id" : "xxx" } }
{ "_id" : 2, "fruits" : { "data" : 2, "general_id" : "yyy" } }
{ "_id" : 3, "fruits" : { "data" : 1, "general_id" : "xxx" } }
{ "_id" : 3, "fruits" : { "data" : 2, "general_id" : "yyy" } }
{ "_id" : 3, "fruits" : { "data" : 3, "general_id" : 123 } }
{ "_id" : 3, "fruits" : { "data" : 4, "general_id" : 456 } }
If you wanted to $group
on that value, then there's no need for an intermediate "project" of any kind. Just do the $ifNull
directly in that stage:
Model.aggregate([
{ "$unwind": "$fruits" },
{ "$group": {
"_id": { "$ifNull": [ "$fruits.id", "$fruits.fruit_id" ] },
"count": { "$sum": 1 }
}}
])
And output:
{ "_id" : "yyy", "count" : 2 }
{ "_id" : "xxx", "count" : 2 }
{ "_id" : 456, "count" : 2 }
{ "_id" : 123, "count" : 2 }
Or if you actually did not need to $unwind
the array for other purposes, you can use $map
and some other manipulation with $objectToArray
and $arrayToObject
:
Model.aggregate([
{ "$addFields": {
"fruits": {
"$map": {
"input": "$fruits",
"in": {
"$mergeObjects": [
{ "$arrayToObject": {
"$filter": {
"input": { "$objectToArray": "$$this" },
"cond": { "$not": { "$in": [ "$$this.k", ["fruit_id","id"] ] } }
}
}},
{
"general_id": { "$ifNull": ["$$this.id","$$this.fruit_id"] }
}
]
}
}
}
}}
])
Which returns results like:
{
"_id" : 1,
"fruits" : [
{
"data" : 1,
"general_id" : 123
},
{
"data" : 2,
"general_id" : 456
}
]
}
{
"_id" : 2,
"fruits" : [
{
"data" : 1,
"general_id" : "xxx"
},
{
"data" : 2,
"general_id" : "yyy"
}
]
}
{
"_id" : 3,
"fruits" : [
{
"data" : 1,
"general_id" : "xxx"
},
{
"data" : 2,
"general_id" : "yyy"
},
{
"data" : 3,
"general_id" : 123
},
{
"data" : 4,
"general_id" : 456
}
]
}
Adding an $unwind
after that returns just the same as before. But the more complex operations are probably better suited to where you want to keep this as an array.
This time we removed the id
and fruit_id
by converting each array element into an array of "key/value" pairs via $objectToArray
. We then $filter
the array based on those "k"
values, which are the names of the fields. The $arrayToObject
makes this an object again, with all other content except those fields.
The $mergeObjects
is to $map
what $addFields
is to the root "document", in that it takes multiple objects and "merges" them together. Thus the "filtered" object as described before, and the new object with only the general_id
key and it's value translated from whichever field was present.
As a final note, $ifNull
works better than $cond
where you have just two values, but neither is actually that great if there is a larger possible list. You can nest $cond
expressions or even use $switch
, but really it's probably best to filter content out via the $objectToArray
as seen before:
var valid_names = [ "id", "fruit_id", "apple_id", "orange_id" ];
Model.aggregate([
{ "$unwind": "$fruits" },
{ "$group": {
"_id": {
"$arrayElemAt": [
{ "$map": {
"input": {
"$filter": {
"input": { "$objectToArray": "$fruits" },
"cond": { "$in": [ "$$this.k", valid_names ] }
}
},
"in": "$$this.v"
}},
0
]
},
"count": { "$sum": 1 }
}}
])
That usually makes the most sense, otherwise for working with such a list in dynamic way you end up building aggregation pipeline stages in code, such as with using $switch
would be:
var valid_names = [ "id", "fruit_id", "apple_id", "orange_id" ];
var branches = valid_names.map(name =>
({
"case": { "$gt": [`$fruits.${name}`, null ] },
"then": `$fruits.${name}`
})
)
Model.aggregate([
{ "$unwind": "$fruits" },
{ "$group": {
"_id": { "$switch": { branches, "default": null } },
"count": { "$sum": 1 }
}}
])
Which looks cleaner in your code, but actually sends a much larger pipeline in BSON:
[
{ "$unwind" : "$fruits" },
{ "$group" : {
"_id" : {
"$switch" : {
"branches" : [
{
"case" : { "$gt" : [ "$fruits.id", null ] },
"then" : "$fruits.id"
},
{
"case" : { "$gt" : [ "$fruits.fruit_id", null ] },
"then" : "$fruits.fruit_id"
},
{
"case" : { "$gt" : [ "$fruits.apple_id", null ] },
"then" : "$fruits.apple_id"
},
{
"case" : { "$gt" : [ "$fruits.orange_id", null ] },
"then" : "$fruits.orange_id"
}
],
"default" : null
}
},
"count" : { "$sum" : 1 }
}}
]
Upvotes: 1