Reputation: 43646
I have a collection with elements like this:
{
"_id": {
"$oid": "56d61033a378eccde8a8354f"
},
"id": "10021-2015-ENFO",
"certificate_number": 9278806,
"business_name": "ATLIXCO DELI GROCERY INC.",
"date": "Feb 20 2015",
"result": "No Violation Issued",
"sector": "Cigarette Retail Dealer - 127",
"address": {
"city": "RIDGEWOOD",
"zip": 11385,
"street": "MENAHAN ST",
"number": 1712
}
}
What I want is to add _id
for each address
. This is one time operation and I am doing it for researching/testing purpose.
I think that I can split the task into the following steps:
_id
for each record (as I have read will add it if not specified)address
fields in order to insert the corresponding _id
for each embedded document in the source collectionI select the documents like like this:
db.ci.find({}, {"address":1, "_id":0});
but I fail to make a distinct from it using the Distinct
function or foreach
. I try to use aggregate
, too but did not make it work.
Could anyone give me some tips?
I am using Ubuntu 20.04, mongodb 4.2.7, vs code with mongodb extension.
Upvotes: 0
Views: 115
Reputation: 28356
You can get a unique integer assigned to each in a single aggregation by using $facet
to keep the original documents while using $addToSet
to get the list of unique addresses, and $indexOfArray
to assign the value to each document:
db.collection.aggregate([
{$facet: {
docs: [{$match: {}}],
addresses: [
{$group: {
_id: null,
address: {$addToSet: "$address"}
}}
]
}},
{$unwind: "$docs"},
{$unwind: "$addresses"},
{$addFields: {
"address.id": {
$indexOfArray: [
"$addresses.address",
"$docs.address"
]
}
}},
{$replaceRoot:{newRoot:"$docs"}},
{$out:"new_collection"}
])
If you prefer an ObjectId, you have the right idea, one aggration to output the unique addresses to a temporary collection so each is auto-assigned an _id, then a second aggregation to embed those _id values in the original documents. For this example I gathered the _id of the original document to simplify the later lookup.
db.collection.aggregate([
{$group:{
_id:"$address",
ids:{$push:"$_id"}
}},
{$project:{
address:"$_id",
ids:1,
_id:0
}},
{$out: "temp_address_collection"}
])
db.collection.aggregate([
{$lookup:{
from:"temp_address_collection",
localField:"_id",
foreignField:"ids",
as: "matched"
}},
{$addFields:{matched:{$arrayElemAt:["$matched",0]}}},
{$addFields:{"$address.id": "$matched._id"}},
{$project:{matched:0}},
{$out:"new_collection"}
])
Upvotes: 1