Reputation: 563
Some fields in my mongoDB documents look like this:
{
...
Countries: [["Spain", "France"]]
...
}
Or this:
{
...
Countries: [["Spain"],["Russia", "Egypt"]]
...
}
What I want to do is to turn [["Spain", "France"]]
into ["Spain", "France"]
and [["Spain"],["Russia", "Egypt"]]
into ["Spain", "Russia", "Egypt"]
, similar to using the flatten
method in Ruby.
Is there a way to flatten arrays in mongoDB? I need to flatten arrays in all documents in entire collection, not just a single doc, if that matters, also, the values and their quantity in arrays varies between documents.
I am also using Ruby as a driver for mongo, so a method using a Ruby driver would be also useful to me.
Upvotes: 14
Views: 13123
Reputation: 526
since the version 5 of }, we have an alternative using the $function operator like this:
db.collection.aggregate([
{
$addFields: {
Countries: {
$function: {
body: function(items) {
return [].concat(...items)
},
args: [ "$Countries" ],
lang: "js"
}
}
}
}
])
the ideal solution would be to be able to use $concatArrays but unfortunately this operator is not able to take into account an array field (at the time of this writing). There's a ticket in the mongodb backlog about this limitation on array fields : https://jira.mongodb.org/browse/SERVER-31991 . If you are interested by the flexibility brought by this enhancement please vote. This ticket has been here since 2019 so it seems to need help to be prioritized ;-)
Upvotes: 0
Reputation: 5741
In Mongo 3.4+ you can use $reduce
to flatten 2d arrays.
db.collection.aggregate(
[
{
$project: {
"countries": {
$reduce: {
input: '$Countries',
initialValue: [],
in: {$concatArrays: ['$$value', '$$this']}
}
}
}
}
]
)
Docs: https://docs.mongodb.com/manual/reference/operator/aggregation/reduce/
Upvotes: 24
Reputation: 19700
You need perform an aggregation operation with two unwind stages and a single group stage. The basic rule being you unwind as many times as the level of nest depth. Here the level of nesting is 2, so we unwind two times.
collection.aggregate([
{$unwind => "$Countries"},
{$unwind => "$Countries"},
{$group => {"_id":"$_id","Countries":{$push => "$Countries"}}}
])
The first $unwind
stage produces the result:
{
"_id" : ObjectId("54a32e0fc2eaf05fc77a5ea4"),
"Countries" : [
"Spain",
"France"
]
}
{
"_id" : ObjectId("54a32e4ec2eaf05fc77a5ea5"),
"Countries" : [
"Spain"
]
}
{
"_id" : ObjectId("54a32e4ec2eaf05fc77a5ea5"),
"Countries" : [
"Russia",
"Egypt"
]
}
The second $unwind
stage further flattens the Countries
array:
{ "_id" : ObjectId("54a32e0fc2eaf05fc77a5ea4"), "Countries" : "Spain" }
{ "_id" : ObjectId("54a32e0fc2eaf05fc77a5ea4"), "Countries" : "France" }
{ "_id" : ObjectId("54a32e4ec2eaf05fc77a5ea5"), "Countries" : "Spain" }
{ "_id" : ObjectId("54a32e4ec2eaf05fc77a5ea5"), "Countries" : "Russia" }
{ "_id" : ObjectId("54a32e4ec2eaf05fc77a5ea5"), "Countries" : "Egypt" }
Now the final $group
stage groups the records based on the _id
,and accumulates the country names in a single array.
{
"_id" : ObjectId("54a32e4ec2eaf05fc77a5ea5"),
"Countries" : [
"Spain",
"Russia",
"Egypt"
]
}
{
"_id" : ObjectId("54a32e0fc2eaf05fc77a5ea4"),
"Countries" : [
"Spain",
"France"
]
}
If you wish to keep other fields in the document then you need to explicitly specify the names of the fields other than the country field,(field1,field2,etc..), using the $first
operator. You can write/overwrite a collection by specifying the name of the collection in the $out
stage.
collection.aggregate([
{$unwind => "$Countries"},
{$unwind => "$Countries"},
{$group => {"_id":"$_id","Countries":{$push => "$Countries"},
"field1":{$first => "$field1"}}},
{$out => "collection"}
])
You need to explicitly specify the fields so that you don't get a redundant Countries
field.
You can use the $$ROOT
system variable to store the entire document, but that would make the Countries
field redundant.One outside the doc
and one inside the doc
.
collection.aggregate([
{$unwind => "$Countries"},
{$unwind => "$Countries"},
{$group => {"_id":"$_id","Countries":{$push => "$Countries"},
"doc":{$first => "$$ROOT"}}},
{$out => "collection"}
])
Upvotes: 9
Reputation: 14469
Your data for Countries are not in a good format, so you may consider to convert them. This is a script to flatten the array in Countries field and save it the origin documents that you can run in a mongo shell:
function flattenArray(inArr) {
var ret = [];
inArr.forEach(function(arr) {
if (arr.constructor.toString().indexOf("Array") > -1) {
ret = ret.concat(flattenArray(arr));
} else {
ret.push(arr);
}
});
return ret;
}
db.collection.find({
'Countries': {
'$exists': true
}
}).forEach(function(doc){
doc.Countries = flattenArray(doc.Countries);
db.collection.save(doc);
});
Upvotes: 4
Reputation: 6371
Try this:
db.test2.aggregate([
{"$unwind" : "$Countries"},
{"$unwind" : "$Countries"},
{$group : { _id : '$_id', Countries: { $addToSet: "$Countries" }}},
]).result
Upvotes: 4