Reputation: 956
Say I have some data that looks something like this:
[
  { "foo": 1, "bar": "a" },
  { "foo": 2, "bar": "b" },
  { "foo": 3, "bar": "b" },
  { "foo": 4, "bar": "a" }
]
And I want to group the rows by bar, storing the matching rows under the value of bar in an object/map-like data structure, something like this:
{
  "a": [{ "foo": 1, "bar": "a" }, { "foo": 4, "bar": "a" }],
  "b": [{ "foo": 2, "bar": "b" }, { "foo": 3, "bar": "b" }]
}
What kind of functionality does Mongo have to handle this kind of transformation (or something remotely similar)?
I'm writing a web application in node.js, and I've managed to do this transformation in JavaScript (I know...). However, whenever a user query is too open-ended, with very few constraints, it results in memory over-allocation, as I'm dealing with roughly 250,000+ rows. My grouping method only works when the query returns fewer than 10,000-ish rows.
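For context, the in-memory grouping described above can be done with a single reduce pass; this is just a sketch of that approach (the actual code isn't shown in the question), using the sample data:

```javascript
// Sample rows matching the shape in the question.
const rows = [
  { foo: 1, bar: "a" },
  { foo: 2, bar: "b" },
  { foo: 3, bar: "b" },
  { foo: 4, bar: "a" }
];

// Build a map keyed by "bar", pushing each row into its group.
// This is O(n), but it still holds every row in memory at once,
// which is exactly what blows up on 250,000+ rows.
function groupByBar(docs) {
  return docs.reduce((acc, doc) => {
    (acc[doc.bar] = acc[doc.bar] || []).push(doc);
    return acc;
  }, {});
}

const grouped = groupByBar(rows);
// grouped.a → [{ foo: 1, bar: "a" }, { foo: 4, bar: "a" }]
// grouped.b → [{ foo: 2, bar: "b" }, { foo: 3, bar: "b" }]
```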
After this, I realised I should probably get Mongo to handle this for me, especially since the grouping causes the server to hang until it's done. I'm just unsure how to perform such a transformation.
Before this, I tried transferring the data to the client over web sockets and splitting the job across web workers, but it made the interface chug very noticeably and likely wouldn't stand up under pressure. At least that approach didn't block or crash my server, though.
Say I have a list of all the possible values of bar (which I have stored in a collection); should I just make multiple queries, and build the data structure from the results?
Upvotes: 0
Views: 98
Reputation: 151172
Fairly simple for the aggregation framework, and it's the fastest option since the aggregation framework is a native-code implementation, whereas something like mapReduce requires invoking a JavaScript interpreter in-process:
db.collection.aggregate([
    { "$group": {
        "_id": "$bar",
        "values": { "$push": { "foo": "$foo", "bar": "$bar" } }
    }}
])
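The aggregation returns one document per group, of the form { _id, values }, rather than the single map object the question asks for. A small client-side pass (a sketch, not part of the answer itself) turns that result into the requested shape without ever holding ungrouped rows in memory:

```javascript
// Hypothetical aggregation output, one document per distinct "bar" value.
const aggResult = [
  { _id: "a", values: [{ foo: 1, bar: "a" }, { foo: 4, bar: "a" }] },
  { _id: "b", values: [{ foo: 2, bar: "b" }, { foo: 3, bar: "b" }] }
];

// Re-key the groups by _id to get { a: [...], b: [...] }.
// The heavy lifting (the grouping itself) already happened in Mongo.
const byBar = {};
for (const group of aggResult) {
  byBar[group._id] = group.values;
}
```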
This does require some knowledge of your document structure. If you need something more flexible for varying documents, then you can use mapReduce:
db.collection.mapReduce(
    function() {
        emit( this.bar, this );
    },
    function(key, values) {
        return { "values": values };
    },
    { "out": { "inline": 1 } }
)
Not as pretty as your desired result, but it gets the job done.
The only real question here is whether the arrays you are generating are actually going to blow past the 16MB BSON document limit. If so, you might need to segment this somewhat, but it would still be better than fetching all the rows and building the structure on the client.
Upvotes: 2