Reputation: 17721
I have a mongodb collection structure like this one:
var personSchema = new mongoose.Schema({
_id: ObjectId,
name: String,
// ...
alias: String
});
(I use mongoose, but this is secondary).
Since I fetch people from different sources, several documents can refer to the same person. In that case I want to keep all of them in the database, and I assign the same alias (unique to that person) to each of them.
Currently, when I need to list each person only once, I retrieve all people and then filter out duplicate aliases in JavaScript, keeping only one document per alias (I don't care which one). Of course, I also need to keep persons with no alias. Something like this:
Person.find({}, null, function(err, persons) {
var result = [];
var aliases = {}; // plain object used as a set of aliases seen so far
for (var i = 0; i < persons.length; i++) {
if (persons[i].alias && aliases.hasOwnProperty(persons[i].alias))
continue; // skip this person because its alias was seen already
result.push(persons[i]); // add this person to result
if (persons[i].alias) // add this person alias to seen aliases
aliases[persons[i].alias] = true;
}
});
Since this gets quite slow as the number of people grows, I'd like to filter out duplicate aliases (keeping just one document per alias) in the mongo query itself, but I can't come up with a filter that fits...
Any clue?
UPDATE: As requested in a comment, here is some sample Person data:
{ "_id" : "1", "name" : "Alice" },
{ "_id" : "2", "name" : "Bob", "alias" : "afa776bea788cf4c" },
{ "_id" : "3", "name" : "Bobby", "alias" : "afa776bea788cf4c" },
{ "_id" : "4", "name" : "Zoe", "alias" : "2211293acc82329a" },
From the query I'm looking for, I'd need to get:
{ "_id" : "1", "name" : "Alice" },
{ "_id" : "2", "name" : "Bob", "alias" : "afa776bea788cf4c" },
{ "_id" : "4", "name" : "Zoe", "alias" : "2211293acc82329a" },
(getting "Bobby" instead of "Bob" would be fine too).
This data structure is not mandatory, of course; I'd accept a suggestion to change it...
Upvotes: 0
Views: 1221
Reputation: 4055
You can do it using the MongoDB aggregation framework.
As far as I understand, some documents have no alias field. If that is not the case, you don't need the first $project stage.
Person.aggregate([
{ $project: {
alias: {$ifNull: ['$alias', "$_id"] },
name: 1
}
},
{ $group: { _id: "$alias", name: {$first: "$name"}}},
{
$project: {_id:0, name: 1}
}
], callback);
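The pipeline above returns only the name. If the _id and alias fields from the question's expected output are needed too, a variant along the following lines might work. This is a minimal sketch, not a tested query: the function name `buildDedupePipeline` and the intermediate field names `groupKey` and `id` are made up for illustration, and people without an alias may come back with `alias: null` rather than with the field missing.

```javascript
// Hypothetical helper: returns an aggregation pipeline that keeps one
// document per alias and every document that has no alias at all.
function buildDedupePipeline() {
  return [
    // Group on the alias when present; otherwise fall back to the
    // document's own _id, which is unique, so it forms its own group.
    { $project: {
        name: 1,
        alias: 1,
        groupKey: { $ifNull: ['$alias', '$_id'] }
      }
    },
    // Keep the first document seen in each group.
    { $group: {
        _id: '$groupKey',
        id: { $first: '$_id' },
        name: { $first: '$name' },
        alias: { $first: '$alias' }
      }
    },
    // Restore the original _id in place of the grouping key.
    { $project: { _id: '$id', name: 1, alias: 1 } }
  ];
}
```

With the model from the question this would be passed as Person.aggregate(buildDedupePipeline(), callback);. Without a $sort stage, which document of an alias group $first picks depends on the order in which documents reach $group, which matches the "I don't care which one" requirement.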
Upvotes: 0
Reputation: 10100
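Using aggregation, you can use the following $group query to get the desired list. Note that documents without an alias field are all grouped under a single null key, so only one of them is returned: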
db.collection.aggregate([
{ $group: { "_id": "$alias", "name": { $first: "$name" }, "id": { $first: "$_id" } } },
{ $project: { "id": 1, "_id": 0, "alias": "$_id", "name": 1 } }
]);
Upvotes: 1
Reputation: 1373
Try the Model.distinct operation.
http://mongoosejs.com/docs/api.html#query_Query-distinct
Person.distinct('alias', callback);
This returns the list of distinct values of the alias field (the values only, not the documents that contain them).
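To make the shape of a distinct result concrete, here is an in-memory sketch over the sample data from the question. It only mimics the behaviour in plain JavaScript and does not talk to MongoDB; the helper name `distinctAliases` is hypothetical, and skipping documents that lack the field is an assumption about how distinct treats them.

```javascript
// Sample data copied from the question.
const people = [
  { _id: '1', name: 'Alice' },
  { _id: '2', name: 'Bob', alias: 'afa776bea788cf4c' },
  { _id: '3', name: 'Bobby', alias: 'afa776bea788cf4c' },
  { _id: '4', name: 'Zoe', alias: '2211293acc82329a' }
];

// Hypothetical stand-in for a distinct-on-alias call: collects each
// alias value once, in first-seen order, skipping documents without one.
function distinctAliases(docs) {
  const seen = new Set();
  for (const doc of docs) {
    if (doc.alias !== undefined) {
      seen.add(doc.alias);
    }
  }
  return Array.from(seen);
}

const aliases = distinctAliases(people);
```

The result holds alias strings only, so the names would need a second lookup, and a person without an alias (Alice here) is not represented at all.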
Upvotes: 0