Mick Walker
Mick Walker

Reputation: 3847

MongoDB Most Efficient Way to Perform this Query

I have a requirement, that when given a list of objects, I need to exclude ones which already exist within my database.

I have gone down the traditional route, of iterating over the objects, and one at a time checking if the object exists in my mongo collection.

foreach (PickerPlace pickerPlace in param)
{
    string id = pickerPlace.id;
    IMongoQuery query = Query<Place>.Where(p => p.Id == id);
    int count = this.context.Place.AsQueryable().Count(q => query.Inject());
    if (count == 0)
    {
        filteredResults.Add(pickerPlace);
    }
}

return filteredResults;

Is this the most efficient way of doing what I am trying to achieve, somehow it feels to me like I should be performing some kind of batch operation.

Many Thanks

Update:

I have found the following code to be far far more efficient, but I would still like suggestions on how to improve further.

List<string> ids = param.Select(p => p.id).ToList();
var results = this.context.Place.Find(Query.In("Id", new BsonArray(ids))).ToList();

Upvotes: 1

Views: 358

Answers (1)

i3arnon
i3arnon

Reputation: 116666

The most efficient option is this:

var newIds = new HashSet<string>(param.Select(p => p.Id));
newIds.ExceptWith(
    places.Find(Query<Place>.In(p => p.Id, newIds))
        .SetFields(Fields<Place>.Include(p => p.Id))
        .Select(p => p.Id));

HashSet enables efficient comparisons using the item's GetHashCode (and Equals). The query returns all the existing items in a single query. The SetFields returns only the ids so the built in _id index (which is probably on RAM) is used and there's no need to even use the actual data files.

Upvotes: 1

Related Questions