Alexander Diener
Alexander Diener

Reputation: 21

MongoDB: select all fields + all matched elements of subcollection

Here is one document from my mongoose collection:

{ _id: 55ae7be99e772a7c025a0a7b,
   id: 'foo',
   isBoolean: true,
   kind: 'bar',
   values: 
     [ 
       { y: 0,
         x: Wed Aug 26 2015 11:12:56 GMT+0200 (Mitteleuropäische Sommerzeit),
         _id: 55ae7ae05596dd740eb8a204 },
       { y: 0,
         x: Wed Aug 26 2015 11:12:57 GMT+0200 (Mitteleuropäische Sommerzeit),
         _id: 55ae7ae05596dd740eb8a203 }, 
       { y: 1,
         x: Wed Aug 26 2015 11:12:56 GMT+0200 (Mitteleuropäische Sommerzeit);
       _id: 55ae7be91fa1511c1795c5ae } 
     ]
  }

So, I need to find all documents, that have specific value.x. After that I need to return that document with all fields and found value elements.


Wenn I try it with

.find({'values.x': mTime1})
      .select({
          '_id'        : 1 ,
          'id'         : 1 ,
          'kind'       : 1 ,
          'isBoolean'  : 1 ,
          'values'     : {$elemMatch: {x: time1}}
          })

I receive just the firsTt found value:

 { ...
     values: 
      [ { exceeds: null,
         y: 0,
         x: Wed Aug 26 2015 11:12:56 GMT+0200 (Mitteleuropäische Sommerzeit),
         _id: 55ae7d86870b92b8056bed4c } ]
    }

Next version

.aggregate({"$unwind" : "$values"}, {"$match" : {"values.x": time1}},
             {"$group" : {
                    '_id'        : '$_id',
                    'values'     : {$addToSet: "$values"}
             });

returns all matched values except other fields...


My goal is:

{ _id: 55ae7be99e772a7c025a0a7b,
   id: 'foo',
   isBoolean: true,
   kind: 'bar',
   values: 
     [ 
       { y: 0,
         x: Wed Aug 26 2015 11:12:56 GMT+0200 (Mitteleuropäische Sommerzeit),
         _id: 55ae7ae05596dd740eb8a204 },
       { y: 1,
         x: Wed Aug 26 2015 11:12:56 GMT+0200 (Mitteleuropäische Sommerzeit);
       _id: 55ae7be91fa1511c1795c5ae } 
     ]
  }

Have you any idea, how to achieve that with mongoose?? :)


Update: Thanks to tsturzl, I solved it with next function (without changing the model):

 self.aggregate(
                  {'$unwind' : '$values'}, 
                  {'$match' : { 'values.x': mTime1} },
                  {'$group' : {
                    '_id'        : '$_id',
                    'values'     : {$push: '$values'}
                  }}
              )
      .exec(
          function(err, results) {
            if(err) return done(err);

            var values = {}; // hashMap to group values

            results.forEach(function(item) {
              if(item.values) // prevent empty results
                values[item._id] = item.values;
            });

            self.find({_id:{$in: _.keys(values)}})
                .exec(function(err, items) {
                    if(err) return done(err);

                    var results = items.map(function(item) {
                        item.values = values[item._id];
                        return item;
                    });

                    done(err, results); // callback results
                });
        });

Upvotes: 2

Views: 396

Answers (1)

tsturzl
tsturzl

Reputation: 3137

The problem with using elemMatch in a projection is that it accesses a single item. Similar to an array arr[1], elemMatch fetches the index of the item in the array and then projects that array item at that index. So you can only retrieve one sub-document using this method.

You can use an aggregation similar to this

[
    {$match: {'values.x': mTime1}}, //match before to reduce size of unwound 
    {$unwind: '$values'},
    {$match: {'values.x': mTime1}},
    {$group: {
        _id: '$_id',
        id: {$first: '$id'},
        kind: {$first: '$kind'},
        isBoolean: {$first: '$isBoolean'},
        values: {$push: '$values'}
    }
]

I've tested this to work fine locally on an array of subdocuments.

It's possible that your approach is best suited to being restructured. You should remodel your data so that your values have their own collection and reference by _id. In this cause I would store the reference in the values collection.

Remove values field from this collection

{ _id: 55ae7be99e772a7c025a0a7b,
   id: 'foo',
   isBoolean: true,
   kind: 'bar'
}

Values Model:

{
   y: Number,
   x: Date,
   parentId: {type: ObjectId, ref: "myColl"} //make sure you require ObjectId and rename the reference
}

You can then do something like this

ValuesModel.find({
    x: mTime1
}).exec(function(err, results) {
    var ids = {}; //hashMap to group values
    var idsArr = []; //array of ids

    results.forEach(function(item) {
        if(!ids.hasOwnProperty(items.parentId.toString())) {
            ids[items.parentId.toString()] = [];
            idArr.push(item.parentId);
        }
        ids[items._id.toString()].push(item);
    });

    myColl.find({_id:{$in: idsArr}})
        .exec(function(err, items) {
            var results = items.map(function(item) {
                item.values = ids[item._id.toString()];
                return item;
            });

            done(results); //callback results
        });
});

This will grab all values that you queried for, then group them in a hashMap and push all the parentIds to an array. Then I query for that array of parentIds. I take the hashMap, reference it by the id in the hashMap and create a new field for .values in the parent document. This will prevent you from having to use aggregations, which aren't as scalable, and will allow you to easily query the values table. If you want to find only one value you can simply use the mongoose populate method. The downfall to this approach is that you need to do more work in your code, and you have 2 round trips. However, this should still be more efficient than an aggregation.

This can be used to create a reusable method to simplify your code if you query into values a lot

function queryValues(query, done) {
    ValuesModel.find(query).exec(function(err, results) {
        if(err) return done(err);

        var ids = {}; //hashMap to group values
        var idsArr = []; //array of ids
        results.forEach(function(item) {
            if(!ids.hasOwnProperty(items.parentId.toString())) {
                ids[items.parentId.toString()] = [];
                idArr.push(item.parentId);
            }
            ids[items._id.toString()].push(item);
        });

        myColl.find({_id:{$in: idsArr}})
            .exec(function(err, items) {
                if(err) return done(err);

                var results = items.map(function(item) {
                    item.values = ids[item._id.toString()];
                    return item;
                });

                done(null, results); //callback results
            });
    });
}

Then you can just call queryValues({x: mTime1}, function(err, results){...});, and you can pass any query you want and the function will handle populating the parent document without fetching duplicate data for maximum efficiency.

One thing I might also recommend is that you define this method as a schema static method in your model definition so you can tuck this code away and never have to worry. See: http://mongoosejs.com/docs/api.html#schema_Schema-static

Upvotes: 1

Related Questions