Only return some of the fields from list of embedded documents

Question

Here is my document:

{
"_id" : "2",
"account" : "1234",
"positions" : { 
  "APPL" : { "quantity" : "13", "direction" : "long" }, 
  "GOOG" : { "quantity" : "24", "direction" : "long" }
 }
}

I would like to get the whole positions object, but with only the quantity field of each entry and without the direction field. Is it possible to do that? Or should I consider this other schema (array of objects):

{
"_id" : "2",
"account" : "1234",
"positions" : [ 
  { "name" : "APPL", "quantity" : "13", "direction" : "long" }, 
  { "name" : "GOOG", "quantity" : "24", "direction" : "long" }
 ]
}

Many thanks!

Neil Lunn · Accepted Answer

For the "array" form, all you really need to do is specify the field using "dot notation" to the array member:

db.collection.find({}, { "positions.quantity": 1 })

Which would return:

{
  "_id" : "2",
  "positions" : [
    { "quantity" : "13" },
    { "quantity" : "24" }
  ]
}

Or, for multiple fields while still excluding "direction", just include both in the projection:

db.collection.find({},{ "positions.name": 1, "positions.quantity": 1 })

Which still returns just the named fields:

{
  "_id" : "2",
  "positions" : [
    {
      "name" : "APPL",
      "quantity" : "13"
    },
    {
      "name" : "GOOG",
      "quantity" : "24"
    }
  ]
}

For the "named keys" form you need to specify each path:

db.collection.find({},{ "positions.APPL.quantity": 1, "positions.GOOG.quantity": 1  })

Which would return of course:

{
  "_id" : "2",
  "positions" : {
    "APPL" : {
      "quantity" : "13"
    },
    "GOOG" : {
      "quantity" : "24"
    }
  }
}

And that kind of "nastiness" is pervasive with basically ALL MongoDB operations, query or projection or otherwise. When you use "named keys", the database has no sane option other than to require you to "name the path". Doing that is of course not really practical when the names of the keys are likely to differ between documents in the collection.
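To make "name the path" concrete, here is a minimal sketch in plain JavaScript of what a client would have to do for the "named keys" form. Note that buildQuantityProjection is a hypothetical helper, not anything MongoDB provides, and it assumes you already have a document in hand so the key names are known, which is exactly the impractical part:

```javascript
// Hypothetical helper (not a MongoDB API): build an inclusion projection
// that names one path per key found under "positions".
function buildQuantityProjection(doc, field) {
  var projection = {};
  Object.keys(doc.positions || {}).forEach(function (key) {
    // e.g. "positions.APPL.quantity"
    projection["positions." + key + "." + field] = 1;
  });
  return projection;
}

// Sample document taken from the question.
var sample = {
  "_id": "2",
  "account": "1234",
  "positions": {
    "APPL": { "quantity": "13", "direction": "long" },
    "GOOG": { "quantity": "24", "direction": "long" }
  }
};

var projection = buildQuantityProjection(sample, "quantity");
// projection is { "positions.APPL.quantity": 1, "positions.GOOG.quantity": 1 }
```

The resulting object could then be passed as the second argument to find(), but it only covers the keys of the one document you inspected. Any document with different keys would need its own projection.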

Traversing keys can only really be done in JavaScript evaluation from a MongoDB standpoint. JavaScript evaluation incurs the cost of launching the interpreter and translating data from BSON into a workable JavaScript format, not to mention the actual cost of evaluating the coded expressions themselves, so that is not an ideal approach.

Moreover, from a "query" perspective such handling requires the use of $where to evaluate an expression whenever you just want to look for things under each "key" of the "positions" data. This is a "bad" thing, since such an expression cannot possibly use an "index" to optimize the query search. Only with a "directly named path" can you actually use or even "create" an index under those conditions.

From a "projection" perspective, "named keys" have the same problem: the same kind of "traversal" needs JavaScript processing again. The only mechanism by which MongoDB can use a JavaScript expression to "alter" the output document is mapReduce, so again this is "super horrible", and you would be using this "aggregation method" for nothing more than document manipulation in this case:

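db.collection.mapReduce(
    function() {
        var id = this._id;
        var positions = this.positions;
        delete this._id;
        // Object.keys() yields the key names (strings), so each embedded
        // document has to be looked up by key before removing the field
        Object.keys(positions).forEach(function(key) {
            delete positions[key].direction;
        });
        emit(id, this);
    },
    function() {},    // reducer never gets called when all keys are unique
    { "out": { "inline": 1 } }
)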

Even after you did that to just avoid naming paths, the output of mapReduce cannot be a "cursor". So this limits you to either the size of a BSON document in response or actually outputting to a "collection" instead. So this is as far from "practical" as you can get.

There are numerous reasons why using an array with "common paths" is so much better than a "named keys" structure, far too many to go into here. The one thing you should accept is that "named keys are bad, okay!" and just move forward with consistent object naming, which actually makes quite a lot of sense.
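If you already have data in the "named keys" form, moving to the array form is a simple reshaping. The following is only a sketch in plain JavaScript under that assumption. positionsToArray is a hypothetical name, and how you would read and write the documents back is left out:

```javascript
// Hypothetical helper (not a MongoDB API): reshape "named keys" positions
// into an array of objects, moving each key into a "name" field.
function positionsToArray(doc) {
  var converted = {};
  // Copy every top-level field except "positions" unchanged.
  Object.keys(doc).forEach(function (field) {
    if (field !== "positions") converted[field] = doc[field];
  });
  var positions = doc.positions || {};
  converted.positions = Object.keys(positions).map(function (name) {
    var entry = { "name": name };
    Object.keys(positions[name]).forEach(function (field) {
      entry[field] = positions[name][field];
    });
    return entry;
  });
  return converted;
}

// Sample document taken from the question.
var namedKeysDoc = {
  "_id": "2",
  "account": "1234",
  "positions": {
    "APPL": { "quantity": "13", "direction": "long" },
    "GOOG": { "quantity": "24", "direction": "long" }
  }
};

var arrayDoc = positionsToArray(namedKeysDoc);
// arrayDoc.positions[0] is { "name": "APPL", "quantity": "13", "direction": "long" }
```

Once the documents look like this, the single "positions.quantity" projection works for every document regardless of which names it holds.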
