bodokaiser

Reputation: 15752

MongoDB aggregate object map to distinct values

I have three documents:

[
  { _id: 1, article: 1, details: { color: "red" } },
  { _id: 2, article: 1, details: { color: "blue", size: 44 } },
  { _id: 3, article: 2, details: { color: "blue", size: 44 } }
]

which I want to transform with a query into:

[
  { article: 1, details: { color: ["red", "blue"], size: [44] } },
  { article: 2, details: { color: ["blue"], size: [44] } }
]

At the moment this is achieved with mapReduce:

db.varieties.mapReduce(map, reduce, { out: { inline: 1 } });

function map() {
  for (var key in this.details) {
    this.details[key] = [this.details[key]];
  }

  emit(this.article, this.details);
}

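function reduce(article, details) {
  var result = {};

  details.forEach(function(detail) {
    for (var key in detail) {
      if (!Array.isArray(result[key])) result[key] = [];
      // map() wraps every value in an array, so merge the entries one by one
      detail[key].forEach(function(value) {
        // push only values not seen yet (concat would not modify the array)
        if (result[key].indexOf(value) === -1) result[key].push(value);
      });
    }
  });
  return result;
}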

However, I would like to do this with the MongoDB aggregation framework, because the mapReduce implementation in my environment is very "difficult".

With the aggregation framework, this is as far as I have got:

var pipeline = [];

pipeline.push({ $project: { article: 1, details: 1 } });
pipeline.push({ $group: { _id: "$article", details: { $push: '$details' } } });

db.varieties.aggregate(pipeline);

However, this just returns:

[
  { _id: 1, details: [{ color: "red" }, { color: "blue", size: 44 }] },
  { _id: 2, details: [{ color: "blue", size: 44 }] }
]

I read somewhere that this is a use case for $unwind. Unfortunately, $unwind does not work on objects.

So let's get to my questions:

  1. Is it possible to convert the details object into an array of entries like { key: "color", value: "red" }, and if so, how?
  2. If that is not possible and I restructured my documents to store details in the above format (as an array), how would I need to complete my aggregation to get the same result as my original mapReduce?

I cannot hardcode the keys of details. The aggregation must work on details with unknown keys.

Upvotes: 2

Views: 4402

Answers (1)

Neil Lunn

Reputation: 151122

You are better off using the aggregation framework, provided the keys under "details" are known in advance:

db.colors.aggregate([
    { "$group": {
        "_id": "$article",
        "color": {"$addToSet": "$details.color" },
        "size": { "$addToSet": "$details.size" }
    }},
    { "$project": {
        "details": {
            "color": "$color",
            "size": "$size"
        }
    }}
])

Produces:

{ "_id" : 2, "details" : { "color" : [ "blue" ], "size" : [ 44 ] } }
{ "_id" : 1, "details" : { "color" : [ "blue", "red" ], "size" : [ 44 ] } }

So you cannot have those keys nested under "details" in the $group stage, but you can always $project them into the form you want in the result.

The aggregation framework is a native-code implementation and runs much faster than mapReduce, which is driven by the JavaScript interpreter.

If you really need the flexibility, the concept is similar with mapReduce. It simply takes a bit longer to run, but it will work with different keys under "details":

db.colors.mapReduce(
  function () {
    emit( this.article, this.details );
  },
  function (key,values) {

      var reduced = {
      };

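      values.forEach(function(value) {
        for ( var k in value ) {
          if ( !reduced.hasOwnProperty(k) )
            reduced[k] = [];
          // value[k] is a scalar from emit, or an array when reduce is
          // re-run on its own earlier output, so normalize it to an array
          [].concat( value[k] ).forEach(function(v) {
            if ( reduced[k].indexOf( v ) == -1 )
              reduced[k].push( v );
          });
        }

      });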

      return reduced;

  },
  {
      "finalize": function(key,value) {

        for (var k in value) {
          if ( Object.prototype.toString.call( value[k] ) !== '[object Array]') {
            var replace = [];
            replace.push( value[k] );
            value[k] = replace;
          }

        }

        return value;
      },
      "out": { "inline": 1 }
  }
)
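
The finalize step is there because MongoDB does not call reduce for a key that has only one emitted value, so article 2 would otherwise come back with plain scalars instead of arrays. The following is a minimal plain-JavaScript sketch of that behaviour using the three documents from the question. The simulateMapReduce helper is hypothetical, a simplified stand-in for the server written for illustration only, and it runs in Node without a database.

```javascript
// Sample documents from the question.
var docs = [
  { _id: 1, article: 1, details: { color: "red" } },
  { _id: 2, article: 1, details: { color: "blue", size: 44 } },
  { _id: 3, article: 2, details: { color: "blue", size: 44 } }
];

// Collect the distinct values per key, accepting scalars or arrays.
function reduce(key, values) {
  var reduced = {};
  values.forEach(function (value) {
    for (var k in value) {
      if (!reduced.hasOwnProperty(k)) reduced[k] = [];
      [].concat(value[k]).forEach(function (v) {
        if (reduced[k].indexOf(v) === -1) reduced[k].push(v);
      });
    }
  });
  return reduced;
}

// Wrap any value that is still a scalar in an array.
function finalize(key, value) {
  for (var k in value) {
    if (!Array.isArray(value[k])) value[k] = [value[k]];
  }
  return value;
}

// Hypothetical stand-in for the server: group the emitted values by key
// and call reduce only when a key has more than one value.
function simulateMapReduce(input) {
  var emitted = {};
  input.forEach(function (doc) {
    (emitted[doc.article] = emitted[doc.article] || []).push(doc.details);
  });
  return Object.keys(emitted).map(function (key) {
    var values = emitted[key];
    var value = values.length > 1 ? reduce(key, values) : values[0];
    return { _id: Number(key), value: finalize(key, value) };
  });
}

var results = simulateMapReduce(docs);
console.log(JSON.stringify(results));
```

In this sketch article 2 skips reduce entirely and only gets its arrays from finalize. The order of values inside each array is not guaranteed on a real server.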

The result comes back in mapReduce's own shape, so the grouped data sits under "value" rather than "details":

{ "_id" : 1, "value" : { "color" : [ "blue", "red" ], "size" : [ 44 ] } }
{ "_id" : 2, "value" : { "color" : [ "blue" ], "size" : [ 44 ] } }
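
On newer servers there may be a way to keep this in the aggregation framework even with unknown keys. MongoDB 3.4.4 and later provide the $objectToArray and $arrayToObject operators, which convert an object to an array of { k, v } pairs and back. The pipeline below is a sketch under that version assumption, using the varieties collection name from the question. It has not been run against the asker's data.

```javascript
// Sketch only: requires MongoDB 3.4.4+ for $objectToArray / $arrayToObject.
var pipeline = [
  // Turn { color: "red" } into [ { k: "color", v: "red" } ].
  { $project: { article: 1, details: { $objectToArray: "$details" } } },
  // Produce one document per key/value pair.
  { $unwind: "$details" },
  // Collect the distinct values for each (article, key) pair.
  { $group: {
      _id: { article: "$article", k: "$details.k" },
      v: { $addToSet: "$details.v" }
  } },
  // Reassemble the pairs for each article.
  { $group: {
      _id: "$_id.article",
      details: { $push: { k: "$_id.k", v: "$v" } }
  } },
  // Convert the k/v pairs back into an object.
  { $project: { _id: 0, article: "$_id", details: { $arrayToObject: "$details" } } }
];

// Only runs inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  db.varieties.aggregate(pipeline).forEach(printjson);
}
```

Grouping twice is a deliberate choice here: the first $group removes duplicate values per key, and the second gathers the keys back under each article.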

Upvotes: 3
