Peanuts-83
Peanuts-83

Reputation: 3

Get all values of arrays from mongo documents

I have a mongo collection with documents containing arrays:

   { item: "journal", qty: 25, tags: ["blank", "red"], dim_cm: [ 14, 21 ] },
   { item: "notebook", qty: 50, tags: ["red", "blank"], dim_cm: [ 14, 21 ] },
   { item: "paper", qty: 100, tags: ["red", "blank", "plain"], dim_cm: [ 14, 21 ] },
   { item: "planner", qty: 75, tags: ["blank", "red"], dim_cm: [ 22.85, 30 ] },
   { item: "postcard", qty: 45, tags: ["blue"], dim_cm: [ 10, 15.25 ] }

I would like to get a single array containing all distinct values, such as:

tags: ["blank", "red", blue"] and dim_cm: [14,21,22.85,30,10,15.25]

Is this possible with an aggregation pipeline?

Upvotes: 0

Views: 869

Answers (2)

Takis
Takis

Reputation: 8693

Query

  • put both in one array
  • unwind
  • at group time check the type(string or not) and put to the right group

*here types are different, in case they were the same type, we can do another trick like put the in an array of pairs [[tag,cm] ...] where first would be the tag and the second would be the cm, or array of documents
For perfomance if you test it send how it went if you can

Playmongo

aggregate(
[{"$project": {"tags-dim": {"$concatArrays": ["$tags", "$dim_cm"]}}},
 {"$unwind": "$tags-dim"},
 {"$group": 
   {"_id": null,
    "tags": 
     {"$addToSet": 
       {"$cond": 
         [{"$eq": [{"$type": "$tags-dim"}, "string"]}, "$tags-dim",
          "$$REMOVE"]}},
    "dim_cm": 
     {"$addToSet": 
       {"$cond": 
         [{"$eq": [{"$type": "$tags-dim"}, "string"]}, "$$REMOVE",
          "$tags-dim"]}}}}])

Upvotes: 0

nimrod serok
nimrod serok

Reputation: 16033

You can use $group with $reduce and $setIntersection:

  1. $group all documents to create one array of arrays per key
  2. flatten each array with $reduce and make it a set using $setIntersection.
db.collection.aggregate([
  {$group: {_id: null, tags: {$push: "$tags"}, dim_cm: {$push: "$dim_cm"}}},
  {
    $project: {
      _id: 0,
      tags: {
        $setIntersection: [
          {$reduce: {
              input: "$tags",
              initialValue: [],
              in: {$concatArrays: ["$$value", "$$this"]}
            }
          }
        ]
      },
      dim_cm: {
        $setIntersection: [
          {$reduce: {
              input: "$dim_cm",
              initialValue: [],
              in: {$concatArrays: ["$$value", "$$this"]}
            }
          }
        ]
      }
    }
  }
])

See how it works on the playground example

Another way is:

db.collection.aggregate([
  {$unwind: "$tags"},
  {
    $group: {
      _id: null,
      tags: {$addToSet: "$tags"},
      dim_cm: {$addToSet: "$dim_cm"
    }
  },
  {$unwind: "$dim_cm"},
  {$unwind: "$dim_cm"},
  {
    $group: {
      _id: null,
      tags: {$first: "$tags"},
      dim_cm: {$addToSet: "$dim_cm"}
    }
  }
])

Playground - unwind

Which you can split into two queries which will be much faster:

db.collection.aggregate([
  {$unwind: "$tags"},
  {
    $group: {
      _id: null,
      tags: {$addToSet: "$tags"}
    }
  },
])

A 3rd option is:

db.collection.aggregate([
  {
    $project: {
      _id: 0,
      arr: {
        $concatArrays: [
          {$map: {input: "$tags", as: "item", in: {k: "tag",  v: "$$item"}}},
          {$map: {input: "$dim_cm", as: "item", in: {k: "dim_cm", v: "$$item"}}}
        ]
      }
    }
  },
  {$unwind: "$arr"},
  {
    $group: {
      _id: null,
      tags: {
        $addToSet: {$cond: [{$eq: ["$arr.k", "tag"]}, "$arr.v", "$$REMOVE"]}
      },
      dim_cm: {
        $addToSet: {$cond: [{$eq: ["$arr.k", "dim_cm"]}, "$arr.v", "$$REMOVE"]}
      }
    }
  }
])

Playground 3rd

Upvotes: 2

Related Questions