Ian
Ian

Reputation: 34509

How to aggregate percentages within arrays?

I'm trying to work out exactly how to achieve an aggregation, I could manually unwind and group back together at the end, but I'm sure I should be able to achieve this in a more concise way so I wanted to throw it out as I'm getting stuck.

My document structure (skipping out the un-interesting bits) looks like:

{
    _id: ObjectId,
    panels: [
        {
            visConfig: {
                 dataConfig: {
                     columns: [
                         { element: "DX" },
                         { element: "SE" },
                     ]
                 }
            }
        },
        {
            visConfig: {
                 dataConfig: {
                     columns: [
                         { element: "AB" },
                         { element: "XY" },
                     ]
                 }
            }
        }
    ]
}

What I want to do is calculate a percentage of the element overlaps with a given set to be provided. So for example for the document shown it would produce 25% for the set ["DX"] or 50% for the set ["DX", "AB"].

So I've tried a few things, I think I've settled on the nearest so far as:

$project: {
   _id: 1,
   total: { $sum: { $size: "$panels.visConfig.dataConfig.columns" } }
}

But I'm getting an error here which I don't understand:

The argument to $size must be an array, but was of type: missing

Then I'm also having issues with my conditional aggregation which seems to be returning 0 for all of the element values.

{
  _id: 1,
  "panels.visConfig.dataConfig.columns.element": { 
      $sum: {
        $cond: [{
          $setIsSubset: [
            ["DX"], ["$panels.visConfig.dataConfig.columns.element"]
          ] 
        }, 1, 0 ],
      }
  },
}

Upvotes: 3

Views: 319

Answers (3)

s7vr
s7vr

Reputation: 75964

You can try below aggregation in 3.4 version.

db.colname.aggregate([
{"$project":{
  "_id":1,
  "total":{
    "$reduce":{
      "input":"$panels.visConfig.dataConfig.columns.element",
      "initialValue":0,
      "in":{"$add":["$$value",{"$size":"$$this"}]}
    }},
    "match":{
      "$sum":{
        "$map":{
          "input":"$panels.visConfig.dataConfig.columns.element",
          "in":{
            "$size":{
              "$setIntersection":[["DX","AB"],"$$this"]
            }
          }
        }
      }
    }
}},
{"$project":{
  "_id":1,
  "percent":{"$multiply":[{"$divide":["$match","$total"]}, 100]}
}}])

Update - You can perform both match and total calculations in $reduce pipeline.

db.colname.aggregate([
{"$project":{
  "_id":1,
  "stats":{
    "$reduce":{
      "input":"$panels.visConfig.dataConfig.columns.element",
      "initialValue":{"total":0,"match":0},
      "in":{
        "total":{"$add":["$$value.total",{"$size":"$$this"}]},
        "match":{"$add":["$$value.match",{"$sum":{"$map":{"input":"$$this","in":{"$cond":[{"$in":["$$this", ["DX","AB"]] }, 1, 0]}}}}]}

       }
    }}
}},
{"$project":{
  "_id":1,
  "percent":{"$multiply":[{"$divide":["$stats.match","$stats.total"]}, 100]}
}}])

Upvotes: 3

Sede
Sede

Reputation: 61253

Well, there are couple of ways to do this, but I these two pipelines show how I would do it.

var values = ["DX", "KL"]
  • First approach

    [
       {
          "$project": {
             "percent": {
                "$let": {
                   "vars": {
                      "allsets": {
                         "$reduce": {
                            "input": "$panels.visConfig.dataConfig.columns",
                            "initialValue": [],
                            "in": {
                               "$concatArrays": [ "$$this.element", "$$value" ]
                            }
                         }
                      }
                   },
                   "in": {
                      "$multiply": [
                         {
                            "$divide": [
                               {
                                  "$size": {
                                     "$setIntersection": [ "$$allsets", values ]
                                  }
                               },
                               { "$size": "$$allsets" }
                            ]
                         },
                         100
                      ]
                   }   
                 }
             }
          }
       }
    ]
    
  • Second approach same idea here but, using one pipeline stage.

    [
       {
          "$project": {
              "percent": {
                "$multiply": [
                   {
                      "$divide": [
                         {
                            "$sum": {
                               "$map": {
                                  "input": "$panels.visConfig.dataConfig.columns.element",
                                  "in": {
                                     "$size": {
                                       "$setIntersection": [ values, "$$this" ]
                                     }
                                  }
                               }
                            }
                         },
                         {
                            "$reduce": {
                               "input": "$panels.visConfig.dataConfig.columns.element",
                               "initialValue": 0,
                               "in": {
                                  "$add": [ "$$value", { "$size": "$$this" } ]
                               }
                            }
                         }
                      ]
                   },
                   100
                ]
             }
          }
       }
    ]
    

Upvotes: 0

mickl
mickl

Reputation: 49975

You can use $map + $reduce to get an array of all element values and then using $divide you can divide $filter-ed $size by total $size:

db.col.aggregate([
    {
        $project: {
            elements: {
                $reduce: {
                    input: { 
                        $map: { 
                            input: "$panels", 
                            as: "panel", 
                            in: "$$panel.visConfig.dataConfig.columns.element" 
                        } 
                    },
                    initialValue: [],
                    in: { $concatArrays: [ "$$this", "$$value" ] }
                }
            }
        }
    },
    {
        $project: {
            percentage: {
                $divide: [ 
                    { 
                        $size: { 
                            $filter: { 
                                input: "$elements", 
                                as: "element", 
                                cond: { 
                                    $in: [ 
                                        "$$element", 
                                        [ "AB", "XY" ] // your input here
                                    ] 
                                } 
                            } 
                        } 
                    },
                    { $size: "$elements" }
                ]
            }
        }
    }
])

Upvotes: 1

Related Questions