ygjack
ygjack

Reputation: 53

How to get multiple average values from matching elements in array

I have collection like below:

[{
    date: '20170721',
    pageUrl: 'page1',
    timing: [{ name: 'dns', duration: 1000 }, { name: 'tcp', duration: 2000 }]
}, {
    date: '20170721',
    pageUrl: 'page2',
    timing: [{ name: 'dns', duration: 1001 }, { name: 'tcp', duration: 1800 }]
}, {
    date: '20170722',
    pageUrl: 'page1',
    timing: [{ name: 'dns', duration: 1021 }, { name: 'tcp', duration: 1700 }]
}, {
    date: '20170722',
    pageUrl: 'page2',
    timing: [{ name: 'dns', duration: 1101 }, { name: 'tcp', duration: 1850 }]
}]

and I want the result of average timing of a given page during a given period of date.

For example: I need average timing data of page1, from date 20170701 - 20170731

And the expected output should like:

[{
    _id: '20170701',
    dns: <avgDuration>,
    tcp: <avgDuration>
}, {
    _id: '20170702',
    dns: <avgDuration>,
    tcp: <avgDuration>
},
...
]

what I tried is, and it didn't work:

db.myCollection.aggregate([
    { $match: { 'pageUrl': targetPageUrl } },
    { $group: {
        _id: '$date',
        dns: { $avg: '$timing.0.duration' },
        tcp: { $avg: '$timing.1.duration' }
    },
    ...
])

Anybody could help? Please

Upvotes: 1

Views: 213

Answers (1)

Neil Lunn
Neil Lunn

Reputation: 151132

If the positions are always "fixed" then you can use $arrayElemAt:

db.myCollection.aggregate([
    { '$match': { 'pageUrl': targetPageUrl } },
    { '$group': {
        _id: '$date',
        dns: { '$avg': { '$arrayElemAt': [ '$timing.duration', 0 ] }  },
        tcp: { '$avg': { '$arrayElemAt': [ '$timing.duration', 1 ] } }
    }}
])

If they are not actually fixed, then use $filter to get the matching values:

db.myCollection.aggregate([
    { '$match': { 'pageUrl': targetPageUrl } },
    { '$group': {
        _id: '$date',
        dns: { 
          '$avg': {
            '$avg': {
              '$map': {
                'input': { 
                 '$filter': {
                   'input': '$timing',
                   'as': 't',
                   'cond': { '$eq': [ '$$t.name', 'dns' ] }
                 },
                 'as': 't',
                 'in': '$$t.duration'
               } 
            }
          }
        },
        tcp: { 
          '$avg': { 
            '$avg': {
              '$map': {
                'input': { 
                 '$filter': {
                   'input': '$timing',
                   'as': 't',
                   'cond': { '$eq': [ '$$t.name', 'tcp' ] }
                 },
                 'as': 't',
                 'in': '$$t.duration'
               }
            }
          }
        }
    }}
])

Being that with $filter it actually is possible to have "multiple matches" in an array and "reduce" them down, using $avg in the mode as "both" an accumulator and something that takes an "array" as an argument itself. Hence the "double" usage of $avg.

Or even using $indexOfArray if you feel you really must:

db.myCollection.aggregate([
    { '$match': { 'pageUrl': targetPageUrl } },
    { '$group': {
        _id: '$date',
        dns: { 
          '$avg': {
            '$arrayElemAt': [
              '$timing.duration',
              { '$indexOfArray': [ '$timing.name', 'dns' ] }
            ]
          }
        },
        tcp: { 
          '$avg': {
            '$arrayElemAt': [
              '$timing.duration',
              { '$indexOfArray': [ '$timing.name', 'tcp' ] }
            ]
          }
        }
    }}
])

Upvotes: 1

Related Questions