jpaddison3
jpaddison3

Reputation: 2006

Top documents per bucket

I would like to get the documents with the N highest fields for each of N categories. For example, the posts with the 3 highest scores from each of the past 3 months. So each month would have 3 posts that "won" for that month.


Here is what my work so far has gotten, simplified.

// simplified
db.posts.aggregate([
  {$bucket: {
      groupBy: "$createdAt",
      boundaries: [
        ISODate('2019-06-01'),
        ISODate('2019-07-01'),
        ISODate('2019-08-01')
      ],
      default: "Other",
      output: {
        posts: {
          $push: {
            // ===
            // This gets all the posts, bucketed by month
            score: '$score',
            title: '$title'
            // ===
          }
        }
      }
    }},
  {$match: {_id: {$ne: "Other"}}}
])

I attempted to use the $slice operator in between the // ===s, but go an error (below).

  postResults: {
    $each: [{
      score: '$score',
      title: '$title'
    }],
    $sort: {score: -1},
    $slice: 2
  }
An object representing an expression must have exactly one field: { $each: [ { score: \"$score\", title: \"$title\" } ], $sort: { baseScore: -1.0 }, $slice: 2.0 }

Upvotes: 2

Views: 52

Answers (1)

mickl
mickl

Reputation: 49975

$slice you're trying to use is dedicated for update operations. To get top N posts you need to run $unwind, then $sort and $group to get ordered array. As a last step you can use $slice (aggregation), try:

db.posts.aggregate([
    {$bucket: {
        groupBy: "$createdAt",
        boundaries: [
            ISODate('2019-06-01'),
            ISODate('2019-07-08'),
            ISODate('2019-08-01')
        ],
        default: "Other",
        output: {
            posts: {
            $push: {            
                score: '$score',
                title: '$title'
            }
            }
        }
    }},
    { $match: {_id: {$ne: "Other"}}},
    { $unwind: "$posts" },
    { $sort: { "posts.score": -1 } },
    { $group: { _id: "$_id", posts: { $push: { "score": "$posts.score", "title": "$posts.title" } } } },
    { $project: { _id: 1, posts: { $slice: [ "$posts", 3 ] } } }
])

Upvotes: 1

Related Questions