ozk

Reputation: 2022

MongoDB - sort by subdocument match

Say I have a users collection in MongoDB. A typical user document contains a name field, and an array of subdocuments, representing the user's characteristics. Say something like this:

{   
    "name": "Joey",
    "characteristics": [
        {
            "name": "shy",
            "score": 0.8
        },
        {
            "name": "funny",
            "score": 0.6
        },
        {
            "name": "loving",
            "score": 0.01
        }
    ]
}

How can I find the top X funniest users, sorted by how funny they are?

The only way I've found so far is to use the aggregation framework, with a query similar to this:

db.users.aggregate([
    {$project: {"_id": 1, "name": 1, "characteristics": 1, "_characteristics": '$characteristics'}},
    {$unwind: "$_characteristics"},
    {$match: {"_characteristics.name": "funny"}},
    {$sort: {"_characteristics.score": -1}},
    {$limit: 10}
]);

This seems to do exactly what I want. However, according to MongoDB's documentation on using indexes in pipelines, once I call $project or $unwind in an aggregation pipeline, I can no longer use indexes to match or sort the collection, which makes this solution impractical for a very large collection.

Upvotes: 3

Views: 1104

Answers (1)

xlembouras

Reputation: 8295

I think you are halfway there. I would do:

db.users.aggregate([
  {$match: { 'characteristics.name': 'funny' }},
  {$unwind: '$characteristics'},
  {$match: {'characteristics.name': 'funny'}},
  {$project: {_id: 0, name: 1, 'characteristics.score': 1}},
  {$sort: { 'characteristics.score': -1 }},
  {$limit: 10}
])
  • The first $match drops users without the funny characteristic; this stage can easily be indexed.
  • $unwind and the second $match keep only the funny subdocument of each remaining user.
  • $project keeps only the necessary fields.
  • $sort orders the results by the funny score.
  • $limit caps the number of results.

That way you can use an index for the first $match.
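
As a rough sketch, the same stages can be wrapped in a small helper so the characteristic and the result count are parameters. The function name `topByCharacteristic` and the index definition are illustrative assumptions, not part of the original answer; the sort is descending so that the highest scores come first.

```javascript
// Hypothetical helper: builds the aggregation stages for one characteristic.
function topByCharacteristic(characteristic, limit) {
  return [
    // Runs first, so it can use an index on characteristics.name.
    { $match: { "characteristics.name": characteristic } },
    // One output document per subdocument in the array.
    { $unwind: "$characteristics" },
    // Keep only the subdocument we are ranking by.
    { $match: { "characteristics.name": characteristic } },
    { $project: { _id: 0, name: 1, "characteristics.score": 1 } },
    // Descending: highest score first.
    { $sort: { "characteristics.score": -1 } },
    { $limit: limit }
  ];
}

// Assumed index definition supporting the first $match stage.
const characteristicIndex = { "characteristics.name": 1 };

// In the mongo shell, assuming the users collection from the question:
//   db.users.createIndex(characteristicIndex);
//   db.users.aggregate(topByCharacteristic("funny", 10));
```

Only the first stage benefits from the index; the sort still runs in memory over the users that have the characteristic.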

If there are not too many characteristics you are interested in, I think it would be better to structure your documents like this:

{
    "name": "Joey",
    "shy": 0.8,
    "funny": 0.6,
    "loving": 0.01
}

That way you can use an index (sparse or not) to make your life easier!
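
A minimal sketch of how existing documents might be converted to the flat layout, and how the query could then look. The helper name `flattenCharacteristics` is hypothetical, and the sketch assumes no characteristic is itself called "name".

```javascript
// Hypothetical migration helper: turns the nested layout into the flat one.
function flattenCharacteristics(user) {
  const flat = { name: user.name };
  for (const c of user.characteristics || []) {
    // Assumption: a characteristic called "name" would overwrite the
    // user's own name field, so reject it explicitly.
    if (c.name === "name") {
      throw new Error('characteristic "name" collides with the user name field');
    }
    flat[c.name] = c.score;
  }
  return flat;
}

const joey = flattenCharacteristics({
  name: "Joey",
  characteristics: [
    { name: "shy", score: 0.8 },
    { name: "funny", score: 0.6 },
    { name: "loving", score: 0.01 }
  ]
});

// With the flat layout, in the mongo shell (index options are an assumption):
//   db.users.createIndex({ funny: -1 }, { sparse: true });
//   db.users.find({ funny: { $exists: true } }).sort({ funny: -1 }).limit(10);
```

With this layout both the filter and the sort work on a single top-level field, so no $unwind is needed, at the cost of one index per characteristic you want to rank by.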

Upvotes: 1
