Tasos
Tasos

Reputation: 7587

MongoDB query in pymongo with sort feature

I am new in MongoDB and I am trying to create a query.

I have a list, for example: mylist = [a,b,c,d,e] My dataset has one key with a similar list: mydatalist = [b,d,g,e]

I want to create a query that will return all the data that contains at least one from the mylist.

What I have done.

query = {'mydatalist': {'$in': mylist}}
selector = {'_id':1,'name':1}
mydata = collection.find(query,selector)

That's work perfect. The only thing I want to do and I cannot is to sort the results in base of the number of mylist data they have in the mydatalist. Is there any way to do this in the query or I have to do it manually after in the cursor?

Update with an example:

mylist = [a,b,c,d,e,f,g]

#data from collection
data1[mydatalist] = [a,b,k,l] #2 items from mylist
data2[mydatalist] = [b,c,d,e,m] #4items from mylist
data3[mydatalist] = [a,u,i] #1 item from mylist

So, I want the results to be sorted as data2 -> data1 -> data3

Upvotes: 0

Views: 234

Answers (1)

Neil Lunn
Neil Lunn

Reputation: 151200

So you want the results sorted by the number of matches to your array selection. Not a simple thing for a find but this can be done with the aggregation framework:

 db.collection.aggregate([
    // Match your selection to minimise the 
    {$match: {list: {$in: ['a','b','c','d','e','f','g']}}},

    // Projection trick, keep the original document
    {$project: {_id: {_id: "$_id", list: "$list" }, list: 1}},

    // Unwind the array
    {$unwind: "$list"},

    // Match only the elements you want
    {$match: {list: {$in: ['a','b','c','d','e','f','g']}}},

    // Sum up the count of matches
    {$group: {_id: "$_id", count: {$sum: 1}}},

    // Order by count descending
    {$sort: {count: -1 }},

    // Clean up the response, however you want
    {$project: { _id: 0, _id: "$_id._id", list: "$_id.list", count: 1 }}
])

And there you have your documents in the order you want:

{
    "result" : [
        {
            "_id" : ObjectId("5305bc2dff79d25620079105"),
            "count" : 4,
            "list" : ["b","c","d","e","m"]
        },
        {
            "_id" : ObjectId("5305bbfbff79d25620079104"),
            "count" : 2,
            "list" : ["a","b","k","l"]
        },
        {
            "_id" : ObjectId("5305bc41ff79d25620079106"),
            "count" : 1,
            "list" : ["a","u","i"]
        }
    ],
    "ok" : 1
}

Also, it is probably worth mentioning that aggregate in all recent driver versions will return a cursor just as is the case with find. Currently this is emulated by the driver, but as of version 2.6 it will really be for real. This makes aggregate a very valid "swap-in" replacement for find in your implemented calls.

Upvotes: 2

Related Questions