Reputation: 1865
I have some documents in a MongoDB collection with this schema:
{
"_id": {
"$oid": "60c1e8e318afd80016ce58b1"
},
"searchPriority": 1,
"isLive": false,
"vehicleCondition": "USED",
"vehicleDetails": {
"city": "Delhi"
}
},
{
"_id": {
"$oid": "60c1f2f418afd80016ce58b5"
},
"searchPriority": 2,
"isLive": false,
"vehicleCondition": "USED",
"vehicleDetails": {
"city": "Delhi"
}
},
{
"_id": {
"$oid": "60cb429eadd33c00139d2be7"
},
"searchPriority": 1,
"isLive": false,
"vehicleCondition": "USED",
"vehicleDetails": {
"city": "Gurugram"
}
},
{
"_id": {
"$oid": "60c21be618afd80016ce5905"
},
"searchPriority": 2,
"isLive": false,
"vehicleCondition": "USED",
"vehicleDetails": {
"city": "New Delhi"
}
},
{
"_id": {
"$oid": "60e306d29e452d00134b978f"
},
"searchPriority": 3,
"isLive": false,
"vehicleCondition": "USED",
"vehicleDetails": {
"city": "New Delhi"
}
}
vehicleCondition can be NEW or USED, isLive can be true or false, and searchPriority will be an integer between 1 and 3 (a lower number means the document should appear higher in the search results).
Except for _id, none of the fields are unique. I have created a compound index on isLive, vehicleDetails.city and searchPriority.
In my application I will perform queries of this form: isLive is true, vehicleDetails.city is either Delhi, New Delhi or Gurugram, and vehicleCondition is USED (or NEW).
For this, I can do a find query like this:
db.collection.find({"isLive": true, "vehicleDetails.city": { $in: [ "Gurugram", "Delhi", "New Delhi" ] }, "vehicleCondition": "USED" }, {})
I want the results of this query sorted in this order:
1. All cars belonging to the 1st city inside the $in array in the find query, having lowest priority
2. All cars belonging to the 1st city inside the $in array in the find query, having 2nd lowest priority
3. All cars belonging to the 1st city inside the $in array in the find query, having 3rd lowest priority
4. All cars belonging to the 2nd city inside the $in array in the find query, having lowest priority
5. All cars belonging to the 2nd city inside the $in array in the find query, having 2nd lowest priority
6. All cars belonging to the 2nd city inside the $in array in the find query, having 3rd lowest priority
7. All cars belonging to the 3rd city inside the $in array in the find query, having lowest priority
8. All cars belonging to the 3rd city inside the $in array in the find query, having 2nd lowest priority
9. All cars belonging to the 3rd city inside the $in array in the find query, having 3rd lowest priority
How can I do this? Since the number of documents returned by this query could be very large, I will be using pagination to limit the number of returned documents. Will this extra requirement have any effect on the possible solutions to this problem?
Upvotes: 5
Views: 1745
Reputation: 688
I hope this would work for you
const x = ["Gurugram", "Delhi", "New Delhi"];
db.collection.aggregate([
  {
    $match: {
      "isLive": true,
      "vehicleDetails.city": { $in: x },
      "vehicleCondition": "USED"
    }
  },
  {
    // keep the original fields and add each document's position in x
    $project: {
      "_id": 1,
      "searchPriority": 1,
      "isLive": 1,
      "vehicleCondition": 1,
      "vehicleDetails": 1,
      index: { $indexOfArray: [x, "$vehicleDetails.city"] }
    }
  },
  // order by city position first, then by searchPriority
  { $sort: { index: 1, searchPriority: 1 } },
  // drop the helper field from the output
  { $project: { "index": 0 } }
]).toArray();
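To make the intended ordering concrete, here is a small in-memory sketch in plain JavaScript (no MongoDB involved) of what the $indexOfArray and $sort stages compute. The helper name sortByCityThenPriority is made up for illustration, and the documents are trimmed-down versions of the ones in the question with placeholder _id values.

```javascript
// In-memory illustration of the ordering produced by $indexOfArray + $sort.
// sortByCityThenPriority is a hypothetical helper, not a MongoDB API.
function sortByCityThenPriority(docs, cities) {
  return [...docs].sort((a, b) => {
    const cityDiff =
      cities.indexOf(a.vehicleDetails.city) - cities.indexOf(b.vehicleDetails.city);
    // Same city: fall back to searchPriority (lower number comes first).
    return cityDiff !== 0 ? cityDiff : a.searchPriority - b.searchPriority;
  });
}

// Placeholder _id values; the real documents use ObjectIds.
const docs = [
  { _id: 'a', searchPriority: 1, vehicleDetails: { city: 'Delhi' } },
  { _id: 'b', searchPriority: 2, vehicleDetails: { city: 'Delhi' } },
  { _id: 'c', searchPriority: 1, vehicleDetails: { city: 'Gurugram' } },
  { _id: 'd', searchPriority: 2, vehicleDetails: { city: 'New Delhi' } },
  { _id: 'e', searchPriority: 3, vehicleDetails: { city: 'New Delhi' } },
];

const sortedIds = sortByCityThenPriority(docs, ['Gurugram', 'Delhi', 'New Delhi'])
  .map((d) => d._id);
console.log(sortedIds); // [ 'c', 'a', 'b', 'd', 'e' ]
```

The Gurugram document comes first, then the two Delhi documents by priority, then the two New Delhi documents, which is the order the question asks for.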
Upvotes: 1
Reputation: 22316
I've read the other answer (which gives a technical solution); however, based on your comments and requirements, it is not suitable.
Using aggregate here technically solves the problem, but it has some issues.
As you mentioned, the query can match a large number of documents. Unlike the find method, the aggregation pipeline does indeed have to read and sort ALL of them (the sort is on a computed field, so no index can help), which will inevitably cause performance issues. I also saw you mention something about not having an index; this will cause a collection scan for every single API call.
What I recommend you do is:
First, you absolutely must build a compound index on isLive, vehicleCondition and vehicleDetails.city in case you don't have one. This is simply a must for high-scale usage.
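As a rough sketch, the index could be declared like this. Appending searchPriority as a fourth key is my own assumption rather than part of the recommendation above; the idea is that when a query matches a single city by equality, a sort on searchPriority can then be served from the same index. The collection name is the placeholder used throughout this thread.

```javascript
// Key order: the three equality-matched fields first, then the sort field.
// searchPriority is an assumption added for the per-city sort; drop it to
// get exactly the three-field index recommended above.
const indexKeys = {
  isLive: 1,
  vehicleCondition: 1,
  'vehicleDetails.city': 1,
  searchPriority: 1,
};

// In mongosh, db is predefined; anywhere else this call is skipped.
if (typeof db !== 'undefined') {
  db.collection.createIndex(indexKeys);
}
console.log(Object.keys(indexKeys));
```

Running explain() on the per-city query is the way to confirm whether the index is actually used for both the match and the sort.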
Now that we've got that out of the way, I recommend you split your call into several pieces. I'm going to paste some pseudo code that might look a little all over the place, but I do believe this is the best approach you can achieve using Mongo, as each of these queries is supposed to be efficient by using the previously built index.
I will briefly explain the methodology. We want to be able to query each city independently of the other cities, so we can apply the "custom sort" without needing to load all the matches into memory.
To do so, we need to know the "skip" and "limit" for each city. For example, the limit for city #2 (Delhi) will be (limit - city #1 (Gurugram) matches).
Here is the pseudo code. I left it simple on purpose so it is understandable; I will, however, add some ideas at the end for some basic improvements.
const limit = 10; // determined by req?
const skip = 0; // determined by req?
const cities = ['Gurugram', 'Delhi', 'New Delhi'];
// we need the counts to resolve the proper skip / limit. the last city is not relevant.
const countPromises = [];
for (let i = 0; i < cities.length - 1; i++) {
  countPromises.push(db.collection.countDocuments({
    'isLive': true,
    'vehicleDetails.city': cities[i],
    'vehicleCondition': 'USED',
  }));
}
// Promise.all resolves to the counts, in the same order as the cities.
const counts = await Promise.all(countPromises);
// first city initial skip
const citySkips = [skip];
for (let i = 0; i < counts.length; i++) {
  // if city i has x results, the next city only skips what is left after those x.
  citySkips.push(Math.max(citySkips[i] - counts[i], 0));
}
let finalResults = [];
for (let i = 0; i < cities.length; i++) {
  // we skip over ALL city i results (no count is kept for the last city).
  if (i < counts.length && citySkips[i] >= counts[i]) {
    continue;
  }
  const cityLimit = limit - finalResults.length;
  if (cityLimit <= 0) {
    break;
  }
  const cityResults = await db.collection.find({
    'isLive': true,
    'vehicleDetails.city': cities[i],
    'vehicleCondition': 'USED',
  }).sort({ searchPriority: 1 }).skip(citySkips[i]).limit(cityLimit).toArray();
  finalResults = finalResults.concat(cityResults);
}
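The skip/limit arithmetic is the part that is easiest to get wrong, so here is a self-contained sketch of it with no database calls. The helper name planCityPages is hypothetical, and unlike the pseudo code above it assumes a count is supplied for every city (Infinity can be passed for the last one to avoid counting it).

```javascript
// Hypothetical helper: given the match count of each city (in $in order),
// work out which slice of each city falls inside the requested page.
function planCityPages(counts, skip, limit) {
  const plan = [];
  let remainingSkip = skip;
  let remainingLimit = limit;
  for (let i = 0; i < counts.length && remainingLimit > 0; i++) {
    if (remainingSkip >= counts[i]) {
      // The page starts after this city: consume its documents from the skip.
      remainingSkip -= counts[i];
      continue;
    }
    const take = Math.min(counts[i] - remainingSkip, remainingLimit);
    plan.push({ cityIndex: i, skip: remainingSkip, limit: take });
    remainingSkip = 0;
    remainingLimit -= take;
  }
  return plan;
}

// Example: 12 Gurugram, 5 Delhi and 30 New Delhi matches; skip 10, limit 10.
const plan = planCityPages([12, 5, 30], 10, 10);
console.log(plan);
// [ { cityIndex: 0, skip: 10, limit: 2 },
//   { cityIndex: 1, skip: 0, limit: 5 },
//   { cityIndex: 2, skip: 0, limit: 3 } ]
```

Each plan entry maps to one find(...).sort(...).skip(...).limit(...) call for that city. Because the entries do not depend on each other's results, those calls could also be issued concurrently and concatenated in plan order.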
OK, so possible improvements you can make:
The countDocuments part, which is needed to determine the skip and limit of each of the cities.
The for loop can be a Promise.all, similar to the count, to speed up results. Again, if the number of cities is never too high, this could be a good solution.
Upvotes: 1
Reputation: 75984
You could add a custom sort-order field after the match and sort the documents by the order of the input cities and then by search priority. Since it is a calculated field, the index will not be used for sorting.
You can add pagination at the end, just as you would for any other query.
Something like
db.collection.aggregate([
  {
    $match: {
      "isLive": true,
      "vehicleDetails.city": {
        $in: ["Gurugram", "Delhi", "New Delhi"]
      },
      "vehicleCondition": "USED"
    }
  },
  {
    // position of each document's city in the input list
    "$addFields": {
      "cityIndex": {
        "$indexOfArray": [
          ["Gurugram", "Delhi", "New Delhi"],
          "$vehicleDetails.city"
        ]
      }
    }
  },
  {
    // the priority field in the question's schema is searchPriority
    $sort: {
      cityIndex: 1,
      searchPriority: 1
    }
  },
  {
    $project: {
      cityIndex: 0
    }
  }
])
Working example can be found here - https://mongoplayground.net/p/16-YOkPotLX
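For the pagination part, a minimal sketch of how the stages could be assembled per page is shown below. The helper buildPagedPipeline, its zero-based page parameter, and the extra _id tie-breaker in the sort are my own assumptions; the tie-breaker is there because searchPriority is not unique, so without it documents with equal keys could move between pages.

```javascript
// Hypothetical helper that builds the pipeline for one page of results.
// page is zero-based (assumption); pageSize is the number of documents per page.
function buildPagedPipeline(cities, vehicleCondition, page, pageSize) {
  return [
    { $match: { isLive: true, 'vehicleDetails.city': { $in: cities }, vehicleCondition } },
    { $addFields: { cityIndex: { $indexOfArray: [cities, '$vehicleDetails.city'] } } },
    // _id is added as a tie-breaker so the order is stable across pages.
    { $sort: { cityIndex: 1, searchPriority: 1, _id: 1 } },
    { $skip: page * pageSize },
    { $limit: pageSize },
    { $project: { cityIndex: 0 } },
  ];
}

const pipeline = buildPagedPipeline(['Gurugram', 'Delhi', 'New Delhi'], 'USED', 2, 20);
console.log(JSON.stringify(pipeline, null, 2));
// With a real connection this array would be passed to db.collection.aggregate(pipeline).
```

The $skip and $limit stages come after $sort, so every page is cut from the same fully ordered result.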
Upvotes: 1