Reputation: 951
I'm using a MongoDB aggregation query to fetch data from a collection. It contains many $lookup stages, since it joins many collections. The issue occurs when the number of records matching the search is smaller than the limit specified in the query.
const leads = await lead.aggregate([
  { $sort: { createdAt: -1 } },
  {
    $lookup: {
      from: '401leaddetails',
      localField: 'LeadDetailsId',
      foreignField: '_id',
      as: 'LeadDetailsId'
    }
  },
  { $unwind: { path: '$LeadDetailsId' } },
  {
    $lookup: {
      from: '401leadsources',
      localField: 'leadSourceId',
      foreignField: '_id',
      as: 'leadSourceId'
    }
  },
  { $unwind: { path: '$leadSourceId', preserveNullAndEmptyArrays: true } },
  {
    $lookup: {
      from: '401leadstatuses',
      localField: 'status',
      foreignField: '_id',
      as: 'status'
    }
  },
  { $unwind: { path: '$status', preserveNullAndEmptyArrays: true } },
  {
    $lookup: {
      from: '401leadstatuses',
      localField: 'subStatus',
      foreignField: '_id',
      as: 'subStatus'
    }
  },
  { $unwind: { path: '$subStatus', preserveNullAndEmptyArrays: true } },
  {
    $lookup: {
      from: '401queues',
      localField: 'queueId',
      foreignField: '_id',
      as: 'queueId'
    }
  },
  { $unwind: { path: '$queueId', preserveNullAndEmptyArrays: true } },
  {
    $lookup: {
      from: '401staffs',
      localField: 'assignedTo',
      foreignField: '_id',
      as: 'assignedTo'
    }
  },
  { $unwind: { path: '$assignedTo', preserveNullAndEmptyArrays: true } },
  {
    $lookup: {
      from: '401departments',
      localField: 'assignedTo.departmentId',
      foreignField: '_id',
      as: 'assignedTo.departmentId'
    }
  },
  { $unwind: { path: '$assignedTo.departmentId', preserveNullAndEmptyArrays: true } },
  {
    $lookup: {
      from: '401tiers',
      localField: 'tierId',
      foreignField: '_id',
      as: 'tierId'
    }
  },
  { $unwind: { path: '$tierId', preserveNullAndEmptyArrays: true } },
  {
    $lookup: {
      from: '401staffs',
      localField: 'salesOwner',
      foreignField: '_id',
      as: 'salesOwner'
    }
  },
  { $unwind: { path: '$salesOwner', preserveNullAndEmptyArrays: true } },
  {
    $lookup: {
      from: '401staffs',
      localField: 'bdcOwner',
      foreignField: '_id',
      as: 'bdcOwner'
    }
  },
  { $unwind: { path: '$bdcOwner', preserveNullAndEmptyArrays: true } },
  {
    $lookup: {
      from: '401departments',
      localField: 'bdcOwner.departmentId',
      foreignField: '_id',
      as: 'bdcOwner.departmentId'
    }
  },
  { $unwind: { path: '$bdcOwner.departmentId', preserveNullAndEmptyArrays: true } },
  {
    $lookup: {
      from: '401organizations',
      localField: 'bdcOwnerOrganizationId',
      foreignField: '_id',
      as: 'bdcOwnerOrganizationId'
    }
  },
  { $unwind: { path: '$bdcOwnerOrganizationId', preserveNullAndEmptyArrays: true } },
  {
    $lookup: {
      from: '401organizations',
      localField: 'salesOwnerOrganizationId',
      foreignField: '_id',
      as: 'salesOwnerOrganizationId'
    }
  },
  { $unwind: { path: '$salesOwnerOrganizationId', preserveNullAndEmptyArrays: true } },
  {
    $match: {
      ...qry,
      isDeleted: false,
      leadStatus: { $ne: 4 }
    }
  },
  { $skip: skip },
  { $limit: limit }
]).allowDiskUse(true);
Here the filter qry is specified like below:
qry = {
$or: [
{
'salesOwnerOrganizationId.name': {
$regex: name,
$options: 'i',
},
},
{
'bdcOwnerOrganizationId.name': {
$regex: name,
$options: 'i',
},
},
{
'LeadDetailsId.firstName': {
$regex: name,
$options: 'i',
},
},
{
'LeadDetailsId.lastName': {
$regex: name,
$options: 'i',
},
},
{
'LeadDetailsId.email': {
$regex: name,
$options: 'i',
},
},
{
'LeadDetailsId.phone': {
$regex: name,
$options: 'i',
},
},
{
'LeadDetailsId.zipCode': {
$regex: name,
$options: 'i',
},
},
{
leadId: {
$regex: name,
$options: 'i',
},
},
],
};
This works fine when there are more than 10 matching records (10 being the limit specified); when there are fewer, it takes about 25.6 s to return the result. The collection is indexed. Am I missing something?
Upvotes: 1
Views: 217
Reputation: 22296
This has nothing to do with your query.
Let me first present an extreme example that will make this issue clear.
Imagine you have 1 million documents in your database, each identical and of the form {x: 5}.
Now, if I query db.collection.findOne({x: 5}), Mongo will immediately fetch the first document it sees and finish the query. But what happens if I query db.collection.findOne({x: 6})? Mongo will have to scan the entire collection; only then can it be sure x: 6 does not exist.
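That scan behavior can be sketched with a toy in-memory simulation, just plain Node arrays standing in for the collection (findOne here is a hypothetical helper, not the driver's method):

```javascript
// Toy model: how many documents a scan must examine before it can answer.
const docs = Array.from({ length: 1_000_000 }, () => ({ x: 5 }));

// Scan until the first match, counting documents examined along the way.
function findOne(collection, predicate) {
  let examined = 0;
  for (const doc of collection) {
    examined++;
    if (predicate(doc)) return { doc, examined };
  }
  return { doc: null, examined }; // no match anywhere: the whole array was scanned
}

const hit = findOne(docs, d => d.x === 5);
console.log(hit.examined);  // 1 — the very first document matches, scan stops

const miss = findOne(docs, d => d.x === 6);
console.log(miss.examined); // 1000000 — every document must be checked to rule x: 6 out
```

The absence of a match is only provable by exhausting the search space, which is why the "no results" case is the expensive one.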
Now let's go back to your problem: it is exactly the same case, just with "real life" data and "real life" value distribution. When there are fewer matches than the limit, Mongo either has to scan the entire collection to make sure there are no more matches, or, if the query is covered, fully scan the index tree.
An easy way to do a sanity check and replicate the result is to pick a query that's "fast", say one with 1k results, and run it with skip: 999 and then limit: 10. That will show basically the same behavior you're experiencing with your "slow" query.
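The skip/limit sanity check can be mimicked with the same kind of toy simulation (again plain Node, with a hypothetical findPage helper rather than any MongoDB API): a paged scan stops early only once it has collected skip + limit matches, so when fewer matches exist, it runs to the end of the collection.

```javascript
// Toy model of skip + limit: stop scanning only once the page is full.
function findPage(collection, predicate, skip, limit) {
  const results = [];
  let examined = 0;
  let matched = 0;
  for (const doc of collection) {
    examined++;
    if (predicate(doc)) {
      matched++;
      if (matched > skip) results.push(doc); // past the skipped matches: keep it
      if (results.length === limit) break;   // page full: stop scanning early
    }
  }
  return { results, examined };
}

// 1,000,000 documents, of which only the first 1,000 match.
const docs = Array.from({ length: 1_000_000 }, (_, i) => ({ match: i < 1000 }));

// Plenty of matches ahead of the page: the scan stops almost immediately.
const firstPage = findPage(docs, d => d.match, 0, 10);
console.log(firstPage.examined); // 10

// skip: 999, limit: 10 needs 1,009 matches but only 1,000 exist,
// so the scan must run to the very end to prove the page can't be filled.
const lastPage = findPage(docs, d => d.match, 999, 10);
console.log(lastPage.examined);  // 1000000
```

The second call returns a single result yet examines the whole collection, which is the same shape as the asker's slow query: fewer matches than the limit forces a full scan.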
Upvotes: 2