Reputation: 443
Let's suppose I have a webPages
collection in mongoDB with following document structure inside:
{
_id: ObjectId,
url: String,
title: String,
links: [ComplexObject]
}
where ComplexObject
is a complex object, with fields and lists inside (details are not important).
Question: what type of index & query combination should I use to be able to query those documents where links
field is not null
?
Note that collection has millions of documents, so the query has to use an index, otherwise it will be slow.
Also note that if not necessary, I don't want to have an index on the field links
(besides the index which speeds up the non-null query) as it can be expensive/big and I don't need it.
Upvotes: 0
Views: 484
Reputation: 1856
As per your query link:[ComplexObject]
contains a complex object and have one to many relationships.
if it is complex you should go for a new collection referencing a root collection, because increase in size of single object of collection will slow down you query eventually. and of course it will be difficult to achieve what is desired.
Eg:
{
_id: ObjectId,
url: String,
title: String,
}
{
_id: ObjectId,
webpage_id: ObjectId // indexed
....
}
what type of index & query combination should I use to be able to query those documents where links field is not null ?
Answer: in new approach i don't think you have to check for null object.
but for older approach you can go for $exists
keyword in simple queries OR $ifNull pipeline
with $size pipeline
in aggregation.
index like compound index will help a lot in your structure.
Note: Do not store null value in property either $set
it or $unset
it, then $exists will work.
Eg.
db.collection.createIndex( { "links.property1": 1, "links.property2": 1 } );
webPages.find({ $exists: { link: true }});
https://docs.mongodb.com/manual/reference/operator/query/exists/
to check null values aggregation has a pipeline call $ifNull
webPages.aggregate([
{
$addFields: {
isExists: {
$ifNull: [ "$links", 0]
}
}
},
{
$match: {
// validate isExists
}
}
]);
webPages.aggregate([[ { $addFields: { arrayLength: { $size: "$links" } } }, { $match: { // validate arrayLength } } ]);
there are alot way to handle such stuff.
Upvotes: 0