Reputation: 312
I have a MongoDB collection with millions of documents that all have the same fields. For example:
{
"_id" : ObjectId("601ade833126047ee8f47182"),
"file_id" : "60110b7dad0cf20001adcbef",
"versions" : [
{
"local" : 6,
"s3" : "C71rczduuVOPpMohCpCeBQ3_NARDnTRj"
}
]
}
{
"_id" : ObjectId("60221d1039acf39e09fbfca5"),
"file_id" : "5fdb2eb4ad0cf20001f97856",
"versions" : [
{
"local" : 2,
"s3" : "aCy61Gx_UpTZfY59hNLYryGuWTJO2oPk"
}
]
}
{
"_id" : ObjectId("60221dc639acf39e09fbfca6"),
"file_id" : "5fe9c897a675f20001f0a82e",
"versions" : [
{
"local" : 3,
"s3" : "PHLnYjsRlg3GnEQ_UeDkhWIaJbFRmpw9"
}
]
}
{
"_id" : ObjectId("6050cbcd6b7aab2cd3958978"),
"file_id" : "6040ca06a675f2000115985e",
"versions" : [
{
"local" : 2,
"s3" : "vdFY22JFAzU.cD1Xr0eliuwt00rpJC8j"
}
]
}
My question is: if I run collection.find({"file_id": some_string}), MongoDB has to search the whole collection to find the document with the "file_id" I am looking for. Will indexing "file_id" help reduce the execution time? In my case, every document in the collection has the key "file_id". Will indexing really help in this case?
Upvotes: 0
Views: 303
Reputation: 522712
You asked:
Will Indexing "file_id" help to reduce the execution time?
The answer is that, quite possibly, yes: adding an index on the file_id field should dramatically speed up the find query you showed above. Try it yourself to find out:
db.your_collection.createIndex( { "file_id": 1 } )
The above command will, by default, create a B-tree index on the file_id field values. Going into depth about how a B-tree works is out of scope for a single answer, but in summary, if Mongo uses this index to search by file_id, the lookup should run in O(log N) time, where N is the number of documents in your collection. Without any index, your query as written should result in a full collection scan, which is a linear O(N) operation. Put differently, the cost of the scan grows exponentially relative to the cost of the indexed lookup: for a million documents, that is roughly 20 key comparisons versus up to a million document checks.
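To make the difference in growth rates concrete, here is a minimal sketch in plain JavaScript that does not touch MongoDB at all. It assumes a sorted array as a stand-in for the index (a real B-tree is more elaborate, but the lookup cost is still logarithmic), and the data, the function names linearScan and binarySearch, and the zero-padded file_id values are all made up for illustration.

```javascript
// Illustrative only: a sorted array stands in for the index.
// All names and data here are hypothetical, not MongoDB internals.

// Full scan: test each document in turn until file_id matches,
// which is what a collection scan has to do without an index.
function linearScan(docs, target) {
  let probes = 0;
  for (const doc of docs) {
    probes++;
    if (doc.file_id === target) return { found: true, probes };
  }
  return { found: false, probes };
}

// Indexed lookup: binary search over the sorted key list.
// Each loop iteration halves the remaining search range.
function binarySearch(sortedKeys, target) {
  let lo = 0;
  let hi = sortedKeys.length - 1;
  let probes = 0;
  while (lo <= hi) {
    const mid = Math.floor((lo + hi) / 2);
    probes++;
    if (sortedKeys[mid] === target) return { found: true, probes };
    if (sortedKeys[mid] < target) lo = mid + 1;
    else hi = mid - 1;
  }
  return { found: false, probes };
}

// One million fake documents. Ids are zero-padded so that string order
// matches numeric order and the key list is sorted by construction.
const N = 1000000;
const docs = Array.from({ length: N }, (_, i) => ({
  file_id: String(i).padStart(7, "0"),
}));
const sortedKeys = docs.map((d) => d.file_id);

// Worst case for the scan: the matching document is the last one.
const target = docs[N - 1].file_id;
const scan = linearScan(docs, target);
const indexed = binarySearch(sortedKeys, target);

console.log("scan probes:   ", scan.probes);
console.log("indexed probes:", indexed.probes);
```

In this sketch the scan has to look at every one of the million documents, while the binary search never needs more than 20 probes for an array of that size. Real query times also depend on disk access, caching, and whether the query planner actually picks the index, so treat this as an illustration of the trend rather than a benchmark.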
Upvotes: 1