Reputation: 2181
I have quite a large MongoDB collection (roughly 30 million documents) and am trying to get the maximum of a nested field, nested.my_time. The MongoDB version is 3.6.6. I've created an index on this field:
{
    'my_index': {
        'sparse': True,
        'v': 2,
        'background': True,
        'key': [('nested.my_time', -1)],
        'ns': 'my_db.my_table'
    }
}
Connection in pymongo:
import pymongo
mclient = pymongo.MongoClient('mongodb://myuri...')
db = mclient['my_db']
my_table = db['my_table']
Queries I tried:
# find_one returns a document, not a cursor, so the hint is passed as a keyword
latest1 = my_table.find_one(
    sort=[('nested.my_time', pymongo.DESCENDING)],
    projection=['nested.my_time'],
    hint='my_index'
)
.. doing a full scan, taking too long.
# aggregate returns a CommandCursor, which has no .hint(); pass the hint as a command option
latest2 = my_table.aggregate([{
    '$sort': {
        'nested.my_time': pymongo.DESCENDING,
    }}, {
    '$limit': 1
}], hint='my_index')
.. doing a full scan as well
# the hint is passed as a command option here as well
latest3 = my_table.aggregate([{
    '$group': {
        '_id': None,
        'latest': {
            '$max': '$nested.my_time'
        }
    }
}], hint='my_index')
.. doing a full scan too.
When I tried just getting a document with a given my_time, it works and uses the index:
from datetime import datetime
foo = my_table.find(
    filter={'nested.my_time': datetime(2019, 2, 4, 6, 57, 4, 534000)}
).limit(1)
.. so the index is clearly there and working. Any ideas how to make mongo use the index for max?
Upvotes: 3
Views: 1298
Reputation: 2094
As you have an index on nested.my_time, a sort followed by a limit should use it. From the shell, with explain in executionStats mode:
db.<coll name>.find().sort({"nested.my_time": -1}).limit(1).explain("executionStats")
or as aggregation without explain:
db.<coll name>.aggregate([{$sort: {"nested.my_time": -1}},{$limit: 1}])
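For PyMongo, the same sort-and-limit idea could look roughly like the sketch below. The helper names find_latest and plan_uses_index are made up for illustration, and it assumes your PyMongo version accepts hint as a keyword argument to find_one. Since the index in the question is sparse, note that MongoDB documents it will skip a sparse index for a sort that would return an incomplete result set unless the index is hinted, which is why the sketch takes an optional index name.

```python
DESCENDING = -1  # same value as pymongo.DESCENDING


def find_latest(collection, field="nested.my_time", index_name=None):
    """Return the largest value stored under `field`, or None if nothing is found.

    `collection` is expected to behave like a pymongo Collection.
    """
    options = {
        "sort": [(field, DESCENDING)],
        "projection": {field: 1, "_id": 0},
    }
    if index_name is not None:
        # Assumption: find_one forwards `hint` to find()
        options["hint"] = index_name
    doc = collection.find_one(**options)

    # Walk the dotted path ("nested.my_time") through the returned document
    value = doc
    for key in field.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def plan_uses_index(explain_output, index_name):
    """Check whether the winning plan of an explain() result scans `index_name`."""
    stack = [explain_output.get("queryPlanner", {}).get("winningPlan", {})]
    while stack:
        stage = stack.pop()
        if stage.get("stage") == "IXSCAN" and stage.get("indexName") == index_name:
            return True
        if "inputStage" in stage:
            stack.append(stage["inputStage"])
        stack.extend(stage.get("inputStages", []))
    return False


# Possible usage against the collection from the question (needs a live server):
#   latest = find_latest(my_table, index_name="my_index")
#   plan = my_table.find().sort("nested.my_time", DESCENDING).limit(1).hint("my_index").explain()
#   print(plan_uses_index(plan, "my_index"))
```

If plan_uses_index reports False and the winning plan shows a COLLSCAN stage, the query is still scanning the whole collection.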
Upvotes: 3