Reputation: 4477
I have a query using pymongo that is outputting some values based on the following:
cursor = db.collect.find({"index_field":{"$regex":'\s'}}
for document in cursor:
print document["_id"]
Now this query has been running for a long time (over 500 million documents) as I expected. I was wondering though if there is a way to check where the query is in its execution by perhaps finding out where the last printed "_id" is in the indexed field. Like is the last printed _id halfway through the btree index? Is it near the end?
I want to know this just to see if I should cancel the query and reoptimize and/or let it finish, but I have no way of knowing where the _id exists in the query.
Also, if anyone has a way to optimize my whitespace query, that would be helpful to. Based on the doc, it seems if I would of used ignorecase it would of been faster, although it doesn't make sense for whitespace checking.
Thanks so much, J
Upvotes: 0
Views: 121
Reputation: 877
Your query cannot be optimized, because it's an inefficient$regex
search that's looking for the space \s
in the the document. What you can do, is to search $regex
for a prefix of \s
, e.g.
db.collect.find({"index_field": {"$regex": '^\\s'}})
Check out the notes in the link
$regex
can only use an index efficiently when the regular expression has an anchor for the beginning (i.e. ^) of a string and is a case-sensitive match. Additionally, while/^a/
,/^a.*/
, and/^a.*$/
match equivalent strings, they have different performance characteristics. All of these expressions use an index if an appropriate index exists; however,/^a.*/
, and/^a.*$/
are slower./^a/
can stop scanning after matching the prefix.
Use db.currentOp()
to get info on all of your running ops.
Upvotes: 1