Reputation: 33648
Please look at the following lines of code and the results:
import pymongo
d1 = {'p': 0.5, 'theta': 100, 'sigma': 20}
d2 = {'theta': 100, 'sigma': 20, 'p': 0.5}
I get the following results:
d1 == d2  # Returns True
collectn.find({'goods.H': d1}).count()  # Returns 33
collectn.find({'goods.H': d2}).count()  # Returns 2
where collectn is a MongoDB collection object.
Is there a setting or a way to query so that I obtain the same results for the above two queries?
Both queries use essentially the same dictionary (in the sense that d1 == d2 is True). What I am trying to do is this: before inserting a record into the database, I check whether a record with the exact same combination of values already exists. If so, I don't want to create a new one. Because of the behavior shown above, the check can report that the record does not exist even when it does, and a duplicate is then added to the database (with a different _id, of course, but with all other values the same, which I would prefer to avoid).
Thank you in advance for your help.
Upvotes: 10
Views: 1328
Reputation: 16705
I think you're looking for the $where operator.
This works in Node:
var myCursor = coll.find({$where: function () {return obj.goods.H == d1}});
myCursor.count(function (err, myCount) {console.log(myCount)});
In Python I believe you'll need to pass in a BSON code object.
The documentation warns that the $where operator should be used as a last resort since it comes with a performance penalty, and can't use indexes.
It seems like it may be worthwhile to establish an ordering of the sub properties, and enforce it if possible on insert or as a post process.
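A minimal sketch of what enforcing an ordering could look like on the Python side. The helper name canonical is hypothetical, and sorting keys alphabetically is just one possible convention; it also assumes the driver writes keys in the mapping's iteration order, and it does not recurse into nested subdocuments.

```python
from collections import OrderedDict


def canonical(sub):
    """Return a copy of `sub` with its keys in sorted order.

    Hypothetical helper: any fixed ordering works, as long as the same
    one is applied both when inserting and when querying.
    """
    return OrderedDict(sorted(sub.items()))


d1 = {'p': 0.5, 'theta': 100, 'sigma': 20}
d2 = {'theta': 100, 'sigma': 20, 'p': 0.5}

# After canonicalization both dictionaries list their keys in the same order.
keys1 = list(canonical(d1))
keys2 = list(canonical(d2))
```

The idea would be to pass canonical(d) both as the stored value of goods.H and as the value in the exact-match query. Records already stored with a different key order would still need to be rewritten once.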
Upvotes: 0
Reputation: 4021
The issue you are having is explained in the MongoDB documentation here. It comes down to the fact that Python dictionaries compare equal regardless of key order (and before Python 3.7 did not preserve insertion order at all), while MongoDB subdocuments are ordered BSON objects.
The relevant quote being,
Equality matches within subdocuments select documents if the subdocument matches exactly the specified subdocument, including the field order.
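To illustrate the difference the quote describes, here is a small Python-only sketch. It uses collections.OrderedDict as a stand-in for an ordered BSON subdocument; that is an analogy for illustration, not how MongoDB actually performs the comparison.

```python
from collections import OrderedDict

# Same key/value pairs, written in two different orders.
o1 = OrderedDict([('p', 0.5), ('theta', 100), ('sigma', 20)])
o2 = OrderedDict([('theta', 100), ('sigma', 20), ('p', 0.5)])

# Plain dict equality ignores key order, like d1 == d2 in the question.
plain_equal = dict(o1) == dict(o2)

# Equality between two OrderedDicts takes key order into account,
# which is closer to an exact subdocument match.
ordered_equal = o1 == o2
```

Here plain_equal is True while ordered_equal is False, which mirrors why the two queries in the question return different counts.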
I think you might be better off treating all three properties as individual fields of the main object, addressed with dot notation, instead of matching the subobject as a whole. That way the key order of the subobject is not forced into the query by the Python interpreter.
For instance...
d1 = {'goods.H.p': 0.5, 'goods.H.theta': 100, 'goods.H.sigma': 20}
d2 = {'goods.H.theta': 100, 'goods.H.sigma': 20, 'goods.H.p': 0.5}
collectn.find(d1).count()
collectn.find(d2).count()
...may yield more consistent results.
Finally, a way to do it changing less code:
collectn.find({'goods.H.' + k:v for k,v in d1.items()})
collectn.find({'goods.H.' + k:v for k,v in d2.items()})
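For the check-before-insert use case from the question, the same idea could be wrapped up roughly as follows. This is only a sketch: dotted_query and insert_if_absent are hypothetical helper names, and collection is assumed to be a PyMongo collection (only its find_one and insert_one methods are used).

```python
def dotted_query(prefix, sub):
    """Turn {'p': 0.5} into {'goods.H.p': 0.5} for prefix 'goods.H'."""
    return {prefix + '.' + key: value for key, value in sub.items()}


def insert_if_absent(collection, doc, prefix, sub):
    """Insert `doc` unless a document matching the dotted query exists.

    Returns True if the document was inserted, False otherwise.
    """
    if collection.find_one(dotted_query(prefix, sub)) is None:
        collection.insert_one(doc)
        return True
    return False


d1 = {'p': 0.5, 'theta': 100, 'sigma': 20}
d2 = {'theta': 100, 'sigma': 20, 'p': 0.5}

# Both orderings produce the same query, so both find the same records.
query1 = dotted_query('goods.H', d1)
query2 = dotted_query('goods.H', d2)
```

Two caveats: a dot-notation query also matches documents whose goods.H contains additional fields beyond the ones listed, and checking first and inserting afterwards is not atomic, so two concurrent writers could still both insert.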
Upvotes: 6
Reputation: 2084
I think your problem is mentioned in the MongoDB docs:
The field must match the sub-document exactly, including order....
Look at the documentation here; there is an example with a sub-document.
The fields in the sub-document have to be in the same order as in the query for the document to match.
Upvotes: 0
Reputation: 1500
I can only think of two things to do:
1. Structure your query like this: collectn.find({'goods.H.p': 0.5, 'goods.H.theta': 100, 'goods.H.sigma': 20}).count(). That will find the correct number of documents...
2. Restructure your data: if you look at "MongoDB : Indexes order and query order must match?" you will see that you can index on p, sigma, theta so that any order of the terms in the query gives the correct result. In my brief tests (I am no expert) I was not able to build an index that produces the same effect with your current structure.
Upvotes: 1