Curious2learn

Reputation: 33648

PyMongo or MongoDB is treating two equal Python dictionaries as different objects. Can I force them to be treated the same?

Please look at the following lines of code and the results:

import pymongo

d1 = {'p': 0.5, 'theta': 100, 'sigma': 20}
d2 = {'theta': 100, 'sigma': 20, 'p': 0.5}

I get the following results:

d1 == d2  # Returns True

collectn.find({'goods.H': d1}).count()  # Returns 33

collectn.find({'goods.H': d2}).count()  # Returns 2

where collectn is a MongoDB collection object.

Is there a setting or a way to query so that I obtain the same results for the above two queries?

The two queries are essentially using the same dictionary (in the sense that d1 == d2 is True). Here is what I am trying to do: before inserting a record into the database, I check whether a record with the exact combination of values being added already exists. If so, I don't want to create a new record. Because of the behavior shown above, the check can report that the record does not exist even when it does, and a duplicate record is then added to the database (with a different _id, of course, but with all other values the same), which I would prefer to avoid.

Thank you in advance for your help.

Upvotes: 10

Views: 1328

Answers (4)

mjhm

Reputation: 16705

I think you're looking for the $where operator.

This works in Node:

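var myCursor = coll.find({$where: function () {
  var h = obj.goods.H; // compare field by field: d1 is not defined on the server, and == on objects compares references
  return h.p == 0.5 && h.theta == 100 && h.sigma == 20; }});
myCursor.count(function (err, myCount) {console.log(myCount)});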

In Python I believe you'll need to pass in a BSON code object.
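
As a rough sketch of the Python side (not something I have run against a server): the hypothetical helper where_clause below builds the JavaScript expression as a string, comparing one field at a time. With PyMongo it could then be wrapped as bson.code.Code(js) and passed under '$where'; that call is left as a comment so the snippet runs without a database. It assumes a flat subdocument with JSON-serializable values and does not check for extra fields.

```python
import json

def where_clause(path, subdoc):
    """Build a JavaScript expression that compares each field of a flat subdocument."""
    target = "this." + path
    return " && ".join(
        "{}[{}] == {}".format(target, json.dumps(key), json.dumps(value))
        for key, value in sorted(subdoc.items())
    )

d1 = {'p': 0.5, 'theta': 100, 'sigma': 20}
d2 = {'theta': 100, 'sigma': 20, 'p': 0.5}

js = where_clause('goods.H', d1)
print(js)

# Assumed PyMongo usage (needs a running server, so it is not executed here):
#   from bson.code import Code
#   collectn.find({'$where': Code(js)})
```

Because the keys are sorted before the expression is built, d1 and d2 produce the same string.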

The documentation warns that the $where operator should be used as a last resort since it comes with a performance penalty, and can't use indexes.

It seems like it may be worthwhile to establish an ordering of the sub properties, and enforce it if possible on insert or as a post process.
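
One way to enforce such an ordering, as a minimal sketch: sort the keys before every insert and every query, so the same values always serialize in the same field order. The helper name canonical is made up for illustration, and sorting alphabetically is just one possible convention. It assumes all keys are strings.

```python
from collections import OrderedDict

def canonical(subdoc):
    """Return a copy of subdoc with keys in sorted order, recursing into nested dicts."""
    return OrderedDict(
        (key, canonical(value) if isinstance(value, dict) else value)
        for key, value in sorted(subdoc.items())
    )

d1 = {'p': 0.5, 'theta': 100, 'sigma': 20}
d2 = {'theta': 100, 'sigma': 20, 'p': 0.5}

# Both dictionaries normalize to the same key sequence.
print(list(canonical(d1).keys()))
print(list(canonical(d1).items()) == list(canonical(d2).items()))
```

The normalized dictionary would be used both when storing goods.H and when querying on it. Records already stored with a different field order would still need a one-time cleanup.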

Upvotes: 0

mayhewr

Reputation: 4021

The issue you are having is explained in the MongoDB documentation here. It also has to do with the fact that Python dictionaries are compared without regard to key order (and, before Python 3.7, did not preserve order at all), while MongoDB objects are ordered BSON objects.

The relevant quote being,

Equality matches within subdocuments select documents if the subdocument matches exactly the specified subdocument, including the field order.
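
To make the difference concrete, here is a small illustration in plain Python, with no database involved. The function same_field_order is a hypothetical stand-in for the exact subdocument match described in the quote, not MongoDB's actual comparison logic. It assumes Python 3.7 or later, where dictionaries keep insertion order.

```python
def same_field_order(a, b):
    """Order-sensitive comparison of two flat dicts: same keys, same values, same order."""
    return list(a.items()) == list(b.items())

d1 = {'p': 0.5, 'theta': 100, 'sigma': 20}
d2 = {'theta': 100, 'sigma': 20, 'p': 0.5}

print(d1 == d2)                  # Python's dict equality ignores key order
print(same_field_order(d1, d2))  # the order-sensitive check sees two different sequences
```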

I think you might be better off treating all three properties as subproperties of the main object, rather than as one collection of properties that makes up the subobject. That way the ordering of the subobject is not forced into the query by the Python interpreter.

For instance...

d1 = {'goods.H.p': 0.5, 'goods.H.theta': 100, 'goods.H.sigma': 20}
d2 = {'goods.H.theta': 100, 'goods.H.sigma': 20, 'goods.H.p': 0.5}

collectn.find(d1).count()
collectn.find(d2).count()

...may yield more consistent results.

Finally, a way to do it changing less code:

collectn.find({'goods.H.' + k:v for k,v in d1.items()})
collectn.find({'goods.H.' + k:v for k,v in d2.items()})
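
The same idea wrapped in a small helper, as a sketch (the name dotted_query is made up for illustration). One caveat worth keeping in mind: a dot-notation query only constrains the listed fields, so it would also match documents whose goods.H contains additional fields.

```python
def dotted_query(prefix, subdoc):
    """Turn a flat subdocument into a dot-notation query that does not depend on key order."""
    return {prefix + '.' + key: value for key, value in subdoc.items()}

d1 = {'p': 0.5, 'theta': 100, 'sigma': 20}
d2 = {'theta': 100, 'sigma': 20, 'p': 0.5}

q1 = dotted_query('goods.H', d1)
q2 = dotted_query('goods.H', d2)
print(q1 == q2)

# Assumed usage with the collection from the question (needs a running server):
#   collectn.find(q1).count()
```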

Upvotes: 6

Michal

Reputation: 2084

I think your problem is mentioned in the MongoDB docs:

The field must match the sub-document exactly, including order....

Look at the documentation here. There is an example with a sub-document.

Fields in a sub-document have to be in the same order as in the query to be matched.

Upvotes: 0

IamAlexAlright

Reputation: 1500

I can only think of two things to do:

  1. Structure your query as this: collectn.find({'goods.H.p':0.5, 'goods.H.theta':100, 'goods.H.sigma':20}).count(). That will find the correct number of documents...

  2. Restructure your data -> if you look at MongoDB : Indexes order and query order must match? you will see that you can index on p, sigma, theta so that the terms in the query can appear in any order and still give the correct result. In my brief tests (I am no expert) I was not able to index in a way that produces the same effect with your current structure.

Upvotes: 1
