Reputation: 42645
I'm new to NoSQL, so sorry if this is very basic. Let's say I have the following collection:
{
a: 1,
b: 2,
c: 'x'
},
{
a: 1,
b: 2,
c: 'y'
},
{
a: 1,
b: 1,
c: 'y'
}
I would like to run a "Dedupe" query on anything that matches:
{
a: 1,
b: 2
... (any other properties are ignored) ...
},
So after the query is run, either of the following remaining in the collection would be fine:
{
a: 1,
b: 2,
c: 'y'
},
{
a: 1,
b: 1,
c: 'y'
}
OR
{
a: 1,
b: 2,
c: 'x'
},
{
a: 1,
b: 1,
c: 'y'
}
Just so long as there's only one document with a==1 and b==2 remaining.
Upvotes: 1
Views: 6912
Reputation: 1500
This answer hasn't been updated in a while. It took me a while to figure this out. First, using the Mongo CLI, connect to the database and create an index on the field which you want to be unique. Here is an example for users
with a unique email address:
db.users.createIndex({ "email": 1 }, { unique: true })
The 1
creates the index, along with the existing _id
index automatically created.
Now when you run create
or save
on an object, if that email exists, Mongoose will through a duplication error.
Upvotes: 1
Reputation: 230316
I don't know of any commands that will update your collection in-place, but you can certainly do it via temp storage.
a
and b
)tmp
. Discard the rest of the group.tmp
.You can do this with MapReduce or upcoming Aggregation Framework (currently in unstable branch).
I decided not to write code here as it would take the joy of learning away from you. :)
Upvotes: -1
Reputation: 26258
If you always want to ensure that only one document has any given a
, b
combination, you can use a unique index on a
and b
. When creating the index, you can give the dropDups
option, which will remove all but one duplicate:
db.collection.ensureIndex({a: 1, b: 1}, {unique: true, dropDups: true})
Upvotes: 7