Reputation: 10251
I am trying to count word usage using MongoDB. My collection currently looks like this:
{'_id':###, 'username':'Foo', words:[{'word':'foo', 'count':1}, {'word':'bar', 'count':1}]}
When a new post is made, I extract all the new words into an array, but I'm trying to figure out how to upsert into the words array and increment the count if the word already exists.
In the example above, if the user "Foo" posted "lorem ipsum foo", I'd add "lorem" and "ipsum" to the user's words array but increment the count for "foo".
Is this possible in one query? Currently I am using addToSet:
'$addToSet':{'words':{'$each':word_array}}
But that doesn't seem to offer any way of increasing the words count.
Would very much appreciate some help :)
Upvotes: 2
Views: 8411
Reputation: 997
If you're willing to switch from a list to a hash (object), you can do this atomically.
From the docs: "$inc
... increments field by the number value if field is present in the object, otherwise sets field to the number value."
{ $inc : { field : value } }
So, if you could refactor your container and object:
words: [
{
'word': 'foo',
'count': 1
},
...
]
to:
words: {
'foo': 1,
'other_word': 2,
...
}
you could use an update with:
{ $inc: { 'words.foo': 1 } }
which would create { 'foo': 1 } inside words if 'foo' doesn't exist, and otherwise increment foo.
E.g.:
$ db.bar.insert({ id: 1, words: {} });
$ db.bar.find({ id: 1 })
[
{ ..., "words" : { }, "id" : 1 }
]
$ db.bar.update({ id: 1 }, { $inc: { 'words.foo': 1 } });
$ db.bar.find({ id: 1 })
[
{ ..., "id" : 1, "words" : { "foo" : 1 } }
]
$ db.bar.update({ id: 1 }, { $inc: { 'words.foo': 1 } });
$ db.bar.find({ id: 1 })
[
{ ..., "id" : 1, "words" : { "foo" : 2 } }
]
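To cover the original case of several words per post, the same idea can be extended by putting one entry per word into a single $inc document. The following is only a sketch: buildIncUpdate is a hypothetical helper name, not part of MongoDB, and it assumes the words contain no "." and do not start with "$", since those characters have special meaning in field paths.

```javascript
// Hypothetical helper: turn the words of one post into a single $inc update document.
// Assumes each word is safe to use as a field name (no "." and no leading "$").
function buildIncUpdate(wordArray) {
  const inc = {};
  for (const word of wordArray) {
    const field = 'words.' + word;
    // A word repeated within the same post is counted once per occurrence.
    inc[field] = (inc[field] || 0) + 1;
  }
  return { $inc: inc };
}

const update = buildIncUpdate(['lorem', 'ipsum', 'foo']);
console.log(JSON.stringify(update));

// In the shell this document would then be passed to the same update as above:
// db.bar.update({ id: 1 }, update);
```

Because all fields are incremented by one update on one document, the whole post is applied atomically. It is probably worth skipping the update entirely when the post yields no words.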
Upvotes: 11
Reputation: 18595
Unfortunately, it is not possible to do this in a single update with your schema. Your schema is a bit questionable and should probably be converted to use a dedicated collection of word counters, e.g.:
db.users {_id:###, username:'Foo'}
db.words.counters {_id:###, word:'Word', userId: ###, count: 1}
That will avoid quite a few issues such as :
Both scenarios require two updates to do what you want, which introduces atomicity issues. Updating per word by looping through word_array is better and safer (and is possible with both solutions).
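As a rough illustration of the per-word loop against the dedicated counters collection, the sketch below builds one upsert-style $inc operation per word. buildCounterOps is a hypothetical helper name, the userId value is a placeholder, and the operations are shaped for bulkWrite, which assumes a shell or driver version that provides it; on older versions the same filter and update could be issued one at a time with update and the upsert option.

```javascript
// Hypothetical helper: one upsert operation per word for the counters collection.
// Each operation increments count, or creates the counter with count 1 if it is missing.
function buildCounterOps(userId, wordArray) {
  return wordArray.map((word) => ({
    updateOne: {
      filter: { userId: userId, word: word },
      update: { $inc: { count: 1 } },
      upsert: true,
    },
  }));
}

// 'user-1' is a placeholder id used only for illustration.
const ops = buildCounterOps('user-1', ['lorem', 'ipsum', 'foo']);
console.log(ops.length);

// Assuming bulkWrite is available, the operations could be sent in one round trip:
// db.words.counters.bulkWrite(ops);
```

Each operation is atomic for its own counter document, but the batch as a whole is not, so a failure partway through can leave some words counted and others not.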
Upvotes: 2