Reputation: 13667
I've got a MongoDB collection where a particular string may appear in any of a number of fields:
{"_id":1, "field1": "foo", "field2": "bar", "field3": "baz", "otherfield": "stuff"},
{"_id":2, "field1": "bar", "field2": "baz", "field3": "foo", "otherfield": "morestuff"},
{"_id":3, "field1": "baz", "field2": "foo", "field3": "bar", "otherfield": "you get the idea"}
I need to query so that I am returned all records where any one of a set of fields is equal to any value in an array ... basically, if I have ["foo","bar"]
I need it to match if either of those strings are in field1 or field2 (but not any other field).
Obviously I can do this with a series of multiple queries
db.collection.find({"field1":{"$in":["foo","bar"]}})
db.collection.find({"field2":{"$in":["foo","bar"]}})
etc., and I've also made a very large $or query that concatenates them all together, but it seems far too inefficient (my actual collection needs to match any of 15 strings that can occur in any of 9 fields) ... but I'm still new to nosql DBs and am not sure of the best paradigm I need to use here. Any help is greatly appreciated.
Upvotes: 45
Views: 144809
Reputation: 13667
Found another answer through poring over the documentation that seems to hit a sweet spot -- text indexes.
db.collection.ensureIndex({"field1":"text","field2":"text"})
db.records.runCommand("text",{search:"foo bar"})
When I run my actual query with many more strings and fields (and about 100,000 records), the $or/$in
approach takes 620 milliseconds while the text index takes 131 milliseconds. The one drawback is that it returns a different type of document as a result; luckily the actual documents are a parameter of each result object.
Thanks to those who took the time to make suggestions.
Upvotes: 16
Reputation: 19464
I would collect all the relevant fields in one field (i.e. collected
) by adding their values like
"foo:field1",
"bar:field2",
"baz:field3",
"stuff:otherfield",
"bar:field1",
"baz:field2"
...
into that field.
If you search for bar
existing in any field you can use:
db.collection.find( { collected: { $regex: "^bar" } }, ... );
Your example in the question would look like:
db.collection.find( collected: { { $all: [ "foo:field1", "foo:field2", "bar:field1", "bar:field2" ] } }, ... );
Upvotes: 0