jack
jack

Reputation: 81

Mongodb -how to find records that contain certain keywords array

Recently I wanted to filter out records that contain a certain keyword array in MongoDB, for example: I have five records that contain keywords array:

{a:[1,2]}
{a:[1,3,8]}
{a:[1,2,5]}
{a:[3,5,1]}
{a:[4,5]}

If I input the array [1,2,3,5] for search, then I want to get:

{a:[1,2]}
{a:[1,2,5]}
{a:[3,5,1]}

Each of them is a sub array of [1,2,3,5].

Any idea?

Please don't use a where clause (when possbile). Thanks!

Upvotes: 8

Views: 3458

Answers (2)

Sony Santos
Sony Santos

Reputation: 5545

If I understood, you want to return only the objects whose all values of property a are in the find array argument.

By following the Travis' suggestion in the comments, you must follow these steps:

  1. Define a JS function to achieve your desires (since there's no native way to do that in MongoDB);
  2. Save the function on the server;
  3. Use the function within $where.

If define your function to use only to that specific property (a, in this case), you may want skip the step 2. However, since it can be an useful function for other properties of other documents, I defined a more generic function, which must to be save on the server to be used AFAIK (I'm new on Mongo, too).

Below there are my tests on the mongo shell:

<--! language: lang-js -->

// step 1: defining the function for your specific search
only = function(property, values) { 
  for(var i in property) if (values.indexOf(property[i]) < 0) return false
  return true
}

// step 2: saving it on the server
db.system.js.save( { _id : 'only', value : only } )

// step 3: using the function with $where
db.coll.find({$where: "only(this.a, [1,2,3,5])"})

With the 5 objects you provided on the question, you will obtain:

{ "_id" : ObjectId("4f3838f85594f902212eb532"), "a" : [ 1, 2 ] }
{ "_id" : ObjectId("4f3839075594f902212eb534"), "a" : [ 1, 2, 5 ] }
{ "_id" : ObjectId("4f38390e5594f902212eb535"), "a" : [ 3, 5, 1 ] }

The downside is performance. See more.

Upvotes: 0

RameshVel
RameshVel

Reputation: 65877

Its simple to do in mongodb, but the harder part is preparing the data for the query. Let me explain that in oder

Simple part

You can use $in to find the matching elements in an array. Let us try

db.coll.find({a:{$in:[1,2,3,5]})

and the result is

{ "_id" : ObjectId("4f37c41739ed13aa728e9efb"), "a" : [ 1, 2 ] }
{ "_id" : ObjectId("4f37c42439ed13aa728e9efc"), "a" : [ 1, 3, 8 ] }
{ "_id" : ObjectId("4f37c42c39ed13aa728e9efd"), "a" : [ 1, 2, 5 ] }
{ "_id" : ObjectId("4f37c43439ed13aa728e9efe"), "a" : [ 3, 5, 1 ] }
{ "_id" : ObjectId("4f37c43e39ed13aa728e9eff"), "a" : [ 4, 5 ] }

ohh, its not the result we expected. Yes because $in return an item if any matching element found (not necessarily all).

So we can fix this by passing the exact array elements to $in, for example if we want to find the items matching these exact arrays {a:[1,2]} {a:[1,2,5]} and {a:[4,5,6]}

db.coll.find({a:{$in:[[1,2],[1,2,5],[4,5,6]]}})

you will get

 { "_id" : ObjectId("4f37c41739ed13aa728e9efb"), "a" : [ 1, 2 ] }
 { "_id" : ObjectId("4f37c42c39ed13aa728e9efd"), "a" : [ 1, 2, 5 ] }

Thats all

Hardest part

The real hardest part is forming all the possible combination of your input array [1,2,3,5]. You need to find a way to get all the combination of the source array (from your client) and pass it to $in.

For example, this JS method will give you all the combinations of the given array

var combine = function(a) {
  var fn = function(n, src, got, all) {
    if (n == 0) {
      if (got.length > 0) {
        all[all.length] = got;
      }
      return;
    }
    for (var j = 0; j < src.length; j++) {
      fn(n - 1, src.slice(j + 1), got.concat([src[j]]), all);
    }
    return;
  }
  var all = [];
  for (var i=0; i < a.length; i++) {
    fn(i, a, [], all);
  }
  all.push(a);
  return all;
}

>> arr= combine([1,2,3,5])

will give you

[
    [
        1
    ],
    [
        2
    ],
    [
        3
    ],
    [
        5
    ],
    [
        1,
        2
    ],
    [
        1,
        3
    ],
    [
        1,
        5
    ],
    [
        2,
        3
    ],
    [
        2,
        5
    ],
    [
        3,
        5
    ],
    [
        1,
        2,
        3
    ],
    [
        1,
        2,
        5
    ],
    [
        1,
        3,
        5
    ],
    [
        2,
        3,
        5
    ],
    [
        1,
        2,
        3,
        5
    ]
]

and you can pass this arr to $in to find all the macthing elements

     db.coll.find({a:{$in:arr}})

will give you

{ "_id" : ObjectId("4f37c41739ed13aa728e9efb"), "a" : [ 1, 2 ] }
{ "_id" : ObjectId("4f37c42c39ed13aa728e9efd"), "a" : [ 1, 2, 5 ] }

Wait!, its still not returning the remaining two possible items.

Because have a good look at the arr, it finds only the combination. it returns [1,3,5] but the data in document is [3,5,1]. So its clear that $in checks the items in given order (weird!).

So now you understand its the really hard comparing the mongodb query!. You can change the above JS combination former code to find the possible permutation to each combination and pass it to mongodb $in. Thats the trick.

Since you didn't mention any language choice its hard to recommend any permutation code. But you can find lot of different approaches in Stackoverflow or googling.

Upvotes: 9

Related Questions