mbednarski
mbednarski

Reputation: 798

Selecting data from MongoDB where K of N criterias are met

I have documents with four fields: A, B, C, D Now I need to find documents where at least three fields matches. For example:

Query: A=a, B=b, C=c, D=d

Returned documents:

  1. a,b,c,d (four of four met)
  2. a,b,c (three of four met)
  3. a,b,d (another three of four met)
  4. a,c,d (another three of four met)
  5. b,c,d (another three of four met)

So far I created something like:

`(A=a AND B=b AND C=c)
 OR (A=a AND B=b AND D=d)
 OR (A=a AND C=c AND D=d)
 OR (B=b AND C=c AND D=d)`

But this is ugly and error prone.

Is there a better way to achieve it? Also, query performance matters.

I'm using Spring Data but I believe it does not matter. My current code:

    Criteria c = new Criteria();

    Criteria ca = Criteria.where("A").is(doc.getA());
    Criteria cb = Criteria.where("B").is(doc.getB());
    Criteria cc = Criteria.where("C").is(doc.getC());
    Criteria cd = Criteria.where("D").is(doc.getD());

    c.orOperator(
            new Criteria().andOperator(ca,cb,cc),
            new Criteria().andOperator(ca,cb,cd),
            new Criteria().andOperator(ca,cc,cd),
            new Criteria().andOperator(cb,cc,cd)
    );
    Query query = new Query(c);

    return operations.find(query, Document.class, "documents");

Upvotes: 3

Views: 236

Answers (2)

Amin J
Amin J

Reputation: 1209

The way you have it you have to do all permutations in your query. You can use the aggregation framework to do this without permuting all combinations. And it is generic enough to do with any K. The downside is I think you need Mongodb 3.2+ and also Spring Data doesn't support these oparations yet: $filter $concatArrays

But you can do it pretty easy with the java driver.

[  
   {  
      $project:{  
         totalMatched:{  
            $size:{  
               $filter:{  
                  input:{  
                     $concatArrays:[ ["$A"], ["$B"], ["$C"],["$D"]]
                  },
                  as:"attr",
                  cond:{  
                     $eq:["$$attr","a"]
                  }
               }
            }
         }
      }
   },
   {  
      $match:{  
         totalMatched:{ $gte:3 }
      }
   }
]

All you are doing is you are concatenating the values of all the fields you need to check in a single array. Then select a subset of those elements that are equal to the value you are looking for (or any condition you want for that matter) and finally getting the size of that array for each document.

Now all you need to do is to $match the documents that have a size of greater than or equal to what you want.

Upvotes: 1

Clement Amarnath
Clement Amarnath

Reputation: 5466

Currently in MongoDB we cannot do this directly, since we dont have any functionality supporting Permutation/Combination on the query parameters.

But we can simplify the query by breaking the condition into parts.

Use Aggregation pipeline

$project with records (A=a AND B=b) --> This will give the records which are having two conditions matching.(Our objective is to find the records which are having matches for 3 out of 4 or 4 out of 4 on the given condition)`

Next in the pipeline use OR condition (C=c OR D=d) to  find the final set of records which yields our expected result.

Hope it Helps!

Upvotes: 2

Related Questions