igorkf
igorkf

Reputation: 3565

Filter only documents that have ALL FIELDS non null (with aggregation framework)

I have many documents, but I want to figure out how to get only documents that have ALL FIELDS non null.
Suppose I have these documents:

[
 {
  'a': 1,
  'b': 2,
  'c': 3
 },
 {
  'a': 9,
  'b': 12
 },
 {
  'a': 5
 }
]

So filtering the documents, only the first have ALL FIELDS not null. So filtering out these documents, I would get only the first. How can I do this?

Upvotes: 1

Views: 579

Answers (1)

whoami - fakeFaceTrueSoul
whoami - fakeFaceTrueSoul

Reputation: 17915

So when you wanted to get only the documents which have ALL FIELDS, without specifying all of them in filter query like this : { a: {$exists : true}, b : {$exists : true}, c : {$exists : true}} then it might not be a good idea, in other way technically if you've 10s of fields in the document then it wouldn't either be a good idea to mention all of them in the query. Anyhow as you don't want to list them all - We can try this hack if it performs well, Let's say if you've a fixed schema & say that all of your documents may contain only fields a, b & c (_id is default & exceptional) but nothing apart from those try this :

If you can get count of total fields, We can check for field count which says all fields do exists, Something like below :

db.collection.aggregate([
  /** add a new field which counts no.of fields in the document */
  {
    $addFields: { count: { $size: { $objectToArray: "$$ROOT" } } }
  },
  {
    $match: { count: { $eq: 4 } } // we've 4 as 3 fields + _id
  },
  {
    $project: { count: 0 }
  }
])

Test : mongoplayground

Note : We're only checking for field existence but not checking for false values like null or [] or '' on fields. Also this might not work for nested fields.

Just in case if you wanted to check all fields exist in the document with their names, So if you can pass all fields names as input, then try below query :

db.collection.aggregate([
    /** create a field with all keys/field names in the document */
    {
      $addFields: {
        data: {
          $let: {
            vars: { data: { $objectToArray: "$$ROOT" } },
            in: "$$data.k"
          }
        }
      }
    },
    {
      $match: { data: { $all: [ "b", "c", "a" ] } } /** List down all the field names from schema */
    },
    {
      $project: { data: 0 }
    }
  ])

Test : mongoplayground

Ref : aggregation-pipeline

You can try to use explain to check your queries performance.

Upvotes: 1

Related Questions