Gibbs
Gibbs

Reputation: 22974

Results different documents for the same query

db.getCollection('rien').aggregate([
  {
    $match: {
      $and: [
        {
          "id": "10356"
        },
        {
          $or: [
            {
              "sys_date": {
                "$gte": newDate(ISODate().getTime()-90*24*60*60*1000)
              }
            },
            {
              "war_date": {
                "$gte": newDate(ISODate().getTime()-90*24*60*60*1000)
              }
            }
          ]
        }
      ]
    }
  },
  {
    $group: {
      "_id": "$b_id",
      count: {
        $sum: 1
      },
      ads: {
        $addToSet: {
          "s": "$s",
          "ca": "$ca"
        }
      },
      files: {
        $addToSet: {
          "system": "$system",
          "hostname": "$hostname"
        }
      }
    }
  },
  {
    $sort: {
      "ads.s": -1
    }
  },
  {
    $group: {
      "_id": "$b_id",
      total_count: {
        $sum: 1
      },
      "data": {
        "$push": "$$ROOT"
      }
    }
  },
  {
    $project: {
      "_id": 0,
      "total_count": 1,
      results: {
        $slice: [
          "$data",
          0,
          50
        ]
      }
    }
  }
])

When I execute this pipelines 5 times, it results in different set of documents. It is 3 node cluster. No sharding enabled. Have 10million documents. Data is static.

Any ideas about the inconsistent results? I feel I am missing some fundamentals here.

Upvotes: 1

Views: 219

Answers (1)

turivishal
turivishal

Reputation: 36154

I can see 2 problems,

  1. "ads.s": -1 will not work because, its an array field $sort will not apply in array field
  2. $addToSet will not maintain sort order even its ordered from previous stage,

You can use $setUnion operator for ascending order and $reduce for descending order result from $setUnion,

I workaround I am adding a solution below, I am not sure this is good option or not but you can use if this not affect performance of your query,

I am adding updated stages here only,

  • remain same
  { $match: {} }, // skipped
  { $group: {} }, // skipped
  • $sort, optional its up to your requirement if you want order by main document
  { $sort: { _id: -1 } },
  • $setUnion, treating arrays as sets. If an array contains duplicate entries, $setUnion ignores the duplicate entries, and second its return array in ascending order on the base of first field that we specified in $group stage is s, but make sure all element in array have s as first field,
  • $reduce to iterate loop of array and concat arrays current element $$this and initial value $$value, this will change order of array in descending order,
  {
    $addFields: {
      ads: {
        $reduce: {
          input: { $setUnion: "$ads" },
          initialValue: [],
          in: { $concatArrays: [["$$this"], "$$value"] }
        }
      },
      files: {
        $reduce: {
          input: { $setUnion: "$files" },
          initialValue: [],
          in: { $concatArrays: [["$$this"], "$$value"] }
        }
      }
    }
  },
  • remain same
  { $group: {} }, // skipped
  { $project: {} } // skipped

Playground


$setUnion mentioned in documentation: The order of the elements in the output array is unspecified., but I have tested every way its returning in ascending order perfectly, why I don't know,

I have asked question in MongoDB Developer Forum does-setunion-expression-operator-order-array-elements-in-ascending-order?, they replied => it will not guarantee of order!

Upvotes: 1

Related Questions