Reputation: 651
I am using mongo server 3.6 and php as backend language. Also I am using the latest php-mongo library to talk to new updated drivers.
I have a collection with 10 million records in it like so :
[
{
"did": "123456",
"did_usage": "1",
"did_timestamp": "15012"
},
{
"did": "4567811",
"did_usage": "1",
"did_timestamp": "15013"
},
{
"did": "46465464",
"did_usage": "2",
"did_timestamp": "15014"
},
{
"did": "7894446",
"did_usage": "2",
"did_timestamp": "15015"
},
{
"did": "65646131",
"did_usage": "3",
"did_timestamp": "15016"
},
{
"did": "7989464",
"did_usage": "2",
"did_timestamp": "15017"
},
{
"did": "651651664",
"did_usage": "1",
"did_timestamp": "15018"
}.......
]
Now I want to find a single unique document which have least did usage and least did timestamp.
Till now I have reached to find the single unique document using :
$sample = array('$sample' => array('size' => 1));
$pipeline = array($match, $group, $project, $sample);
$cursor = $collection->aggregate($pipeline);
I want some help regarding the $group
.
I have tried this
$group = array('$group' => array('_id' => '$did_usage', 'did_usage_timestamp' => array('$min' => '$did_usage_timestamp')));
But this is not working as expected.
Upvotes: 2
Views: 1178
Reputation: 6922
Even without knowing what $match
and $project
are doing in your original code, we can assume that $group
is operating on a subset of the collection data since it is only preceded by $match
in the pipeline. Based on only the example collection data and the $group
stage, it's evident that the $dig_usage_timestamp
field path used with the $min
operator references a non-existent field in the documents entering the $group
stage.
When testing this locally, did_usage_timestamp
in the randomly selected output document is null
:
<?php
require 'vendor/autoload.php';
$client = new MongoDB\Client;
$collection = $client->test->foo;
$collection->drop();
$collection->insertMany([
["did" => "123456", "did_usage" => "1", "did_timestamp" => "15012"],
["did" => "4567811", "did_usage" => "1", "did_timestamp" => "15013"],
["did" => "46465464", "did_usage" => "2", "did_timestamp" => "15014"],
["did" => "7894446", "did_usage" => "2", "did_timestamp" => "15015"],
["did" => "65646131", "did_usage" => "3", "did_timestamp" => "15016"],
["did" => "7989464", "did_usage" => "2", "did_timestamp" => "15017"],
]);
$cursor = $collection->aggregate([
['$group' => ['_id' => '$did_usage', 'did_timestamp' => ['$min' => '$did_usage_timestamp']]],
['$sample' => ['size' => 1]],
]);
var_dump($cursor->toArray());
This outputs something similar to:
array(1) {
[0]=>
object(MongoDB\Model\BSONDocument)#14 (1) {
["storage":"ArrayObject":private]=>
array(2) {
["_id"]=>
string(1) "1"
["did_timestamp"]=>
NULL
}
}
}
Changing the $min
operator's field path to $did_timestamp
should resolve the issue.
Upvotes: 2