Reputation: 43
I have a MongoDB collection 'orders' that contains a list of orders, each with a user, an orderid and a time. It looks like this:
user orderid time has_pending
10001 1 1510489123 0
10002 2 1510489125 0
10003 3 1510489127 0
10001 5 1510489131 1
10001 6 1510489133 1
10002 7 1510489135 0
10003 8 1510489137 0
10001 9 1510489139 1
10001 10 1510489141 0
10002 11 1510489143 1
10001 12 1510489145 0 <<<<<
10002 13 1510489147 0 <<<<<
10001 14 1510489149 1
10002 15 1510489151 1
10003 16 1510489153 1
10003 17 1510489155 1
10003 18 1510489157 1
10003 21 1510489163 1
10003 22 1510489165 0 <<<<<
I'm trying to get, for each user, the list of orders whose time is >= the time of that user's last order with has_pending = 0.
For example, if we look at the data for user 10001:
user orderid time has_pending
10001 1 1510489123 0
10001 5 1510489131 1
10001 6 1510489133 1
10001 9 1510489139 1
10001 10 1510489141 0
10001 12 1510489145 0
10001 14 1510489149 1
so the result of this query for this user will be:
10001 12 1510489145 0
10001 14 1510489149 1
The required query should fetch the data for all users, and the result should be like this:
user orderid time has_pending
10001 12 1510489145 0
10002 13 1510489147 0
10001 14 1510489149 1
10002 15 1510489151 1
10003 22 1510489165 0
MySQL query:
SELECT
    t1.*
FROM
    test AS t1
    LEFT JOIN test AS t2 ON t1.time >= t2.time AND t1.user = t2.user
WHERE
    t2.orderid = (SELECT max(orderid) FROM test WHERE user = t1.user AND has_pending = 0)
Any ideas on how I can get this result in a single MongoDB query?
Thanks
Upvotes: 0
Views: 140
Reputation: 10918
Given the following input documents:
{ "user" : 10001, "orderid" : 1, "time" : 1510489123, "has_pending" : 0 }
{ "user" : 10002, "orderid" : 2, "time" : 1510489125, "has_pending" : 0 }
{ "user" : 10003, "orderid" : 3, "time" : 1510489127, "has_pending" : 0 }
{ "user" : 10001, "orderid" : 5, "time" : 1510489131, "has_pending" : 1 }
{ "user" : 10001, "orderid" : 6, "time" : 1510489133, "has_pending" : 1 }
{ "user" : 10002, "orderid" : 7, "time" : 1510489135, "has_pending" : 0 }
{ "user" : 10003, "orderid" : 8, "time" : 1510489137, "has_pending" : 0 }
{ "user" : 10001, "orderid" : 9, "time" : 1510489139, "has_pending" : 1 }
{ "user" : 10001, "orderid" : 10, "time" : 1510489141, "has_pending" : 0 }
{ "user" : 10002, "orderid" : 11, "time" : 1510489143, "has_pending" : 1 }
{ "user" : 10001, "orderid" : 12, "time" : 1510489145, "has_pending" : 0 }
{ "user" : 10002, "orderid" : 13, "time" : 1510489147, "has_pending" : 0 }
{ "user" : 10001, "orderid" : 14, "time" : 1510489149, "has_pending" : 1 }
{ "user" : 10002, "orderid" : 15, "time" : 1510489151, "has_pending" : 1 }
{ "user" : 10003, "orderid" : 16, "time" : 1510489153, "has_pending" : 1 }
{ "user" : 10003, "orderid" : 17, "time" : 1510489155, "has_pending" : 1 }
{ "user" : 10003, "orderid" : 18, "time" : 1510489157, "has_pending" : 1 }
{ "user" : 10003, "orderid" : 21, "time" : 1510489163, "has_pending" : 1 }
{ "user" : 10003, "orderid" : 22, "time" : 1510489165, "has_pending" : 0 }
Your query would need to look like this:
db.collection.aggregate([
{
    $sort: {
        "time": -1 // sort by "time" descending
    }
}, {
    $group: { // we want to slice our data per "user" so let's group by that field
        _id: "$user",
        "orders": {
            $push: "$$ROOT" // remember each document in an array per each "user" group (entries still sorted by "time" descending)
        }
    }
}, {
    $project: {
        "orders": { // our orders array shall only contain...
            $slice: [ "$orders", 0, { // ...all items from the last one up until...
                $add: [ { $indexOfArray: [ "$orders.has_pending", 0 ] }, 1 ] // ...the first appearance of an "has_pending" == 0 entry
                // the $add makes sure that we include the found element with "has_pending" == 0, too
            }]
        }
    }
}, {
    $unwind: "$orders" // restore original documents again by flattening the "orders" array
}, {
    $replaceRoot: { // move the (single) entry of the orders array to the root level of each document
        "newRoot": "$orders"
    }
}, {
    $sort: {
        "time": 1 // your example output was sorted by date so that's why we do that here, too...
    }
}])
This will give you the exact order and contents that you asked for (plus the _id field which I've omitted for brevity):
{ "user" : 10001, "orderid" : 12, "time" : 1510489145, "has_pending" : 0 }
{ "user" : 10002, "orderid" : 13, "time" : 1510489147, "has_pending" : 0 }
{ "user" : 10001, "orderid" : 14, "time" : 1510489149, "has_pending" : 1 }
{ "user" : 10002, "orderid" : 15, "time" : 1510489151, "has_pending" : 1 }
{ "user" : 10003, "orderid" : 22, "time" : 1510489165, "has_pending" : 0 }
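If you want to check the logic without a running server, here is a rough in-memory sketch of the same steps in plain JavaScript. It is not MongoDB code: the function name ordersSinceLastNonPending is made up for illustration, and skipping users that have no has_pending == 0 entry at all is my own assumption, since the sample data has no such user.

```javascript
// Sample documents from the question, as plain objects (no MongoDB needed).
const orders = [
  [10001, 1, 1510489123, 0], [10002, 2, 1510489125, 0], [10003, 3, 1510489127, 0],
  [10001, 5, 1510489131, 1], [10001, 6, 1510489133, 1], [10002, 7, 1510489135, 0],
  [10003, 8, 1510489137, 0], [10001, 9, 1510489139, 1], [10001, 10, 1510489141, 0],
  [10002, 11, 1510489143, 1], [10001, 12, 1510489145, 0], [10002, 13, 1510489147, 0],
  [10001, 14, 1510489149, 1], [10002, 15, 1510489151, 1], [10003, 16, 1510489153, 1],
  [10003, 17, 1510489155, 1], [10003, 18, 1510489157, 1], [10003, 21, 1510489163, 1],
  [10003, 22, 1510489165, 0],
].map(([user, orderid, time, has_pending]) => ({ user, orderid, time, has_pending }));

// Hypothetical helper that mirrors the pipeline stages on an in-memory array.
function ordersSinceLastNonPending(docs) {
  // Like $sort: { time: -1 } (copy first so the input is not mutated)
  const sorted = [...docs].sort((a, b) => b.time - a.time);

  // Like $group by user with $push: each array stays sorted by time descending
  const groups = new Map();
  for (const doc of sorted) {
    if (!groups.has(doc.user)) groups.set(doc.user, []);
    groups.get(doc.user).push(doc);
  }

  // Like $slice with $indexOfArray + 1: keep everything from the newest entry
  // up to and including the first has_pending == 0 entry
  const kept = [];
  for (const userOrders of groups.values()) {
    const idx = userOrders.findIndex((o) => o.has_pending === 0);
    if (idx === -1) continue; // assumption: users without such an entry are skipped
    kept.push(...userOrders.slice(0, idx + 1));
  }

  // Like the final $sort: { time: 1 }
  return kept.sort((a, b) => a.time - b.time);
}

const result = ordersSinceLastNonPending(orders);
console.log(result.map((o) => o.orderid)); // [ 12, 13, 14, 15, 22 ]
```

The $unwind and $replaceRoot stages have no counterpart here because the sketch collects the kept documents into one flat array directly.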
Upvotes: 2
Reputation: 3171
db.getCollection('orders').aggregate([
    { $sort: {"time": -1}},
    {
        $group:{
            _id: {
                user: "$user",
                has_pending: "$has_pending"
            },
            time: { $first: "$time"},
            orderid: { $first: "$orderid"}
        }
    },
    {
        $project: {
            _id: 0,
            user: "$_id.user",
            orderid: "$orderid",
            time: "$time",
            has_pending: "$_id.has_pending"
        }
    }
])
If you want to understand what each stage of the aggregation pipeline does, you can continue reading.
To explain what happens in each stage I'll take a subset of what you posted. So let's say we have these documents:
user orderid time has_pending
10001 1 1510489123 0
10002 2 1510489125 0
10001 5 1510489131 1
10002 7 1510489135 0
10002 11 1510489143 1
10001 12 1510489145 0
10002 13 1510489147 0
10001 14 1510489149 1
10002 15 1510489151 1
Sorting with { $sort: {"time": -1}} orders your documents by time in descending order. That makes them look like this:
user orderid time has_pending
10002 15 1510489151 1
10001 14 1510489149 1
10002 13 1510489147 0
10001 12 1510489145 0
10002 11 1510489143 1
10002 7 1510489135 0
10001 5 1510489131 1
10002 2 1510489125 0
10001 1 1510489123 0
Now we can group our results by user and has_pending, because we only need one result for each user and each has_pending value. So we only need one document for each of:
user: 10001 with has_pending: 0,
user: 10001 with has_pending: 1,
user: 10002 with has_pending: 0,
user: 10002 with has_pending: 1
This is what the _id of the $group stage does:
_id: {
user: "$user",
has_pending: "$has_pending"
}
The _id field in a $group stage is mandatory; it describes what you want to group by.
Note that I also added:
time: { $first: "$time"},
orderid: { $first: "$orderid"}
I use $first because I know that my documents are sorted by time descending. So I can be sure that $first for
user: 10001 with has_pending: 0 will take "time" : 1510489145 and "orderid" : 12
user: 10001 with has_pending: 1 will take "time" : 1510489149 and "orderid" : 14
user: 10002 with has_pending: 0 will take "time" : 1510489147 and "orderid" : 13
user: 10002 with has_pending: 1 will take "time" : 1510489151 and "orderid" : 15
The $project stage in this case is only there to "normalize" your results, so that we get the end result that you asked for.
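The sort-then-group step can also be tried out in plain JavaScript on the subset above. This is only a rough in-memory sketch, not MongoDB code, and the function name latestPerUserAndFlag is made up for illustration.

```javascript
// The subset of documents used in the explanation, as plain objects.
const subset = [
  [10001, 1, 1510489123, 0],
  [10002, 2, 1510489125, 0],
  [10001, 5, 1510489131, 1],
  [10002, 7, 1510489135, 0],
  [10002, 11, 1510489143, 1],
  [10001, 12, 1510489145, 0],
  [10002, 13, 1510489147, 0],
  [10001, 14, 1510489149, 1],
  [10002, 15, 1510489151, 1],
].map(([user, orderid, time, has_pending]) => ({ user, orderid, time, has_pending }));

// Hypothetical helper: emulates $sort by time descending followed by
// $group on (user, has_pending) with $first for the remaining fields.
function latestPerUserAndFlag(docs) {
  const sorted = [...docs].sort((a, b) => b.time - a.time);
  const firstSeen = new Map();
  for (const doc of sorted) {
    const key = `${doc.user}:${doc.has_pending}`; // plays the role of the compound _id
    // Like $first: only the first (newest) document per key is kept
    if (!firstSeen.has(key)) firstSeen.set(key, doc);
  }
  return [...firstSeen.values()];
}

const latest = latestPerUserAndFlag(subset);
console.log(latest);
```

On this subset it keeps orderid 12, 13, 14 and 15. Note that the grouping keeps one document per (user, has_pending) pair, so on the full data set from the question user 10003 would get two documents (orderid 21 and orderid 22).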
Upvotes: 0
Reputation: 3868
You can use the sort attribute. An example in the MongoDB shell with the same output as your SQL query:
db.collection.find({}).sort({ user: 1, orderid: 1, time: 1, has_pending: 1 }).pretty()
Upvotes: -1